Re: [Rdkit-discuss] Errors with RDKit

Carlos Faerman Tue, 23 Jan 2018 06:39:47 -0800

Hi Wandré and Axel,

Thank you both!


I ended up adding this to my code:


if not mol: continue


That works well for me.
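
For anyone finding this thread later, here is a minimal sketch of how that guard might sit in the original loop. The helper name `write_smiles` and its arguments are assumptions made for illustration; `get_fp_rdkit` from the original script is left out because its definition was not posted. It relies on `SDMolSupplier` yielding `None` for records that fail to parse or sanitize, and tests for that explicitly with `is None`.

```python
def write_smiles(supplier, to_smiles, outfile):
    """Write 'name smiles' for each parsed molecule; return how many were skipped.

    supplier  : any iterable of molecules, e.g. a Chem.SDMolSupplier
    to_smiles : callable turning a molecule into a SMILES string
    outfile   : writable text file object
    """
    skipped = 0
    for mol in supplier:
        # Records that could not be parsed/sanitized come back as None
        if mol is None:
            skipped += 1
            continue
        outfile.write('{} {}\n'.format(mol.GetProp("_Name"), to_smiles(mol)))
    return skipped

# With RDKit installed, usage might look like this (paths are placeholders):
#   from rdkit import Chem
#   suppl = Chem.SDMolSupplier("chembl_23.sdf")
#   with open("out.txt", "w") as fh:
#       n = write_smiles(suppl,
#                        lambda m: Chem.MolToSmiles(m, isomericSmiles=True),
#                        fh)
#   print(n, "records skipped")
```

Counting the skipped records makes it easy to check afterwards how many entries were dropped.
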
Carlos

________________________________
From: Wandré <[email protected]>
Sent: Tuesday, January 23, 2018 4:29 AM
To: Carlos Faerman
Cc: [email protected]
Subject: Re: [Rdkit-discuss] Errors with RDKit

Hi Carlos,
Similar to Axel, in my code I use:
if mol is None: return False  (if you are using a function to read each SDF file)
if mol is None: continue      (to skip to the next loop iteration)
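
A rough sketch of the function-based variant, assuming a per-molecule function that reports success through its return value. The names `process` and `count_processed` are hypothetical, and appending to a list stands in for the real per-molecule work.

```python
def process(mol, results):
    """Handle one supplier entry; return False if it could not be parsed."""
    if mol is None:
        return False
    results.append(mol)  # placeholder for the real per-molecule work
    return True

def count_processed(supplier):
    """Run process() over every entry; return (number handled, results)."""
    results = []
    n_ok = 0
    for mol in supplier:
        if process(mol, results):
            n_ok += 1
    return n_ok, results
```

The caller can compare the returned count with the number of records in the file to see how many were rejected.
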

--
Wandré Nunes de Pinho Veloso
Assistant Professor - Unifei - Campus Avançado de Itabira-MG
PhD candidate in Bioinformatics - Universidade Federal de Minas Gerais - UFMG
Researcher at INSILICO - Grupo Interdisciplinar em Simulação e Inteligência Computacional - UNIFEI
Member of the Assinaturas Biológicas research group, FIOCRUZ
Member of the Bioinformática Estrutural research group, UFMG
Laboratório de Bioinformática e Sistemas - LBS, DCC, UFMG

2018-01-23 0:59 GMT-02:00 Carlos Faerman <[email protected]>:

Hello,

My code is very simple:

suppl4 = Chem.SDMolSupplier("/Volumes/MyPassportForMac/chembl_23.sdf")
i = 0
for mol in suppl4:
    smile = Chem.MolToSmiles(mol, isomericSmiles=True)
    fingerpri = get_fp_rdkit(mol)
    namenmol = mol.GetProp("_Name")
    patron[namenmol] = fingerpri
    print(i, namenmol, smile)
    outfile.write('{} {} {}\n'.format(namenmol, smile, fingerpri))
    i = i + 1

I know that a few molecules in Chembl_23.sdf have wrong valences.

Is there a way to skip these molecules when these errors are found?


671231 CHEMBL1254908 O=C(NCCN1CCC2(CC1)C(=O)NCN2c1cccc(Cl)c1)c1cc2cc(F)ccc2[nH]1
[21:18:16] Explicit valence for atom # 35 N, 5, is greater than permitted
[21:18:16] ERROR: Could not sanitize molecule ending on line 48940986
[21:18:16] ERROR: Explicit valence for atom # 35 N, 5, is greater than permitted
Traceback (most recent call last):
  File "calculate-fingerprints.py", line 23, in <module>
    smile = Chem.MolToSmiles(mol,isomericSmiles=True)
Boost.Python.ArgumentError: Python argument types in
    rdkit.Chem.rdmolfiles.MolToSmiles(NoneType)
did not match C++ signature:
    MolToSmiles(RDKit::ROMol mol, bool isomericSmiles=False, bool kekuleSmiles=False, int rootedAtAtom=-1, bool canonical=True, bool allBondsExplicit=False, bool allHsExplicit=False)

The "culprit" molecule seems to be CHEMBL450200.

My question to this forum: please suggest a way to modify the code so that it
skips the invalid molecule(s) instead of abruptly ending the run.

Thank you,

Carlos Faerman

------------------------------------------------------------------------------
Check out the vibrant tech community on one of the world's most
engaging tech sites, Slashdot.org! http://sdm.link/slashdot
_______________________________________________
Rdkit-discuss mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/rdkit-discuss

