My code is very simple:

from rdkit import Chem

# get_fp_rdkit and outfile are defined elsewhere in the script (not shown)
suppl4 = Chem.SDMolSupplier("/Volumes/MyPassportForMac/chembl_23.sdf")

for i, mol in enumerate(suppl4):
    smile = Chem.MolToSmiles(mol,isomericSmiles=True)
    fingerpri = get_fp_rdkit(mol)
    namenmol = mol.GetProp("_Name")
    print (i,namenmol,smile)
    outfile.write('{} {} {}\n'.format(namenmol,smile,fingerpri))

I know that a few molecules in chembl_23.sdf have invalid valences. Is there a way to skip these molecules when these errors are found? Here is where the run stops:

671231 CHEMBL1254908 O=C(NCCN1CCC2(CC1)C(=O)NCN2c1cccc(Cl)c1)c1cc2cc(F)ccc2[nH]1
[21:18:16] Explicit valence for atom # 35 N, 5, is greater than permitted
[21:18:16] ERROR: Could not sanitize molecule ending on line 48940986
[21:18:16] ERROR: Explicit valence for atom # 35 N, 5, is greater than permitted
Traceback (most recent call last):
  File "calculate-fingerprints.py", line 23, in <module>
    smile = Chem.MolToSmiles(mol,isomericSmiles=True)
Boost.Python.ArgumentError: Python argument types in

did not match C++ signature:
    MolToSmiles(RDKit::ROMol mol, bool isomericSmiles=False, bool kekuleSmiles=False, int rootedAtAtom=-1, bool canonical=True, bool allBondsExplicit=False, bool allHsExplicit=False)

 The "culprit" molecule seems to be


My question to this forum:

Please suggest a way to modify the code so that it skips the invalid molecule(s) instead of abruptly ending the run.
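
For illustration, here is a minimal sketch of the kind of change being asked for. It assumes that SDMolSupplier hands back None for a record it cannot sanitize, which is how RDKit's suppliers are documented to behave. The helper name skip_unparsable and the stand-in list are hypothetical and only serve to make the sketch runnable without the ChEMBL file.

```python
def skip_unparsable(supplier, skipped=None):
    """Yield (index, mol) pairs, skipping entries that are None.

    Assumption: the supplier yields None for records it could not
    parse or sanitize, as RDKit's SDMolSupplier is documented to do.
    If a list is passed as `skipped`, the skipped indices are stored in it.
    """
    for index, mol in enumerate(supplier):
        if mol is None:
            if skipped is not None:
                skipped.append(index)
            continue
        yield index, mol


# Intended use with the loop from this post (get_fp_rdkit and outfile
# come from the rest of the script and are not shown here):
#
#     from rdkit import Chem
#     suppl4 = Chem.SDMolSupplier("/Volumes/MyPassportForMac/chembl_23.sdf")
#     for i, mol in skip_unparsable(suppl4):
#         smile = Chem.MolToSmiles(mol, isomericSmiles=True)
#         ...

# Stand-in demonstration: a plain list where None plays the bad record.
records = ["mol_a", None, "mol_b"]
skipped = []
kept = list(skip_unparsable(records, skipped))
```

The indices keep counting across skipped records, so the numbers printed in the loop still match positions in the SD file, and the skipped list can be written out afterwards to see which records were dropped.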

Thank you,

Carlos Faerman
Rdkit-discuss mailing list
