Why don't you just add a simple test?
for m in suppl:
    clean_mol = remover.StripMol(m)
    # skip molecules that have lost all their atoms (clean_mol is empty)
    if not clean_mol.GetNumAtoms():
        continue
    Chem.SanitizeMol(clean_mol)
    wr.write(clean_mol)  # only molecules that still have atoms get written
This seems pretty clean to me. If you want SanitizeMol to fail instead, then
you'd have to add try statements ...
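
If the check ends up repeated in many places, it could be wrapped once. The following is only a rough sketch, not anything RDKit provides: require_atoms and write_nonempty are hypothetical names, and the code relies on nothing beyond the GetNumAtoms() and write() calls already used in this thread.

```python
def require_atoms(mol):
    """Return mol unchanged, or raise ValueError if it has no atoms."""
    # Hypothetical helper: relies only on mol.GetNumAtoms().
    if mol.GetNumAtoms() == 0:
        raise ValueError("molecule has no atoms")
    return mol


def write_nonempty(mols, writer):
    """Write each molecule that still has atoms; return (written, skipped)."""
    written = skipped = 0
    for mol in mols:
        try:
            require_atoms(mol)
        except ValueError:
            # Empty molecule: count it and move on instead of writing it.
            skipped += 1
            continue
        writer.write(mol)
        written += 1
    return written, skipped
```

With the names from the loop above, this might be called as write_nonempty((remover.StripMol(m) for m in suppl), wr). Calling require_atoms on its own gives the "fail loudly" behaviour, at the cost of a try/except at the call site.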
My 2 pence
Nik
From: JP [mailto:jeanpaul.ebe...@inhibox.com]
Sent: Monday, January 30, 2012 1:04 PM
To: rdkit-discuss@lists.sourceforge.net
Subject: [Rdkit-discuss] Molecule with no atoms, so is it valid?
Hi there, I am using 2011.12.01.
I guess my question is a semantic one, but I give a practical example later
on just in case.
Q1. "Is a molecule without atoms valid (should it pass the SanitizeMol)?"
Q2. "Is there a way to check for this condition?" Having to litter my code
with things like
    if num_of_atoms > 0:
        sdf_writer.write(mol)
makes me think that this is something which should be handled internally...
Q3. "What does the SDF spec say about molecules with no atom blocks?"
A code walkthrough (all code attached) - just to show you a common use case:
#!/usr/bin/env python
from rdkit import Chem
from rdkit.Chem.SaltRemover import SaltRemover
# let us make sure sanitization is explicitly on
suppl = Chem.ForwardSDMolSupplier('salty.sdf', sanitize=True)
# let us remove some salts
remover = SaltRemover()
wr = Chem.SDWriter('out.sdf')
for m in suppl:
    clean_mol = remover.StripMol(m)
    # all the atoms in this molecule have been removed - clean_mol is empty
    Chem.SanitizeMol(clean_mol)  # I'd wish you would fail, but this works
    wr.write(clean_mol)  # we write a molecule with no atom block
wr.flush()
wr.close()
suppl = Chem.ForwardSDMolSupplier('out.sdf', sanitize=True)
for m in suppl:
    # the above gives an ERROR, but we still have an instance
    # but if this is "allowable" and passes the sanity checks why should I
    # get an error?
    # something is not quite consistent here
    print(m)
[11:50:56] ERROR: on line 5 Problems encountered parsing data fields
[11:50:56] ERROR: moving to the begining of the next molecule
<rdkit.Chem.rdchem.Mol object at 0x23c5c90>
Comments?
-
Jean-Paul Ebejer
Early Stage Researcher