Why don't you just add a simple test?

for m in suppl:
    clean_mol = remover.StripMol(m)
    # skip molecules whose atoms have all been removed (clean_mol is empty)
    if not clean_mol.GetNumAtoms():
        continue
    # from here on clean_mol has at least one atom
    Chem.SanitizeMol(clean_mol)
    wr.write(clean_mol)

This seems pretty clean to me. If you want SanitizeMol to fail instead, then 
you'd have to add try statements ...
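
If the check ends up repeated in many places, it could be pulled into a small generator. This is only a sketch: non_empty is a made-up helper name, not part of RDKit, and it relies on nothing more than the GetNumAtoms() method that Mol objects already have.

```python
def non_empty(mols):
    """Yield only molecules that exist and have at least one atom.

    `non_empty` is a hypothetical helper, not an RDKit function.
    It only assumes each item is None or has a GetNumAtoms() method.
    """
    for mol in mols:
        # suppliers can yield None for records they fail to parse
        if mol is None:
            continue
        # an empty molecule reports zero atoms
        if mol.GetNumAtoms() == 0:
            continue
        yield mol
```

With the names from the loop above, usage might look like: stripped = (remover.StripMol(m) for m in non_empty(suppl)), followed by "for clean_mol in non_empty(stripped): wr.write(clean_mol)". That keeps the test in one place instead of scattering it through the code.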

My 2 pence
Nik

From: JP [mailto:jeanpaul.ebe...@inhibox.com]
Sent: Monday, January 30, 2012 1:04 PM
To: rdkit-discuss@lists.sourceforge.net
Subject: [Rdkit-discuss] Molecule with no atoms, so is it valid?

Hi there, using 2011.12.01

I guess my question is a semantic one, but I'll give a practical example later 
on, just in case.

Q1.  "Is a molecule without atoms valid (should it pass the SanitizeMol)?"
Q2.  "Is there a way to check for this condition?"  The fact that I will have 
to litter my code with things like

if num_of_atoms > 0:
    sdf_writer.write(mol)

makes me think that this is something which should be handled internally...

Q3.  "What does the SDF spec say about molecules with no atom blocks?"
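
For context on Q3: in the V2000 molfile layout, the fourth line of a record is the counts line, and its first three characters hold the number of atoms. A rough sketch of reading that field directly is below. The function name v2000_atom_count is made up for illustration, and it assumes a V2000 block (V3000 stores the counts elsewhere). It says nothing about whether a zero count is legal, only how to spot one.

```python
def v2000_atom_count(molblock):
    """Return the atom count declared on a V2000 counts line.

    `v2000_atom_count` is a hypothetical helper for illustration.
    Assumes `molblock` is the text of a single V2000 mol record.
    """
    lines = molblock.splitlines()
    if len(lines) < 4:
        raise ValueError("mol block is too short to have a counts line")
    counts_line = lines[3]
    if "V3000" in counts_line:
        # V3000 keeps the real counts on a separate 'M  V30 COUNTS' line
        raise ValueError("V3000 blocks are not handled by this sketch")
    # columns 1-3 of the counts line are the number of atoms
    return int(counts_line[0:3])
```

A record for which this returns 0 has no atom block at all, which is the case being asked about.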




A code walkthrough (all code attached) - just to show you a common use case:

#!/usr/bin/env python

from rdkit import Chem
from rdkit.Chem.SaltRemover import SaltRemover

# let us make sure sanitization is explicitly on
suppl = Chem.ForwardSDMolSupplier('salty.sdf', sanitize=True)
# let us remove some salts
remover = SaltRemover()

wr = Chem.SDWriter('out.sdf')
for m in suppl:
    clean_mol = remover.StripMol(m)
    # all the atoms in this molecule have been removed - clean_mol is empty
    Chem.SanitizeMol(clean_mol) # I'd wish you would fail, but this works
    wr.write(clean_mol) # we write a molecule with no atom block

wr.flush()
wr.close()

suppl = Chem.ForwardSDMolSupplier('out.sdf', sanitize=True)
for m in suppl:
    # the above gives an ERROR, but we still have an instance
    # but if this is "allowable" and passes the sanity checks,
    # why should I get an error?
    # something is not quite consistent here
    print(m)

[11:50:56] ERROR: on line 5 Problems encountered parsing data fields
[11:50:56] ERROR: moving to the begining of the next molecule
<rdkit.Chem.rdchem.Mol object at 0x23c5c90>

Comments?

-
Jean-Paul Ebejer
Early Stage Researcher