Hi Andrew,
I also prefer #2. #1 is not quite sensible: many readers, like
MolFromSmiles, return None on failure, so if we chose #1 it would be hard
to distinguish bad input from empty input. Semantically, in many RDKit
use cases, None and an empty Mol are as different as a web page not found
(HTTP 404) and a blank web page.
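To make the distinction concrete, here is a minimal sketch of how calling code might tell the two cases apart. The helper name classify_mol is hypothetical (it is not part of RDKit); it only assumes that a parse result is either None or an object with a GetNumAtoms() method, as an RDKit Mol has.

```python
def classify_mol(mol):
    """Return 'invalid', 'empty', or 'ok' for a parse result.

    `mol` is expected to be None (the reader failed) or an object with a
    GetNumAtoms() method, such as an RDKit Mol. Hypothetical helper, not
    an RDKit function.
    """
    if mol is None:
        return "invalid"   # the "404" case: the reader rejected the input
    if mol.GetNumAtoms() == 0:
        return "empty"     # the "blank page" case: parsed, but no atoms
    return "ok"
```

As far as I know, Chem.MolFromSmiles("") gives back a molecule with zero atoms, while an unparsable SMILES gives None, so those two inputs should land in the "empty" and "invalid" branches respectively.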
Eddie
On Apr 28, 2012, at 3:15 PM, Andrew Dalke wrote:
It looks like this discussion is still unresolved. Greg's going to have a
pile of things on his desk (eDesk?) when he gets back from holidays. :)
On Jan 30, 2012, at 1:26 PM, JP wrote:
But then I will have to add the if not clean_mol.GetNumAtoms():
before/after replacing/editing molecule parts, after reading molecules,
before writing them etc. i.e. I'd need this statement in a lot of places.
This is why I asked if it should be considered a valid molecule: if this
check moved into SanitizeMol I wouldn't need any of that. I could assume
that the molecule I have in hand is valid, and if I still wanted these
molecules (for some not-so-clear reason) I could just switch sanitization
off on the methods that allow it.
I just ran into this myself. I used SaltRemover on CHEMBL1644029, which is
potassium nitrate. The SaltRemover removed both ions, leaving me with an
empty structure.
I blithely wrote the de-salted structures to a file. It wasn't until I had
RDKit read the structures later, when trying to figure out the message
"Problems encountered parsing data fields", that I realized that I had
de-salted a salt, leaving nothing.
>>> from rdkit import Chem
>>> from rdkit.Chem import SaltRemover
>>> remover = SaltRemover.SaltRemover()
>>> mol = Chem.MolFromSmiles("[K+].[O-]N(=O)=O")
>>> mol2 = remover.StripMol(mol)
>>> Chem.SDWriter("/dev/stdout").write(mol2)
     RDKit
  0  0  0  0  0  0  0  0  0  0999 V2000
M  END
>>> writer = Chem.SDWriter("empty.sdf")
>>> writer.write(mol2)
>>> writer.close()
>>> for mol in Chem.ForwardSDMolSupplier("empty.sdf"):
...     print("I have", repr(mol))
...
[00:02:16] ERROR: on line 5 Problems encountered parsing data fields
[00:02:16] ERROR: moving to the begining of the next molecule
I have <rdkit.Chem.rdchem.Mol object at 0x101e00b40>
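Until the behavior is settled, one workaround is to check for atoms at the point of writing. The following is only a sketch: write_nonempty is a hypothetical helper, and it assumes nothing beyond a writer with a write(mol) method (such as an SDWriter) and molecules with GetNumAtoms().

```python
def write_nonempty(writer, mols):
    """Write molecules that have at least one atom; return the skipped ones.

    `writer` is any object with a write(mol) method, for example an RDKit
    SDWriter. Hypothetical helper, not part of RDKit.
    """
    skipped = []
    for mol in mols:
        if mol is None or mol.GetNumAtoms() == 0:
            # Either a failed parse or nothing left after de-salting.
            skipped.append(mol)
            continue
        writer.write(mol)
    return skipped
```

In the session above, something like write_nonempty(writer, [mol2]) would have written nothing and handed the empty de-salted structure back in the returned list, instead of silently producing a record that later triggers the parse error.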
I agree with JP. Either:
1) this is an ERROR, in which case a) the reader should yield a None rather
than a molecule object, and b) the writer should refuse to write it, or
2) this is not an ERROR, and the reader should be fixed so it doesn't log an
error message for this case. Since it's either valid syntax or it isn't, it
probably shouldn't generate any message.
Of these, I prefer #2.
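As an illustration of why the empty record looks like valid syntax: in a V2000 molblock the fourth line is the counts line, with the atom count in columns 1-3 and the bond count in columns 4-6, so zero atoms is simply "  0". A rough sketch in plain Python (v2000_counts is a hypothetical helper, and this is not how the RDKit reader is implemented):

```python
def v2000_counts(molblock):
    """Return (num_atoms, num_bonds) from the counts line of a V2000 molblock."""
    lines = molblock.splitlines()
    counts = lines[3]  # line 4, after the title, program and comment lines
    if not counts.rstrip().endswith("V2000"):
        raise ValueError("not a V2000 counts line: %r" % counts)
    # Fixed-width fields: columns 1-3 are atoms, columns 4-6 are bonds.
    return int(counts[0:3]), int(counts[3:6])


# The empty molblock from the session above, with its blank title and comment lines.
EMPTY_BLOCK = "\n     RDKit\n\n  0  0  0  0  0  0  0  0  0  0999 V2000\nM  END\n"

print(v2000_counts(EMPTY_BLOCK))  # (0, 0)
```

The counts line parses cleanly to zero atoms and zero bonds, which is why I lean toward treating it as a legal, if empty, record.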
Cheers,
Andrew
da...@dalkescientific.com
--
Live Security Virtual Conference
Exclusive live event will cover all the ways today's security and
threat landscape has changed and how IT managers can respond. Discussions
will include endpoint security, mobile security and the latest in malware
threats. http://www.accelacomm.com/jaw/sfrnl04242012/114/50122263/
___
Rdkit-discuss mailing list
Rdkit-discuss@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/rdkit-discuss