Treating a molecule with no atoms as valid is a questionable design decision
(not your fault, of course; you are just implementing a spec).
I think that the SMILES writer should not write an empty molecule (you
could change the method signature to take yet another parameter,
"empties=True/False", but I do not think that is correct either). And IMHO
the parser should not read one either.
What happens with the following very organic SMILES file (note the empty
third line)?
CCCCC JP1
CCC JP2

CCCC JP4
Is the generator going to give me an empty (third) molecule? Then I have to
always dirty my code with m.GetNumAtoms() > 0 in that loop. What if the
empty molecule is at the end of the file (ouch)?
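A minimal sketch of the guard in question, assuming the supplier yields
either None (for a record that failed to parse) or objects with a
GetNumAtoms() method, as RDKit molecules have. The helper name
non_empty_mols is hypothetical and not part of RDKit.

```python
def non_empty_mols(mols):
    """Yield only molecules that parsed and contain at least one atom.

    `mols` is any iterable of RDKit-style molecules (or None for records
    that failed to parse). Hypothetical helper, not an RDKit function.
    """
    for m in mols:
        if m is None:
            continue  # record failed to parse
        if m.GetNumAtoms() == 0:
            continue  # empty molecule: nothing to process
        yield m
```

With a wrapper like this the loop body stays clean (for m in
non_empty_mols(supplier): ...), but every caller still has to remember to
use it.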
Also, what if you have an identifier for the empty molecule? That is,
replace the empty third line with " JP3".
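To make the " JP3" case concrete, here is a rough sketch in plain Python
(no RDKit) of how such a line splits into an empty SMILES plus a name. It
assumes a whitespace-delimited "<SMILES> <name>" layout; split_smi_line is
a hypothetical helper and is not meant to describe how RDKit's own
supplier tokenizes the file.

```python
def split_smi_line(line):
    """Split one line of a SMILES file into (smiles, name).

    Assumes the layout "<SMILES><whitespace><name>". A line that starts
    with whitespace is treated as having an empty SMILES.
    Hypothetical helper, not an RDKit function.
    """
    line = line.rstrip("\n")
    if line[:1].isspace():
        # Leading whitespace: the SMILES column is empty, the rest is the name.
        return "", line.strip()
    parts = line.split(None, 1)
    smiles = parts[0] if parts else ""
    name = parts[1].strip() if len(parts) > 1 else ""
    return smiles, name


lines = ["CCCCC JP1", "CCC JP2", " JP3", "CCCC JP4"]
records = [split_smi_line(l) for l in lines]
# Names of records whose SMILES column is empty.
empty_names = [name for smiles, name in records if not smiles]
```

Here the third record comes out as an empty SMILES carrying the name
"JP3", which is exactly the named-but-empty molecule the reader would then
have to decide what to do with.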
What is the ForwardSDMolSupplier/SDWriter going to do with this empty mol?
Does it just write name and properties?
I don't want to be controversial or anything, but I disagree with almost
everyone else about this: I think we should use common sense and not stick
to the spec in this case.
Does someone have a use case for an empty molecule? Then at least I could
understand what people are using this for.
Also, having the writer do one thing and the parser do another means that
RDKit cannot read the files/molecules it generates. I think this is a big
inconsistency, and not one worthy of this excellent bit of software.
Cheers,
-
Jean-Paul Ebejer
Early Stage Researcher
On 23 May 2012 10:42, Greg Landrum <greg.land...@gmail.com> wrote:
> Dear all,
>
> The svn version of the RDKit now behaves like this:
> In [2]: m = Chem.MolFromMolFile('empty.mol')
>
> In [3]: m.GetNumAtoms()
> Out[3]: 0
>
> Notice that there are no longer error messages.
>
> There is, however, the following wart:
> In [6]: Chem.MolToSmiles(m)
> Out[6]: ''
>
> In [7]: m2 = Chem.MolFromSmiles('')
> SMILES Parse Error: syntax error for input:
>
> In [8]: m2 is None
> Out[8]: True
>
> The SMILES writer happily generates an empty string for the molecule
> with no atoms, but the SMILES parser generates an error.
>
> The behavior of the parser is, I believe, consistent with Daylight.
> The question is what the writer should do. I see two choices:
> 1) As is: Writer generates an empty string, but the parser generates an
> error
> 2) Change MolToSmiles so that it generates an error if the molecule
> has no atoms.
>
> I prefer the status quo (choice 1) because I don't really like the
> idea that a valid molecule would lead to an error in the writer.
>
> -greg
>
>
> ------------------------------------------------------------------------------
> Live Security Virtual Conference
> Exclusive live event will cover all the ways today's security and
> threat landscape has changed and how IT managers can respond. Discussions
> will include endpoint security, mobile security and the latest in malware
> threats. http://www.accelacomm.com/jaw/sfrnl04242012/114/50122263/
> _______________________________________________
> Rdkit-discuss mailing list
> Rdkit-discuss@lists.sourceforge.net
> https://lists.sourceforge.net/lists/listinfo/rdkit-discuss
>