On Dec 8, 2012, at 7:01 AM, Greg Landrum wrote:
> It's a pretty small API change, but there's a huge amount of code that needs 
> to be changed in the back and lots and lots of testing that has to be done, 
> so this is going to take a while.

Speaking of hydrogens, I came across a strange query (part of the
BindingDB_structure set from my Structure Query Collection).

     C1N[H]OC[H]1

Because of RDKit's sanitization, each '[H]' gets absorbed into a
neighboring atom. That removes the bonds that went through the hydrogens,
so what started as a ring with poor chemistry ends up as two linear
pieces with sane chemistry.

>>> from rdkit import Chem
>>> mol = Chem.MolFromSmiles("C1N[H]OC[H]1")
>>> Chem.MolToSmiles(mol)
'CN.OC'
>>> 

The above query is nonsensical, but I don't think sanitization should
modify the topology of the input.
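
To make "modify the topology" concrete, here is a minimal sketch of a
string-level check that flags when the output SMILES has a different
number of '.'-separated pieces than the input. The helper names are
hypothetical and not part of RDKit. It also assumes no ring closure
spans a '.' in either string, which holds for the strings shown above
but not for SMILES in general.

```python
def count_dot_fragments(smiles: str) -> int:
    """Count the '.'-separated pieces of a SMILES string.

    Assumption: no ring-closure digit pairs span a '.', so each piece
    is its own connected component. This is a heuristic, not a parser.
    """
    return smiles.count(".") + 1 if smiles else 0


def topology_changed(input_smiles: str, output_smiles: str) -> bool:
    """Return True when input and output differ in their piece count."""
    return count_dot_fragments(input_smiles) != count_dot_fragments(output_smiles)


# Input and output strings copied from the session above.
ring_was_split = topology_changed("C1N[H]OC[H]1", "CN.OC")

# Control case: an unchanged string should never be flagged.
control_flagged = topology_changed("CCO", "CCO")
```

A check like this only catches fragmentation. It would not notice a
bond that was removed without disconnecting the molecule, so comparing
atom and bond counts on the parsed molecules would be a stronger test.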

Divalent hydrogen does exist in a very small number of real records.
I once came across a couple of structures with a 4-membered boron/hydrogen
ring, like this:

>>> mol = Chem.MolFromSmiles("[B]1[H][B][H]1")
>>> Chem.MolToSmiles(mol)
'[BH].[BH]'
>>> 

I am unable to find a public record with that core, to act as a
more realistic test case.



                                Andrew
                                da...@dalkescientific.com



_______________________________________________
Rdkit-devel mailing list
Rdkit-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/rdkit-devel
