On Fri, Jun 29, 2012 at 7:45 AM, Greg Landrum <greg.land...@gmail.com> wrote:
>
> The test I devised was the following :
>
> 1) Read a molecule from the sdf
> 2) generate canonical smiles csmi
> 3) Parse csmi to give a new molecule
> 4) generate a new canonical smiles and make sure it matches csmi
> 5) Pick 5 random atoms in the molecule and, for each one:
>    5a) generate a non-canonical smiles rooted at that atom
>    5b) parse that non-canonical smiles to give a new molecule
>    5c) generate a new canonical smiles from that and make sure it matches csmi
>

<snip>

> If anyone has recommendations for alternate test methodologies or test
> sets, please let me know. These tests aren't exactly super fast, so
> I'd like to avoid something like "just run the {pubchem, emolecules,
> full ZINC} set", but if people are convinced that's necessary, I can
> set it up and run it.

Yesterday I successfully ran the same test across 500K compounds
randomly selected from the ZINC "Drugs Now" set
(http://zinc.docking.org/subsets/drugs-now).
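
For readers who want to reproduce the round-trip test quoted above, here is a minimal sketch using the RDKit Python API. It is not the script that was actually run. The function name roundtrip_ok, the fixed random seed, and sampling the root atoms without replacement are my assumptions.

```python
import random

from rdkit import Chem


def roundtrip_ok(mol, n_roots=5, rng=None):
    """Return True if canonical SMILES survives the round trips (hypothetical helper)."""
    rng = rng or random.Random(0)  # fixed seed is an assumption, for reproducibility

    # Steps 2-4: canonical SMILES -> new molecule -> canonical SMILES again.
    csmi = Chem.MolToSmiles(mol, isomericSmiles=True)
    m2 = Chem.MolFromSmiles(csmi)
    if m2 is None or Chem.MolToSmiles(m2, isomericSmiles=True) != csmi:
        return False

    # Step 5: non-canonical SMILES rooted at up to n_roots randomly chosen atoms.
    n_atoms = mol.GetNumAtoms()
    for idx in rng.sample(range(n_atoms), min(n_roots, n_atoms)):
        smi = Chem.MolToSmiles(mol, isomericSmiles=True,
                               canonical=False, rootedAtAtom=idx)
        m3 = Chem.MolFromSmiles(smi)
        if m3 is None or Chem.MolToSmiles(m3, isomericSmiles=True) != csmi:
            return False
    return True
```

In a real run the molecules would come from the SDF (for example via Chem.SDMolSupplier), skipping any entries that fail to parse.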

I also created a second testing approach:

1) Read a molecule m1 from the SDF
2) Generate the canonical SMILES csmi
3) Parse csmi to give a new molecule m2
4) Make sure all chiral centers in m1 and m2 have the same CIP code
   and that all double bonds where stereochemistry is indicated have
   the same stereochemistry
5) Pick 5 random atoms in the molecule and, for each one:
   5a) generate a non-canonical SMILES rooted at that atom
   5b) parse that non-canonical SMILES to give a new molecule m3
   5c) make sure all chiral centers in m1 and m3 have the same CIP
       code and that all double bonds where stereochemistry is
       indicated have the same stereochemistry


This ran without failures across the 500K ZINC compounds.
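
A rough sketch of this second approach with the RDKit Python API follows. It is not the script that was actually run, and it is a simplification: atom order changes when a SMILES is re-parsed, so instead of matching centers atom by atom it compares counts of CIP labels and of E/Z double-bond labels, which is a weaker check than the one described. The names stereo_summary and stereo_preserved are hypothetical.

```python
from collections import Counter
import random

from rdkit import Chem

# Double-bond labels that count as "stereochemistry is indicated".
_EZ = (Chem.rdchem.BondStereo.STEREOE, Chem.rdchem.BondStereo.STEREOZ)


def stereo_summary(mol):
    """Count CIP labels on atoms and E/Z labels on double bonds (hypothetical helper)."""
    Chem.AssignStereochemistry(mol, cleanIt=True, force=True)
    cip = Counter(a.GetProp('_CIPCode') for a in mol.GetAtoms()
                  if a.HasProp('_CIPCode'))
    ez = Counter(str(b.GetStereo()) for b in mol.GetBonds()
                 if b.GetStereo() in _EZ)
    return cip, ez


def stereo_preserved(m1, n_roots=5, rng=None):
    """Return True if the stereo label counts survive the SMILES round trips."""
    rng = rng or random.Random(0)  # fixed seed is an assumption
    ref = stereo_summary(m1)

    # Steps 2-4: compare m1 with m2 parsed from the canonical SMILES.
    m2 = Chem.MolFromSmiles(Chem.MolToSmiles(m1, isomericSmiles=True))
    if m2 is None or stereo_summary(m2) != ref:
        return False

    # Step 5: compare m1 with m3 parsed from rooted, non-canonical SMILES.
    n_atoms = m1.GetNumAtoms()
    for idx in rng.sample(range(n_atoms), min(n_roots, n_atoms)):
        smi = Chem.MolToSmiles(m1, isomericSmiles=True,
                               canonical=False, rootedAtAtom=idx)
        m3 = Chem.MolFromSmiles(smi)
        if m3 is None or stereo_summary(m3) != ref:
            return False
    return True
```

A stricter version would map atoms between m1 and the re-parsed molecule and compare the labels center by center.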

I've got some confidence now that the code is correct, so I'm merging
it onto the trunk this morning.

-greg

_______________________________________________
Rdkit-devel mailing list
Rdkit-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/rdkit-devel
