Re: [Rdkit-discuss] SanitizeMol changing drawing

2017-12-14 Thread Jason Biggs
Greg,
That really helps!  The CIP rank property is the key I was looking for.
Now I can put in a check to see whether the property is defined before
calling prepareMolForDrawing and, if it isn't, call assignAtomCIPRanks
first.
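
In Python terms, that check might look like the sketch below. It assumes
the rank is stored under the atom property name `_CIPRank` (the thread does
not name the property), and it uses Chem.AssignStereochemistry with
force=True as a stand-in for assignAtomCIPRanks, which may not be directly
callable from Python. The helper names `has_cip_ranks` and
`prepare_for_drawing` are hypothetical.

```python
from rdkit import Chem
from rdkit.Chem.Draw import rdMolDraw2D

CIP_RANK_PROP = "_CIPRank"  # assumed property name, not given in the thread


def has_cip_ranks(mol):
    """True if every atom carries the (assumed) CIP rank property."""
    return all(atom.HasProp(CIP_RANK_PROP) for atom in mol.GetAtoms())


def prepare_for_drawing(mol):
    """Hypothetical helper: restore ranks if missing, then return a drawable copy."""
    if not has_cip_ranks(mol):
        # Per the thread, re-running stereochemistry assignment with
        # force=True regenerates the ranks that SanitizeMol cleared.
        Chem.AssignStereochemistry(mol, cleanIt=True, force=True)
    return rdMolDraw2D.PrepareMolForDrawing(mol)


m2 = Chem.MolFromSmiles("N[C@@H](C)C(=O)O")
Chem.SanitizeMol(m2)
drawable = prepare_for_drawing(m2)
```

The check keeps the expensive stereochemistry call off the common path, so
it only runs for molecules whose computed properties have been cleared.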

I was having issues where molecules created from a SMILES would not
round-trip through my internal molecule representation (in Mathematica) and
back perfectly, and this fixes most of those issues.

Thanks,

Jason

On Wed, Dec 13, 2017 at 11:26 PM, Greg Landrum wrote:

> Hi Jason,
>
> This is a nice one.
>
> Here's what's going on:
> The depiction code (the piece that generates 2D coordinates) attempts to
> generate "canonical" coordinates: it tries to generate the same
> coordinates for a molecule no matter what the input atom ordering is.
> In order to do that it needs a canonical numbering of the atoms (or at
> least something approximating one).
> The current code uses the calculated CIP ranks of the atoms as this
> canonical ordering. These ranks are generated as part of the standard
> stereochemistry assignment that is done on molecule construction and are
> stored as computed properties on the atoms. If the CIP ranks are not there
> it more or less gives up and just uses the atomic number.
> The call to SanitizeMol() clears the computed properties on atoms, thus
> blowing out the CIP rank information that the depiction code uses.
>
> If you want to resolve this, you can call
> Chem.AssignStereochemistry(m2,cleanIt=True, force=True) after you sanitize
> the molecule. Note that this can be a computationally expensive call, so
> you may not want to make a habit out of it.
>
> I'll create an issue to explore updating the depiction code and replacing
> the use of CIP ranks with the atom ranking generated by Nadine's
> canonicalization code.
>
> -greg
>
>
> On Wed, Dec 13, 2017 at 10:38 PM, Jason Biggs wrote:
>
>> Using the recent release,
>>
>>
>> import pickle
>> from rdkit import Chem
>>
>> m = Chem.MolFromSmiles("N[C@@H](C)C(=O)O")
>> m2 = Chem.MolFromSmiles("N[C@@H](C)C(=O)O")
>> Chem.rdmolops.SanitizeMol(m2)  # sanitize a second time
>>
>>
>>
>> The two molecules above seem identical - MolFromSmiles already performs a
>> sanitization so why wouldn't they be?  They produce the same pickle,
>>
>> pickle.dumps(m) == pickle.dumps(m2)
>>
>> True
>>
>>
>> So why do they get treated differently by the drawing code? The only way
>> to return m2 to its original state is to run AssignStereochemistry with
>> force=True.  What variable is being thrown off by SanitizeMol?
>>
>> [image: Inline image 1]
>>
>> Jason Biggs
>>
>>
>> 
>> --
>> Check out the vibrant tech community on one of the world's most
>> engaging tech sites, Slashdot.org! http://sdm.link/slashdot
>> ___
>> Rdkit-discuss mailing list
>> Rdkit-discuss@lists.sourceforge.net
>> https://lists.sourceforge.net/lists/listinfo/rdkit-discuss
>>
>>
>


Re: [Rdkit-discuss] SanitizeMol changing drawing

2017-12-13 Thread Greg Landrum
Hi Jason,

This is a nice one.

Here's what's going on:
The depiction code (the piece that generates 2D coordinates) attempts to
generate "canonical" coordinates: it tries to generate the same
coordinates for a molecule no matter what the input atom ordering is.
In order to do that it needs a canonical numbering of the atoms (or at
least something approximating one).
The current code uses the calculated CIP ranks of the atoms as this
canonical ordering. These ranks are generated as part of the standard
stereochemistry assignment that is done on molecule construction and are
stored as computed properties on the atoms. If the CIP ranks are not there
it more or less gives up and just uses the atomic number.
The call to SanitizeMol() clears the computed properties on atoms, thus
blowing out the CIP rank information that the depiction code uses.
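
One way to see this for yourself is to list the rank property on each atom
before and after the extra sanitization. This is a minimal sketch that
assumes the rank is stored under the atom property name `_CIPRank` (an
assumption, the name is not stated above); `cip_ranks` is a hypothetical
helper. Whether the ranks are actually present or cleared may depend on the
RDKit version, so no expected output is shown.

```python
from rdkit import Chem


def cip_ranks(mol, prop_name="_CIPRank"):
    """Return the per-atom value of prop_name, or None where it is not set."""
    ranks = []
    for atom in mol.GetAtoms():
        # Include private and computed properties, since the rank is a
        # computed property and would otherwise be filtered out.
        props = atom.GetPropsAsDict(includePrivate=True, includeComputed=True)
        ranks.append(props.get(prop_name))
    return ranks


m = Chem.MolFromSmiles("N[C@@H](C)C(=O)O")
m2 = Chem.MolFromSmiles("N[C@@H](C)C(=O)O")
Chem.SanitizeMol(m2)

print("as parsed:        ", cip_ranks(m))
print("after SanitizeMol:", cip_ranks(m2))
```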

If you want to resolve this, you can call
Chem.AssignStereochemistry(m2,cleanIt=True, force=True) after you sanitize
the molecule. Note that this can be a computationally expensive call, so
you may not want to make a habit out of it.
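
Put together, the workaround could be sketched as follows. It regenerates
2D coordinates for both molecules and compares them; `coords_2d` is a
hypothetical helper, and the comparison through NumPy is just one way to
check that the two depictions agree.

```python
import numpy as np
from rdkit import Chem
from rdkit.Chem import rdDepictor


def coords_2d(mol):
    """Generate fresh 2D coordinates and return them as an (N, 3) array."""
    rdDepictor.Compute2DCoords(mol)
    return mol.GetConformer().GetPositions()


smiles = "N[C@@H](C)C(=O)O"
m = Chem.MolFromSmiles(smiles)
m2 = Chem.MolFromSmiles(smiles)

# The extra sanitization clears the computed atom properties on m2 ...
Chem.SanitizeMol(m2)
# ... and forcing stereochemistry assignment recomputes them.
Chem.AssignStereochemistry(m2, cleanIt=True, force=True)

xy_m = coords_2d(m)
xy_m2 = coords_2d(m2)
same = bool(np.allclose(xy_m, xy_m2))
print("coordinates match after reassigning stereochemistry:", same)
```

Because the reassignment can be expensive, it is probably best reserved for
molecules that are about to be depicted.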

I'll create an issue to explore updating the depiction code and replacing
the use of CIP ranks with the atom ranking generated by Nadine's
canonicalization code.

-greg


On Wed, Dec 13, 2017 at 10:38 PM, Jason Biggs wrote:

> Using the recent release,
>
>
> import pickle
> from rdkit import Chem
>
> m = Chem.MolFromSmiles("N[C@@H](C)C(=O)O")
> m2 = Chem.MolFromSmiles("N[C@@H](C)C(=O)O")
> Chem.rdmolops.SanitizeMol(m2)  # sanitize a second time
>
>
>
> The two molecules above seem identical - MolFromSmiles already performs a
> sanitization so why wouldn't they be?  They produce the same pickle,
>
> pickle.dumps(m) == pickle.dumps(m2)
>
> True
>
>
> So why do they get treated differently by the drawing code? The only way
> to return m2 to its original state is to run AssignStereochemistry with
> force=True.  What variable is being thrown off by SanitizeMol?
>
> [image: Inline image 1]
>
> Jason Biggs


[Rdkit-discuss] SanitizeMol changing drawing

2017-12-13 Thread Jason Biggs
Using the recent release,


import pickle
from rdkit import Chem

m = Chem.MolFromSmiles("N[C@@H](C)C(=O)O")
m2 = Chem.MolFromSmiles("N[C@@H](C)C(=O)O")
Chem.rdmolops.SanitizeMol(m2)  # sanitize a second time



The two molecules above seem identical - MolFromSmiles already performs a
sanitization so why wouldn't they be?  They produce the same pickle,

pickle.dumps(m) == pickle.dumps(m2)

True


So why do they get treated differently by the drawing code? The only way to
return m2 to its original state is to run AssignStereochemistry with
force=True.  What variable is being thrown off by SanitizeMol?

[image: Inline image 1]

Jason Biggs