Re: [Rdkit-discuss] Double Bond Stereochemistry in the RDKit

2018-12-04 Thread Kovas Palunas
Ok cool!  I did actually just run into an issue while doing some tests: 
https://github.com/rdkit/rdkit/issues/2183.  This issue brings up a question 
for me about where the smiles writer actually looks for stereochem info when it 
decides what to write.

Also, I ran into the same snag earlier too Dan!

- Kovas

From: Dan Nealschneider 
Date: Tuesday, December 4, 2018 at 10:05 AM
To: "col...@gmail.com" 
Cc: Kovas Palunas , rdkit discuss 

Subject: Re: [Rdkit-discuss] Double Bond Stereochemistry in the RDKit

I've done some in-memory translation of molecules to ROMols, and have used #2 
without major problems. I do remember needing to make sure that the stereoatoms 
are in the correct order - that is, that the first stereoatom is bonded to the 
beginAtom of the bond. In Python, this is something like:

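# stereoatom1 must be a neighbor of the bond's begin atom
bond = mol.GetBondBetweenAtoms(begin, end)
if bond.GetBeginAtomIdx() != begin:
    # the bond is stored as end -> begin, so swap the reference atoms
    assert bond.GetBeginAtomIdx() == end
    stereoatom1, stereoatom2 = stereoatom2, stereoatom1
bond.SetStereoAtoms(stereoatom1, stereoatom2)
bond.SetStereo(stereo)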

- dan nealschneider

(né wandschneider)

Senior Developer
Schrödinger, Inc
Portland, OR



On Tue, Dec 4, 2018 at 7:02 AM Brian Cole <col...@gmail.com> wrote:
Hi Kovas,

For your use-case #2 should suffice, "set STEREOCIS/STEREOTRANS tags + manually 
set stereo atoms". This is what the EnumerateStereoisomers code does: 
https://github.com/rdkit/rdkit/blob/master/rdkit/Chem/EnumerateStereoisomers.py#L38

As to what the 'ground truth' is, that is a more difficult question, and I fear 
the answer may be 'none of them'. STEREOCIS/STEREOTRANS are rather recent 
additions to the RDKit API. While we strove to make sure STEREOCIS/STEREOTRANS 
are handled across the RDKit, there are probably bugs lurking in untested parts 
of the RDKit that don't handle them properly. However, I think those other APIs 
should be fixed to handle them properly, so please do report any problems you 
spot in the GitHub issue tracker.

Cheers,
Brian



On Mon, Dec 3, 2018 at 7:00 PM Kovas Palunas <kovas.palu...@arzeda.com> wrote:
Hi All,

I’m looking for a bit more clarity regarding double bond stereochem in RDKit.  
My understanding is that there are currently 3 ways to store this 
information:


  1.  STEREOE/STEREOZ tags + stereo atoms on either side of bond set by CIP 
ranks, as computed when calling MolFromSmiles to make a new molecule or 
AssignStereochemistry on an existing molecule
  2.  Manually set STEREOCIS/STEREOTRANS tags + manually set stereo atoms
  3.  ENDUPRIGHT/etc. single bond directionality tags, which are set when 
reading a molecule from smiles/inchi/mol file
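
A minimal sketch of how representations 1 and 3 can be inspected on a molecule parsed from SMILES. It assumes RDKit's default stereo perception; the example molecule (1,2-difluoroethene) and the variable names are only illustrative, and whether the single-bond directions are still present after parsing may depend on the RDKit version.

```python
from rdkit import Chem

# Illustrative molecule: atoms are F0, C1, C2, F3.
mol = Chem.MolFromSmiles("F/C=C/F")

double_bond = mol.GetBondBetweenAtoms(1, 2)

# Representation 1: stereo tag plus reference (stereo) atoms on the double bond.
stereo_tag = double_bond.GetStereo()
stereo_atoms = list(double_bond.GetStereoAtoms())

# Representation 3: direction flags stored on the bonds around the double bond.
bond_dirs = {b.GetIdx(): b.GetBondDir() for b in mol.GetBonds()}

print("stereo tag:", stereo_tag)
print("stereo atoms:", stereo_atoms)
print("bond directions:", bond_dirs)
```

Printing all three side by side makes it easier to see which pieces of information a given downstream function is likely to be reading.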

Is one of these methods the “ground truth” that is looked for by RDKit 
functions that care about this info, like the substructure matching code or the 
SMILES writing code?

I am currently working on code that mutates molecules using a predetermined 
list of changes to be made to the molecule.  I’d like to be able to include 
bond stereochemistry changing/creation/destruction here, and was thinking of 
doing so using the STEREOCIS/STEREOTRANS tags (and also providing the reference 
stereo atoms).  Before I do this I want to make sure that molecules with these 
tags will be handled correctly by other RDKit functions downstream.  Would 
these tags be a good choice here?  Are there any caveats I should keep in mind 
as I work with this information?

Thanks!

- Kovas

___
Rdkit-discuss mailing list
Rdkit-discuss@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/rdkit-discuss


Re: [Rdkit-discuss] Double Bond Stereochemistry in the RDKit

2018-12-04 Thread Dan Nealschneider
I've done some in-memory translation of molecules to ROMols, and have used
#2 without major problems. I do remember needing to make sure that the
stereoatoms are in the correct order - that is, that the first stereoatom
is bonded to the beginAtom of the bond. In Python, this is something like:

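# stereoatom1 must be a neighbor of the bond's begin atom
bond = mol.GetBondBetweenAtoms(begin, end)
if bond.GetBeginAtomIdx() != begin:
    # the bond is stored as end -> begin, so swap the reference atoms
    assert bond.GetBeginAtomIdx() == end
    stereoatom1, stereoatom2 = stereoatom2, stereoatom1
bond.SetStereoAtoms(stereoatom1, stereoatom2)
bond.SetStereo(stereo)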
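
For reference, here is a self-contained sketch of the same idea. The helper name set_double_bond_stereo and the test molecule (1,2-difluoroethene) are illustrative choices, not part of the RDKit API; only GetBondBetweenAtoms, SetStereoAtoms, SetStereo and MolToSmiles are RDKit calls.

```python
from rdkit import Chem


def set_double_bond_stereo(mol, begin, end, stereoatom1, stereoatom2, stereo):
    """Tag the begin-end double bond; stereoatom1 must be a neighbor of `begin`."""
    bond = mol.GetBondBetweenAtoms(begin, end)
    if bond is None:
        raise ValueError("no bond between atoms %d and %d" % (begin, end))
    # The first stereo atom has to be bonded to the bond's own begin atom,
    # which may be the atom the caller passed as `end`.
    if bond.GetBeginAtomIdx() != begin:
        stereoatom1, stereoatom2 = stereoatom2, stereoatom1
    bond.SetStereoAtoms(stereoatom1, stereoatom2)
    bond.SetStereo(stereo)


mol = Chem.MolFromSmiles("FC=CF")  # atoms: F0, C1, C2, F3
# Pass the bond "backwards" (2 -> 1) so that the swap branch is exercised.
set_double_bond_stereo(mol, 2, 1, 3, 0, Chem.BondStereo.STEREOCIS)
cis_smiles = Chem.MolToSmiles(mol)
print(cis_smiles)
```

Comparing the result against the canonical SMILES of a molecule parsed from an explicitly cis SMILES is a cheap way to check that the tag was picked up by the SMILES writer.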

- dan nealschneider

(né wandschneider)

Senior Developer
Schrödinger, Inc
Portland, OR




On Tue, Dec 4, 2018 at 7:02 AM Brian Cole wrote:

> Hi Kovas,
>
> For your use-case #2 should suffice, "set STEREOCIS/STEREOTRANS tags +
> manually set stereo atoms". This is what the EnumerateStereoisomers code
> does:
> https://github.com/rdkit/rdkit/blob/master/rdkit/Chem/EnumerateStereoisomers.py#L38
>
> As to what the 'ground truth' is, that is a more difficult question, and I
> fear the answer may be 'none of them'. STEREOCIS/STEREOTRANS are rather
> recent additions to the RDKit API. While we strove to make sure
> STEREOCIS/STEREOTRANS are handled across the RDKit, there are probably bugs
> lurking in untested parts of the RDKit that don't handle them properly.
> However, I think those other APIs should be fixed to handle them properly,
> so please do report any problems you spot in the GitHub issue tracker.
>
> Cheers,
> Brian
>
>
>
> On Mon, Dec 3, 2018 at 7:00 PM Kovas Palunas 
> wrote:
>
>> Hi All,
>>
>>
>>
>> I’m looking for a bit more clarity regarding double bond stereochem in
>> RDKit.  My understanding is that there are currently 3 ways to store
>> this information:
>>
>>
>>
>>    1. STEREOE/STEREOZ tags + stereo atoms on either side of bond set by
>>       CIP ranks, as computed when calling MolFromSmiles to make a new
>>       molecule or AssignStereochemistry on an existing molecule
>>    2. Manually set STEREOCIS/STEREOTRANS tags + manually set stereo atoms
>>    3. ENDUPRIGHT/etc. single bond directionality tags, which are set
>>       when reading a molecule from smiles/inchi/mol file
>>
>>
>>
>> Is one of these methods the “ground truth” that is looked for by RDKit
>> functions that care about this info, like the substructure matching code or
>> the SMILES writing code?
>>
>>
>>
>> I am currently working on code that mutates molecules using a
>> predetermined list of changes to be made to the molecule.  I’d like to be
>> able to include bond stereochemistry changing/creation/destruction here,
>> and was thinking of doing so using the STEREOCIS/STEREOTRANS tags (and also
>> providing the reference stereo atoms).  Before I do this I want to make
>> sure that molecules with these tags will be handled correctly by other
>> RDKit functions downstream.  Would these tags be a good choice here?  Are
>> there any caveats I should keep in mind as I work with this information?
>>
>>
>>
>> Thanks!
>>
>>
>>
>> - Kovas
>>
>>


Re: [Rdkit-discuss] Bond tags in SVGs

2018-12-04 Thread Lukas Pravda
Hi Greg,

that’s what I thought, unfortunately. Essentially, I want to color the 
molecule in a web browser with various annotations and make it interactive. 
For that I’m converting it to the d3.js internal representation 
(https://d3js.org/) and connecting it to its environment. For the most part 
I’m fine with the atom positions that the tag property puts in the SVG.

What I want to avoid is replicating the RDKit SVG drawing code in JavaScript, 
so I don’t want to consume a dump of the rdkit.Mol object. Instead, I want to 
take the existing SVG images and parse them into d3.js, for which I need to 
know which paths belong to which bond.

At this point my only idea is to color the bonds individually and, based on 
overlay/proximity, use a kd-tree to reverse-engineer which bond each path 
belongs to, which is a bit of overkill in my view.

Lukas

From: Greg Landrum 
Date: Tuesday, 4 December 2018 at 17:24
To: Lukas Pravda 
Cc: RDKIT mailing list 
Subject: Re: [Rdkit-discuss] Bond tags in SVGs

 

Hi Lukas,

There's currently no way to do this. The closest you can get is by calling 
AddMoleculeMetadata():

In [6]: d = Draw.MolDraw2DSVG(200,200)

In [8]: d.DrawMolecule(nm)

In [10]: d.AddMoleculeMetadata(nm)

In [11]: d.FinishDrawing()

In [12]: svg = d.GetDrawingText()

In [14]: print(svg)

[SVG output not preserved in the archive; the only surviving fragment is 
http://www.rdkit.org/xml; version="0.9"> from the RDKit metadata block]
 

This gets you the information you need to connect bond indices to the atoms, 
but I suspect that's not what you're looking for.

 

In general you are guaranteed that the order of the bonds in the output SVG is 
the same as the order in the input molecule, but you can have multiple paths 
for a given bond. For example here, where the end atoms have different colors:

 

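In [25]: print(svg)

[SVG output not preserved in the archive; the surviving fragments are the atom 
label OH and http://www.rdkit.org/xml; version="0.9"> from the RDKit metadata 
block]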
 

What are you looking to be able to do? Knowing that may make it easier to 
either come up with a workaround or figure out what a new feature might look 
like.

 

-greg

 

 

 

 

On Mon, Dec 3, 2018 at 6:57 PM Lukas Pravda wrote:

Hi all,

 

I was wondering if there is a way to tag <path> elements (bonds) in the SVG 
created by RDKit.

i.e. transform something like this:

[SVG snippet not preserved in the archive]

Into:

[SVG snippet not preserved in the archive]

Or similar. I’ve found that atoms can be tagged in the SVG using the 
includeAtomTags property exposed by the 
Draw.rdMolDraw2D.MolDraw2DSVG.drawOptions() method. This renders the following 
additional elements into the SVG:

<rdkit:atom idx="4" label="O-" x="153.479" y="82.8259" />

But I have not seen anything like this for bonds (latest release of RDKit and 
Python). Thanks in advance for any hints. I was wondering about using 
highlightBondLists and then inferring the bond annotation from the SVG, but 
that seems to be a bit of overkill.

 

Cheers,

Lukas

 

 



Re: [Rdkit-discuss] Bond tags in SVGs

2018-12-04 Thread Greg Landrum
Hi Lukas,

There's currently no way to do this. The closest you can get is by calling
AddMoleculeMetadata():

In [6]: d = Draw.MolDraw2DSVG(200,200)

In [8]: d.DrawMolecule(nm)

In [10]: d.AddMoleculeMetadata(nm)

In [11]: d.FinishDrawing()

In [12]: svg = d.GetDrawingText()

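In [14]: print(svg)

[SVG output not preserved in the archive; the only surviving fragment is
http://www.rdkit.org/xml; version="0.9"> from the RDKit metadata block]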

This gets you the information you need to connect bond indices to the
atoms, but I suspect that's not what you're looking for.
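
As a rough sketch of how that metadata could be read back, the snippet below collects every bond element in the http://www.rdkit.org/xml namespace with the standard library only. SAMPLE_SVG is a hand-written stand-in rather than real RDKit output, and the attribute names in it (begin-atom-idx, end-atom-idx, bond-smiles) are assumptions; the helper bond_metadata is hypothetical and just returns whatever attributes it finds.

```python
import xml.etree.ElementTree as ET

RDKIT_NS = "http://www.rdkit.org/xml"

# Hand-written stand-in for a drawing with a metadata block. The attribute
# names on rdkit:atom and rdkit:bond are assumptions, not verified output.
SAMPLE_SVG = """<svg xmlns="http://www.w3.org/2000/svg" width="200px" height="200px">
<path d="M 10,10 L 50,50" style="stroke:#000000" />
<metadata>
<rdkit:mol xmlns:rdkit="http://www.rdkit.org/xml" version="0.9">
<rdkit:atom idx="1" atom-smiles="[CH3]" />
<rdkit:atom idx="2" atom-smiles="[OH]" />
<rdkit:bond idx="1" begin-atom-idx="1" end-atom-idx="2" bond-smiles="-" />
</rdkit:mol>
</metadata>
</svg>"""


def bond_metadata(svg_text):
    """Map each rdkit:bond idx to a dict of all attributes on that element."""
    root = ET.fromstring(svg_text)
    bond_tag = "{%s}bond" % RDKIT_NS
    return {el.get("idx"): dict(el.attrib) for el in root.iter(bond_tag)}


bonds = bond_metadata(SAMPLE_SVG)
print(bonds)
```

With a real drawing you would pass the text returned by GetDrawingText() instead of SAMPLE_SVG. It is worth checking whether the indices in the metadata are 0- or 1-based before mapping them back to the molecule.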

In general you are guaranteed that the order of the bonds in the output SVG
is the same as the order in the input molecule, but you can have multiple
paths for a given bond. For example here, where the end atoms have
different colors:

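In [25]: print(svg)

[SVG output not preserved in the archive; the surviving fragments are the atom
label OH and http://www.rdkit.org/xml; version="0.9"> from the RDKit metadata
block]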

What are you looking to be able to do? Knowing that may make it easier to
either come up with a workaround or figure out what a new feature might look
like.

-greg





On Mon, Dec 3, 2018 at 6:57 PM Lukas Pravda wrote:

> Hi all,
>
>
>
> I was wondering if there is a way to tag <path> elements (bonds) in the
> SVG created by RDKit.
>
> i.e. transform something like this:
>
> [SVG snippet not preserved in the archive]
>
> Into:
>
> [SVG snippet not preserved in the archive]
>
> Or similar. I’ve found that atoms can be tagged in the SVG using the
> includeAtomTags property exposed by the
> Draw.rdMolDraw2D.MolDraw2DSVG.drawOptions() method. This renders the
> following additional elements into the SVG:
>
> <rdkit:atom idx="4" label="O-" x="153.479" y="82.8259" />
>
> But I have not seen anything like this for bonds (latest release of RDKit
> and Python). Thanks in advance for any hints. I was wondering about using
> highlightBondLists and then inferring the bond annotation from the SVG,
> but that seems to be a bit of overkill.
>
>
>
> Cheers,
>
> Lukas
>
>
>
>


Re: [Rdkit-discuss] Double Bond Stereochemistry in the RDKit

2018-12-04 Thread Brian Cole
Hi Kovas,

For your use-case #2 should suffice, "set STEREOCIS/STEREOTRANS tags +
manually set stereo atoms". This is what the EnumerateStereoisomers code
does:
https://github.com/rdkit/rdkit/blob/master/rdkit/Chem/EnumerateStereoisomers.py#L38
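
A short sketch of what that looks like from the caller's side, assuming default enumeration options; 1,2-difluoroethene is just an illustrative input with one unspecified double bond.

```python
from rdkit import Chem
from rdkit.Chem.EnumerateStereoisomers import EnumerateStereoisomers

# One double bond with unspecified stereo, so two isomers are expected.
mol = Chem.MolFromSmiles("FC=CF")

# Each enumerated isomer is a copy of the input with its stereo assigned.
isomers = sorted(Chem.MolToSmiles(iso) for iso in EnumerateStereoisomers(mol))
for smi in isomers:
    print(smi)
```

If the SMILES writer honors the tags, the two strings should correspond to the cis and trans forms.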

As to what the 'ground truth' is, that is a more difficult question, and I
fear the answer may be 'none of them'. STEREOCIS/STEREOTRANS are rather
recent additions to the RDKit API. While we strove to make sure
STEREOCIS/STEREOTRANS are handled across the RDKit, there are probably bugs
lurking in untested parts of the RDKit that don't handle them properly.
However, I think those other APIs should be fixed to handle them properly,
so please do report any problems you spot in the GitHub issue tracker.

Cheers,
Brian



On Mon, Dec 3, 2018 at 7:00 PM Kovas Palunas wrote:

> Hi All,
>
>
>
> I’m looking for a bit more clarity regarding double bond stereochem in
> RDKit.  My understanding is that there are currently 3 ways to store
> this information:
>
>
>
>    1. STEREOE/STEREOZ tags + stereo atoms on either side of bond set by
>       CIP ranks, as computed when calling MolFromSmiles to make a new
>       molecule or AssignStereochemistry on an existing molecule
>    2. Manually set STEREOCIS/STEREOTRANS tags + manually set stereo atoms
>    3. ENDUPRIGHT/etc. single bond directionality tags, which are set when
>       reading a molecule from smiles/inchi/mol file
>
>
>
> Is one of these methods the “ground truth” that is looked for by RDKit
> functions that care about this info, like the substructure matching code or
> the SMILES writing code?
>
>
>
> I am currently working on code that mutates molecules using a
> predetermined list of changes to be made to the molecule.  I’d like to be
> able to include bond stereochemistry changing/creation/destruction here,
> and was thinking of doing so using the STEREOCIS/STEREOTRANS tags (and also
> providing the reference stereo atoms).  Before I do this I want to make
> sure that molecules with these tags will be handled correctly by other
> RDKit functions downstream.  Would these tags be a good choice here?  Are
> there any caveats I should keep in mind as I work with this information?
>
>
>
> Thanks!
>
>
>
> - Kovas
>
>