Re: [Rdkit-discuss] Finding out the origin of product atoms after applying a reaction

2018-09-17 Thread Greg Landrum
As an FYI: the changes to carry reactant atom indices over into reaction
products are already merged and will be in the next release.

Here’s the issue connected to the changes:
https://github.com/rdkit/rdkit/issues/1269


Best,
-greg


On Mon, 17 Sep 2018 at 18:23, Ivan Tubert-Brohman <
ivan.tubert-broh...@schrodinger.com> wrote:

> Hi Connor,
>
> Thank you for your suggestions! I think the isotope hack will work for me
> for now, but for the longer term it would be nicer to have the official
> version of RDKit provide sufficient atom mapping information, so I'll
> consider that as well.
>
> Ivan
>
> On Mon, Sep 17, 2018 at 11:52 AM, Connor Coley wrote:
>
>> Hi Ivan,
>>
>> This is something I ran into a couple years ago - it's a pretty easy fix.
>>
>> One approach is to update the source with a few lines to copy over the
>> atom map numbers from the reactants to the products as a new field. You can
>> see the necessary changes to the code in my forked version here:
>> https://github.com/rdkit/rdkit/commit/0a8393fbf89e486ed67f2a44f9a7ea8d9f2efd95
>>
>> Another approach is more hacky but might be good enough for your use
>> case. If your reactions don't involve isotopic changes or require specific
>> isotopes, you can set unique isotope numbers for every reacting atom. Those
>> will be preserved in the products so you can get the atom-atom mapping
>> after running the reaction.
>>
>> Connor
>>
>> On Mon, Sep 17, 2018 at 10:36 AM Ivan Tubert-Brohman <
>> ivan.tubert-broh...@schrodinger.com> wrote:
>>
>>> I'd like to know where each atom in a reaction product came from, but as
>>> far as I can tell, RDKit doesn't provide enough information. Here's what I
>>> found out empirically so far.
>>>
>>> There are four kinds of product atoms:
>>>
>>> 1. New atoms: atoms are defined in the product template without a
>>> mapping number. These can't be mapped to reactant atoms, so there's no
>>> issue.
>>>
>>> 2. Unmatched atoms: atoms that were not matched by any atom from the
>>> reactant template but were carried over to the product because they were
>>> connected (directly or indirectly) to a mapped reactant atom. These have
>>> the "react_atom_idx" property, which holds the atom index of the atom in
>>> the reactant molecule. This is useful as long as the reactant side has only
>>> one molecule; when there are multiple molecules, it is not clear which
>>> reactant molecule a product atom came from.
>>>
>>> 3. Mapped dummy atoms: atoms which are defined as dummies in the product
>>> template and have a mapping number (e.g., [*:1]). These also get a
>>> "react_atom_idx" property, as well as an "old_mapno" property which holds
>>> the mapping number (1 for [*:1])
>>>
>>> 4. Mapped non-dummy atoms: atoms which are NOT defined as dummies in
>>> the product template and have a mapping number (e.g., [C:1]).  These
>>> have "old_mapno", but no "react_atom_idx".
>>>
>>> One thing I tried was to add my own unique atom identifier property to
>>> all reactant atoms before applying the reaction, but that didn't help
>>> because mapped atoms (types 3 and 4) don't preserve user properties (I
>>> guess these atoms are seen as "replacements" for the atoms they mapped,
>>> rather than "copies"?)
>>>
>>> I'd be willing to hack the RDKit source code if necessary and contribute
>>> my changes, but before starting such a project I'd like to hear if it's
>>> reasonable.
>>>
>>> One possibility would be to preserve user properties for mapped atoms.
>>>
>>> Alternatively we could add react_atom_idx to mapped non-dummy atoms, and
>>> also add a new "react_mol_idx" property?
>>>
>>> Best,
>>> Ivan
___
Rdkit-discuss mailing list
Rdkit-discuss@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/rdkit-discuss


Re: [Rdkit-discuss] Finding out the origin of product atoms after applying a reaction

2018-09-17 Thread Ivan Tubert-Brohman
Hi Connor,

Thank you for your suggestions! I think the isotope hack will work for me
for now, but for the longer term it would be nicer to have the official
version of RDKit provide sufficient atom mapping information, so I'll
consider that as well.

Ivan

On Mon, Sep 17, 2018 at 11:52 AM, Connor Coley wrote:

> Hi Ivan,
>
> This is something I ran into a couple years ago - it's a pretty easy fix.
>
> One approach is to update the source with a few lines to copy over the
> atom map numbers from the reactants to the products as a new field. You can
> see the necessary changes to the code in my forked version here:
> https://github.com/rdkit/rdkit/commit/0a8393fbf89e486ed67f2a44f9a7ea8d9f2efd95
>
> Another approach is more hacky but might be good enough for your use case.
> If your reactions don't involve isotopic changes or require specific
> isotopes, you can set unique isotope numbers for every reacting atom. Those
> will be preserved in the products so you can get the atom-atom mapping
> after running the reaction.
>
> Connor
>
> On Mon, Sep 17, 2018 at 10:36 AM Ivan Tubert-Brohman <
> ivan.tubert-broh...@schrodinger.com> wrote:
>
>> I'd like to know where each atom in a reaction product came from, but as
>> far as I can tell, RDKit doesn't provide enough information. Here's what I
>> found out empirically so far.
>>
>> There are four kinds of product atoms:
>>
>> 1. New atoms: atoms are defined in the product template without a mapping
>> number. These can't be mapped to reactant atoms, so there's no issue.
>>
>> 2. Unmatched atoms: atoms that were not matched by any atom from the
>> reactant template but were carried over to the product because they were
>> connected (directly or indirectly) to a mapped reactant atom. These have
>> the "react_atom_idx" property, which holds the atom index of the atom in
>> the reactant molecule. This is useful as long as the reactant side has only
>> one molecule; when there are multiple molecules, it is not clear which
>> reactant molecule a product atom came from.
>>
>> 3. Mapped dummy atoms: atoms which are defined as dummies in the product
>> template and have a mapping number (e.g., [*:1]). These also get a
>> "react_atom_idx" property, as well as an "old_mapno" property which holds
>> the mapping number (1 for [*:1])
>>
>> 4. Mapped non-dummy atoms: atoms which are NOT defined as dummies in the
>> product template and have a mapping number (e.g., [C:1]). These have
>> "old_mapno", but no "react_atom_idx".
>>
>> One thing I tried was to add my own unique atom identifier property to
>> all reactant atoms before applying the reaction, but that didn't help
>> because mapped atoms (types 3 and 4) don't preserve user properties (I
>> guess these atoms are seen as "replacements" for the atoms they mapped,
>> rather than "copies"?)
>>
>> I'd be willing to hack the RDKit source code if necessary and contribute
>> my changes, but before starting such a project I'd like to hear if it's
>> reasonable.
>>
>> One possibility would be to preserve user properties for mapped atoms.
>>
>> Alternatively we could add react_atom_idx to mapped non-dummy atoms, and
>> also add a new "react_mol_idx" property?
>>
>> Best,
>> Ivan


Re: [Rdkit-discuss] Finding out the origin of product atoms after applying a reaction

2018-09-17 Thread Connor Coley
Hi Ivan,

This is something I ran into a couple years ago - it's a pretty easy fix.

One approach is to update the source with a few lines to copy over the atom
map numbers from the reactants to the products as a new field. You can see
the necessary changes to the code in my forked version here:
https://github.com/rdkit/rdkit/commit/0a8393fbf89e486ed67f2a44f9a7ea8d9f2efd95

Another approach is more hacky but might be good enough for your use case.
If your reactions don't involve isotopic changes or require specific
isotopes, you can set unique isotope numbers for every reacting atom. Those
will be preserved in the products so you can get the atom-atom mapping
after running the reaction.
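
A minimal sketch of the isotope approach, assuming the reaction neither queries nor sets isotopes and does not change the element of any mapped atom. The helper name run_with_isotope_tags and the amide-coupling SMARTS are illustrative only and not part of RDKit or of the commit linked above.

```python
from rdkit import Chem
from rdkit.Chem import AllChem


def run_with_isotope_tags(rxn, reactants):
    """Run rxn on isotope-labelled copies of the reactants.

    Returns a list of (product, origins) pairs, where origins maps a
    product atom index to (reactant index, reactant atom index).
    """
    tagged = []
    tag_to_origin = {}
    next_tag = 1
    for r_idx, mol in enumerate(reactants):
        mol = Chem.Mol(mol)  # copy, so the caller's molecules stay untouched
        for atom in mol.GetAtoms():
            atom.SetIsotope(next_tag)  # temporary unique label
            tag_to_origin[next_tag] = (r_idx, atom.GetIdx())
            next_tag += 1
        tagged.append(mol)

    results = []
    for product_set in rxn.RunReactants(tuple(tagged)):
        for product in product_set:
            origins = {}
            for atom in product.GetAtoms():
                tag = atom.GetIsotope()
                if tag in tag_to_origin:
                    origins[atom.GetIdx()] = tag_to_origin[tag]
                    atom.SetIsotope(0)  # remove the temporary label
            results.append((product, origins))
    return results


# Illustrative reaction: carboxylic acid + amine -> amide.
rxn = AllChem.ReactionFromSmarts('[C:1](=[O:2])[OH].[N:3]>>[C:1](=[O:2])[N:3]')
acid = Chem.MolFromSmiles('CC(=O)O')
amine = Chem.MolFromSmiles('CN')
results = run_with_isotope_tags(rxn, [acid, amine])
for product, origins in results:
    print(Chem.MolToSmiles(product), origins)
```

Atoms created by the product template carry no label, so they are simply absent from origins.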

Connor

On Mon, Sep 17, 2018 at 10:36 AM Ivan Tubert-Brohman <
ivan.tubert-broh...@schrodinger.com> wrote:

> I'd like to know where each atom in a reaction product came from, but as
> far as I can tell, RDKit doesn't provide enough information. Here's what I
> found out empirically so far.
>
> There are four kinds of product atoms:
>
> 1. New atoms: atoms are defined in the product template without a mapping
> number. These can't be mapped to reactant atoms, so there's no issue.
>
> 2. Unmatched atoms: atoms that were not matched by any atom from the
> reactant template but were carried over to the product because they were
> connected (directly or indirectly) to a mapped reactant atom. These have
> the "react_atom_idx" property, which holds the atom index of the atom in
> the reactant molecule. This is useful as long as the reactant side has only
> one molecule; when there are multiple molecules, it is not clear which
> reactant molecule a product atom came from.
>
> 3. Mapped dummy atoms: atoms which are defined as dummies in the product
> template and have a mapping number (e.g., [*:1]). These also get a
> "react_atom_idx" property, as well as an "old_mapno" property which holds
> the mapping number (1 for [*:1])
>
> 4. Mapped non-dummy atoms: atoms which are NOT defined as dummies in the
> product template and have a mapping number (e.g., [C:1]). These have
> "old_mapno", but no "react_atom_idx".
>
> One thing I tried was to add my own unique atom identifier property to all
> reactant atoms before applying the reaction, but that didn't help because
> mapped atoms (types 3 and 4) don't preserve user properties (I guess these
> atoms are seen as "replacements" for the atoms they mapped, rather than
> "copies"?)
>
> I'd be willing to hack the RDKit source code if necessary and contribute
> my changes, but before starting such a project I'd like to hear if it's
> reasonable.
>
> One possibility would be to preserve user properties for mapped atoms.
>
> Alternatively we could add react_atom_idx to mapped non-dummy atoms, and
> also add a new "react_mol_idx" property?
>
> Best,
> Ivan


[Rdkit-discuss] Finding out the origin of product atoms after applying a reaction

2018-09-17 Thread Ivan Tubert-Brohman
I'd like to know where each atom in a reaction product came from, but as
far as I can tell, RDKit doesn't provide enough information. Here's what I
found out empirically so far.

There are four kinds of product atoms:

1. New atoms: atoms that are defined in the product template without a
mapping number. These can't be mapped to reactant atoms, so there's no issue.

2. Unmatched atoms: atoms that were not matched by any atom from the
reactant template but were carried over to the product because they were
connected (directly or indirectly) to a mapped reactant atom. These have
the "react_atom_idx" property, which holds the atom index of the atom in
the reactant molecule. This is useful as long as the reactant side has only
one molecule; when there are multiple molecules, it is not clear which
reactant molecule a product atom came from.

3. Mapped dummy atoms: atoms which are defined as dummies in the product
template and have a mapping number (e.g., [*:1]). These also get a
"react_atom_idx" property, as well as an "old_mapno" property which holds
the mapping number (1 for [*:1]).

4. Mapped non-dummy atoms: atoms which are NOT defined as dummies in the
product template and have a mapping number (e.g., [C:1]). These have
"old_mapno", but no "react_atom_idx".
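
The four cases above can be inspected with a short script along these lines. This is only a sketch: the helper name describe_product_atoms and the example reaction are made up for illustration, and which properties show up on mapped atoms may depend on the RDKit version.

```python
from rdkit import Chem
from rdkit.Chem import AllChem


def describe_product_atoms(product):
    """Return one (symbol, old_mapno, react_atom_idx) tuple per product atom.

    A missing property is reported as None.
    """
    rows = []
    for atom in product.GetAtoms():
        mapno = None
        if atom.HasProp('old_mapno'):
            mapno = int(atom.GetProp('old_mapno'))
        react_idx = None
        if atom.HasProp('react_atom_idx'):
            react_idx = int(atom.GetProp('react_atom_idx'))
        rows.append((atom.GetSymbol(), mapno, react_idx))
    return rows


# Illustrative reaction: carboxylic acid + amine -> amide.
rxn = AllChem.ReactionFromSmarts('[C:1](=[O:2])[OH].[N:3]>>[C:1](=[O:2])[N:3]')
reactants = (Chem.MolFromSmiles('CC(=O)O'), Chem.MolFromSmiles('CN'))
product = rxn.RunReactants(reactants)[0][0]
rows = describe_product_atoms(product)
for row in rows:
    print(row)
```

Rows with no old_mapno are the atoms that were carried over without being matched (case 2); rows with an old_mapno are the mapped atoms (cases 3 and 4).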

One thing I tried was to add my own unique atom identifier property to all
reactant atoms before applying the reaction, but that didn't help because
mapped atoms (types 3 and 4) don't preserve user properties (I guess these
atoms are seen as "replacements" for the atoms they mapped, rather than
"copies"?)
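
For reference, the experiment was roughly of the following shape. This is a sketch, not the exact code I ran: the property name origin_tag, the helper name, and the example reaction are placeholders, and the kept/lost split may differ between RDKit versions.

```python
from rdkit import Chem
from rdkit.Chem import AllChem


def split_by_surviving_prop(rxn, reactants, prop='origin_tag'):
    """Tag every reactant atom with a user property, run the reaction, and
    report which atoms of the first product still carry the property."""
    tagged = []
    for r_idx, mol in enumerate(reactants):
        mol = Chem.Mol(mol)  # work on a copy
        for atom in mol.GetAtoms():
            # Tag format "reactant index:atom index" is an arbitrary choice.
            atom.SetProp(prop, '%d:%d' % (r_idx, atom.GetIdx()))
        tagged.append(mol)

    product = rxn.RunReactants(tuple(tagged))[0][0]
    kept = {a.GetIdx(): a.GetProp(prop)
            for a in product.GetAtoms() if a.HasProp(prop)}
    lost = [a.GetIdx() for a in product.GetAtoms() if not a.HasProp(prop)]
    return kept, lost


# Illustrative reaction: carboxylic acid + amine -> amide.
rxn = AllChem.ReactionFromSmarts('[C:1](=[O:2])[OH].[N:3]>>[C:1](=[O:2])[N:3]')
reactants = [Chem.MolFromSmiles('CC(=O)O'), Chem.MolFromSmiles('CN')]
kept, lost = split_by_surviving_prop(rxn, reactants)
print('kept:', kept)
print('lost:', lost)
```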

I'd be willing to hack the RDKit source code if necessary and contribute my
changes, but before starting such a project I'd like to hear if it's
reasonable.

One possibility would be to preserve user properties for mapped atoms.

Alternatively we could add react_atom_idx to mapped non-dummy atoms, and
also add a new "react_mol_idx" property?
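
Until something like "react_mol_idx" exists, the reactant molecule can be approximated from the outside. The sketch below assumes that every mapped product atom has "old_mapno", that carried-over atoms have "react_atom_idx", and that carried-over atoms only bond to atoms from their own reactant. Both function names are hypothetical.

```python
from collections import deque

from rdkit import Chem
from rdkit.Chem import AllChem


def mapno_to_reactant_index(rxn):
    """Map each atom-map number to the index of the reactant template using it."""
    lookup = {}
    for t_idx in range(rxn.GetNumReactantTemplates()):
        for atom in rxn.GetReactantTemplate(t_idx).GetAtoms():
            if atom.GetAtomMapNum():
                lookup[atom.GetAtomMapNum()] = t_idx
    return lookup


def reactant_index_per_atom(rxn, product):
    """Return {product atom index: reactant index} for atoms that came from
    a reactant. Atoms created by the product template are left out."""
    lookup = mapno_to_reactant_index(rxn)
    owner = {}
    queue = deque()
    # Mapped atoms: the map number identifies the reactant template directly.
    for atom in product.GetAtoms():
        if atom.HasProp('old_mapno'):
            mapno = int(atom.GetProp('old_mapno'))
            if mapno in lookup:
                owner[atom.GetIdx()] = lookup[mapno]
                queue.append(atom.GetIdx())
    # Carried-over atoms: inherit the reactant of the atom they hang off.
    while queue:
        idx = queue.popleft()
        for nbr in product.GetAtomWithIdx(idx).GetNeighbors():
            carried = (nbr.HasProp('react_atom_idx')
                       and not nbr.HasProp('old_mapno'))
            if carried and nbr.GetIdx() not in owner:
                owner[nbr.GetIdx()] = owner[idx]
                queue.append(nbr.GetIdx())
    return owner


# Illustrative reaction: carboxylic acid + amine -> amide.
rxn = AllChem.ReactionFromSmarts('[C:1](=[O:2])[OH].[N:3]>>[C:1](=[O:2])[N:3]')
reactants = (Chem.MolFromSmiles('CC(=O)O'), Chem.MolFromSmiles('CN'))
product = rxn.RunReactants(reactants)[0][0]
owner = reactant_index_per_atom(rxn, product)
print(owner)
```

Combined with "react_atom_idx", this gives a (reactant molecule, atom index) pair for every atom that has both pieces of information.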

Best,
Ivan