Re: [Rdkit-discuss] Finding out the origin of product atoms after applying a reaction
As an FYI: the changes to carry reactant atom indices over into reaction products are already merged and will be in the next release.

Here’s the issue connected to the changes:
https://github.com/rdkit/rdkit/issues/1269

Best,
-greg

On Mon, 17 Sep 2018 at 18:23, Ivan Tubert-Brohman <ivan.tubert-broh...@schrodinger.com> wrote:
> I think the isotope hack will work for me for now, but for the longer term
> it would be nicer to have the official version of RDKit provide sufficient
> atom mapping information, so I'll consider that as well.

___
Rdkit-discuss mailing list
Rdkit-discuss@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/rdkit-discuss
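On a release that includes those changes, the carried-over indices could be read back roughly as follows. This is only a sketch: it assumes the property keeps the name "react_atom_idx" used elsewhere in this thread, and `origins_from_props` and `product_origins` are hypothetical helper names, not RDKit functions.

```python
def origins_from_props(prop_dicts, key="react_atom_idx"):
    """Map product atom index -> reactant atom index.

    prop_dicts holds one property dict per product atom, in atom-index
    order. Atoms without the property (for example new atoms that come
    from the product template) are simply left out of the result.
    """
    return {i: props[key] for i, props in enumerate(prop_dicts) if key in props}


def product_origins(product, key="react_atom_idx"):
    """Same lookup, starting from an RDKit product molecule."""
    # Atom.GetPropsAsDict() returns the atom's properties as a Python dict.
    return origins_from_props(
        [atom.GetPropsAsDict() for atom in product.GetAtoms()], key
    )
```

The property lookup is kept separate from the RDKit call so that the mapping logic can be checked without running a reaction.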
Re: [Rdkit-discuss] Finding out the origin of product atoms after applying a reaction
Hi Connor,

Thank you for your suggestions! I think the isotope hack will work for me for now, but for the longer term it would be nicer to have the official version of RDKit provide sufficient atom mapping information, so I'll consider that as well.

Ivan

On Mon, Sep 17, 2018 at 11:52 AM, Connor Coley wrote:
> Another approach is more hacky but might be good enough for your use case.
> If your reactions don't involve isotopic changes or require specific
> isotopes, you can set unique isotope numbers for every reacting atom. Those
> will be preserved in the products so you can get the atom-atom mapping
> after running the reaction.
Re: [Rdkit-discuss] Finding out the origin of product atoms after applying a reaction
Hi Ivan,

This is something I ran into a couple years ago - it's a pretty easy fix.

One approach is to update the source with a few lines to copy over the atom map numbers from the reactants to the products as a new field. You can see the necessary changes to the code in my forked version here:
https://github.com/rdkit/rdkit/commit/0a8393fbf89e486ed67f2a44f9a7ea8d9f2efd95

Another approach is more hacky but might be good enough for your use case. If your reactions don't involve isotopic changes or require specific isotopes, you can set unique isotope numbers for every reacting atom. Those will be preserved in the products so you can get the atom-atom mapping after running the reaction.

Connor

On Mon, Sep 17, 2018 at 10:36 AM Ivan Tubert-Brohman <ivan.tubert-broh...@schrodinger.com> wrote:
> I'd like to know where each atom in a reaction product came from, but as
> far as I can tell, RDKit doesn't provide enough information.
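The isotope approach described above might look something like the sketch below. The helper names (`encode_label`, `decode_label`, `tag_reactants`, `trace_product`, `demo`) and the `stride` packing scheme are illustrative assumptions, not part of RDKit. The sketch also assumes the reactants carry no meaningful isotope labels of their own, since those would be overwritten.

```python
def encode_label(mol_idx, atom_idx, stride=1000):
    """Pack (reactant index, atom index) into one positive integer."""
    if not 0 <= atom_idx < stride - 1:
        raise ValueError("atom_idx does not fit in the chosen stride")
    # +1 so that a label of 0 still means "no isotope set"
    return mol_idx * stride + atom_idx + 1


def decode_label(label, stride=1000):
    """Inverse of encode_label: returns (reactant index, atom index)."""
    mol_idx, rest = divmod(label, stride)
    return mol_idx, rest - 1


def tag_reactants(mols, stride=1000):
    """Overwrite every reactant atom's isotope with a unique label."""
    for mol_idx, mol in enumerate(mols):
        for atom in mol.GetAtoms():
            atom.SetIsotope(encode_label(mol_idx, atom.GetIdx(), stride))


def trace_product(product, stride=1000, clear=True):
    """Return {product atom idx: (reactant idx, reactant atom idx)}."""
    origins = {}
    for atom in product.GetAtoms():
        label = atom.GetIsotope()
        if label:  # atoms created by the template carry no label
            origins[atom.GetIdx()] = decode_label(label, stride)
            if clear:
                atom.SetIsotope(0)  # remove the bookkeeping label
    return origins


def demo():
    """Illustrative usage; requires RDKit to be installed."""
    from rdkit import Chem
    from rdkit.Chem import AllChem

    rxn = AllChem.ReactionFromSmarts("[C:1](=[O:2])O.[N:3]>>[C:1](=[O:2])[N:3]")
    reactants = [Chem.MolFromSmiles("CC(=O)O"), Chem.MolFromSmiles("NC")]
    tag_reactants(reactants)
    for products in rxn.RunReactants(tuple(reactants)):
        for product in products:
            print(trace_product(product))
```

Packing the reactant index into the label is one way to tell apart atoms from different reactant molecules. The offset of one keeps a label of zero free to mean "unlabelled".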
[Rdkit-discuss] Finding out the origin of product atoms after applying a reaction
I'd like to know where each atom in a reaction product came from, but as far as I can tell, RDKit doesn't provide enough information. Here's what I found out empirically so far.

There are four kinds of product atoms:

1. New atoms: atoms that are defined in the product template without a mapping number. These can't be mapped to reactant atoms, so there's no issue.

2. Unmatched atoms: atoms that were not matched by any atom from the reactant template but were carried over to the product because they were connected (directly or indirectly) to a mapped reactant atom. These have the "react_atom_idx" property, which holds the atom index of the atom in the reactant molecule. This is useful as long as the reactant side has only one molecule; when there are multiple molecules, it is not clear which reactant molecule a product atom came from.

3. Mapped dummy atoms: atoms which are defined as dummies in the product template and have a mapping number (e.g., [*:1]). These also get a "react_atom_idx" property, as well as an "old_mapno" property which holds the mapping number (1 for [*:1]).

4. Mapped non-dummy atoms: atoms which are NOT defined as dummies in the product template and have a mapping number (e.g., [C:1]). These have "old_mapno", but no "react_atom_idx".

One thing I tried was to add my own unique atom identifier property to all reactant atoms before applying the reaction, but that didn't help because mapped atoms (types 3 and 4) don't preserve user properties (I guess these atoms are seen as "replacements" for the atoms they mapped, rather than "copies"?)

I'd be willing to hack the RDKit source code if necessary and contribute my changes, but before starting such a project I'd like to hear if it's reasonable.

One possibility would be to preserve user properties for mapped atoms.

Alternatively, we could add react_atom_idx to mapped non-dummy atoms, and also add a new "react_mol_idx" property?

Best,
Ivan
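The four cases listed in this message can be told apart by which of the two properties a product atom carries. The sketch below encodes only the behaviour observed in this post, so releases that set these properties differently would be classified differently. `classify` and `classify_atom` are hypothetical helper names.

```python
def classify(has_react_atom_idx, has_old_mapno):
    """Name the kind of product atom from the two property flags."""
    if has_old_mapno:
        # Mapped atoms: dummies also carry react_atom_idx, non-dummies do not.
        return "mapped dummy" if has_react_atom_idx else "mapped non-dummy"
    # Unmapped atoms: carried-over atoms have react_atom_idx, new ones have nothing.
    return "unmatched" if has_react_atom_idx else "new"


def classify_atom(atom):
    """Apply classify() to an RDKit atom taken from a reaction product."""
    return classify(atom.HasProp("react_atom_idx"), atom.HasProp("old_mapno"))
```

Calling `classify_atom` on each atom of a product returned by `RunReactants` gives a quick way to check these observations against a particular RDKit version.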