Re: [Rdkit-discuss] AllChem.ReplaceSubstructs question
Hi Markus,

That's a documentation bug and a pointer to a possibly useful new feature. Thanks for pointing it out.

The current behavior, by design, is that ReplaceSubstructs removes the atoms that match the pattern, adds the atoms from the replacement molecule, and then forms bonds from the first atom in the replacement molecule corresponding to the bonds from the equivalent atom (the first matched atom) in the original molecule.

I'll make the example from the docs asymmetrical so that it's a bit easier to see what's going on:

    In [44]: repl = Chem.MolFromSmiles('NC')
        ...: patt = Chem.MolFromSmarts('OC')
        ...: m = Chem.MolFromSmiles('ClCCOC')
        ...: rms = AllChem.ReplaceSubstructs(m,patt,repl)

    In [45]: Chem.MolToSmiles(rms[0])
    Out[45]: 'CCl.CNC'

    In [46]: Chem.MolToSmiles(rms[1])
    Out[46]: 'CNCCCl'

The first result corresponds to the pattern matching atoms 3 and 2 (numbered from zero) in m. Since only the bonds from the first matching atom (the O, atom 3) are restored, the bond that the matched C (atom 2) had to C1 in the molecule is not recreated, and we get disconnected fragments.

The second result corresponds to the pattern matching atoms 3 and 4. Here the bonds from the O are restored; this connects the N to C2 and we end up with a single molecule.

It would be possible to add an option that changes the behavior so that all bonds from matched atoms in the molecule are restored, but in the meantime the documentation definitely should be corrected.

Best,
-greg

On Fri, Mar 6, 2020 at 2:28 AM Markus Metz wrote:
> I am puzzled by the output from ReplaceSubstructs as it can produce two
> fragments.
> [...]
> Why is the first result produced?

___
Rdkit-discuss mailing list
Rdkit-discuss@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/rdkit-discuss
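The behavior Greg describes can be checked, and worked around, by dropping results that fall apart into several fragments. The following is only a sketch under that assumption; `is_connected` is a hypothetical helper name, not part of RDKit, and the filter simply discards the disconnected result rather than restoring the missing bond.

```python
from rdkit import Chem
from rdkit.Chem import AllChem

repl = Chem.MolFromSmiles('NC')
patt = Chem.MolFromSmarts('OC')
m = Chem.MolFromSmiles('ClCCOC')

# Same call as in the message above; it returns a tuple of result molecules.
rms = AllChem.ReplaceSubstructs(m, patt, repl)


def is_connected(mol):
    """Hypothetical helper: True if the molecule is a single fragment."""
    return len(Chem.GetMolFrags(mol)) == 1


# Split the results into connected and fragmented products.
connected = [Chem.MolToSmiles(rm) for rm in rms if is_connected(rm)]
fragmented = [Chem.MolToSmiles(rm) for rm in rms if not is_connected(rm)]

print('connected: ', connected)
print('fragmented:', fragmented)
```

Filtering on `Chem.GetMolFrags` is a pragmatic choice when only the single-molecule products are of interest; it does not change what ReplaceSubstructs itself does.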
[Rdkit-discuss] AllChem.ReplaceSubstructs question
Hello:

I am puzzled by the output from ReplaceSubstructs, as it can produce two fragments. So I tried the examples in the manual and observed the following.

The example in the intro manual with the recursive SMARTS pattern works as expected:

    repl = Chem.MolFromSmiles('OC')
    patt = Chem.MolFromSmarts('[$(NC(=O))]')
    m = Chem.MolFromSmiles('CC(=O)N')
    rms = AllChem.ReplaceSubstructs(m,patt,repl)

One expected product is formed.

The example from the ReplaceSubstructs manual produces two solutions. I have used the following commands:

    from rdkit import Chem
    from rdkit.Chem import AllChem

    # ReplaceSubstructs('CCOC','OC','NC') -> ('CCNC',)
    repl = Chem.MolFromSmiles('NC')
    patt = Chem.MolFromSmarts('OC')
    m = Chem.MolFromSmiles('CCOC')
    rms = AllChem.ReplaceSubstructs(m,patt,repl)

    for rm in rms:
        print(Chem.MolToSmiles(rm))

The output is: C.CNC and CCNC

Why is the first result produced? I checked the mailing list and found this older thread:
https://sourceforge.net/p/rdkit/mailman/message/28777648/
But the answer to the related question is missing.

The RDKit version is 2020.03.1dev1.

Best wishes,
Markus
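One way to see why the two manual examples behave differently is to look at the substructure matches themselves. The sketch below assumes that ReplaceSubstructs, with its default arguments, produces one result per match of the pattern, which is consistent with the outputs reported above; the variable names are illustrative only.

```python
from rdkit import Chem

# Recursive SMARTS: only the N is a matched atom; the C(=O) part is
# just an environment requirement and is not itself part of the match.
amide_n = Chem.MolFromSmarts('[$(NC(=O))]')
acetamide = Chem.MolFromSmiles('CC(=O)N')
amide_matches = acetamide.GetSubstructMatches(amide_n)

# Two-atom pattern: the O can be paired with either neighboring C.
ether = Chem.MolFromSmarts('OC')
mol = Chem.MolFromSmiles('CCOC')
ether_matches = mol.GetSubstructMatches(ether)

print('recursive pattern matches:', amide_matches)
print('OC pattern matches:       ', ether_matches)
```

The single-atom match leaves no second matched atom whose bonds could be lost, whereas the two-atom pattern matches twice, and in one of those matches the matched C carries a bond to the rest of the molecule.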
Re: [Rdkit-discuss] AllChem.ReplaceSubstructs
On Thu, Nov 7, 2013 at 4:06 PM, Igor Filippov wrote:
> Greg,
>
> Is it available in c++? Also, just to make sure - the argument is a list
> of old positions for each new position?

Yes, and yes. It's MolOps::renumberAtoms():
http://rdkit.org/docs/cppapi/namespaceRDKit_1_1MolOps.html#a96649cb8953d6b3ab6dfd43f639cafd1

-greg
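The "old position for each new position" convention confirmed above can be illustrated from Python with the same `NCO` example used elsewhere in this thread. This is a minimal sketch; `new_order`, `old_symbols` and `new_symbols` are names chosen here for illustration.

```python
from rdkit import Chem

m = Chem.MolFromSmiles('NCO')   # atom order: N(0), C(1), O(2)
new_order = (1, 2, 0)           # new atom i is taken from old atom new_order[i]
m2 = Chem.RenumberAtoms(m, new_order)

old_symbols = [a.GetSymbol() for a in m.GetAtoms()]
new_symbols = [a.GetSymbol() for a in m2.GetAtoms()]

# Each new position i holds the atom that used to sit at new_order[i].
for i, old_idx in enumerate(new_order):
    print('new atom', i, '=', new_symbols[i], '(was old atom', old_idx, ')')
```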
Re: [Rdkit-discuss] AllChem.ReplaceSubstructs
Greg,

Is it available in C++? Also, just to make sure: the argument is a list of old positions for each new position?

Thanks,
Igor

On Thu, Nov 7, 2013 at 8:41 AM, Greg Landrum wrote:
> Indeed there is, the functionality was added at the last minute to the
> 2013.09 release.
>
> In [4]: m2 = Chem.RenumberAtoms(m,(1,2,0))
> [...]
Re: [Rdkit-discuss] AllChem.ReplaceSubstructs
Dear Michal,

On Thu, Nov 7, 2013 at 12:46 PM, Michal Krompiec wrote:
> Hello again,
> I browsed through the sources and I found the answer to my question:
> the atom at index 0 from the replacement is used for the new bond. It
> would be nice to be able to specify the index of this bonding atom as
> a parameter in AllChem.ReplaceSubstructs.
>
> Is it possible to reorder atoms in a molecule (i.e. to have a chosen
> atom at index 0)?

Indeed it is; the functionality was added at the last minute to the 2013.09 release.

Here's how you use it:

    In [2]: m = Chem.MolFromSmiles('NCO')

    In [3]: print(Chem.MolToMolBlock(m))

         RDKit

      3  2  0  0  0  0  0  0  0  0999 V2000
        0.0000    0.0000    0.0000 N   0  0  0  0  0  0  0  0  0  0  0  0
        0.0000    0.0000    0.0000 C   0  0  0  0  0  0  0  0  0  0  0  0
        0.0000    0.0000    0.0000 O   0  0  0  0  0  0  0  0  0  0  0  0
      1  2  1  0
      2  3  1  0
    M  END

    In [4]: m2 = Chem.RenumberAtoms(m,(1,2,0))

    In [5]: print(Chem.MolToMolBlock(m2))

         RDKit

      3  2  0  0  0  0  0  0  0  0999 V2000
        0.0000    0.0000    0.0000 C   0  0  0  0  0  0  0  0  0  0  0  0
        0.0000    0.0000    0.0000 O   0  0  0  0  0  0  0  0  0  0  0  0
        0.0000    0.0000    0.0000 N   0  0  0  0  0  0  0  0  0  0  0  0
      3  1  1  0
      1  2  1  0
    M  END
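Putting the two pieces of this thread together, renumbering the replacement so that the desired atom sits at index 0 should steer where ReplaceSubstructs attaches it. The sketch below assumes exactly that; the molecule `CCCl`, the pattern `Cl` and the variable names are made up for illustration and do not come from the original messages.

```python
from rdkit import Chem
from rdkit.Chem import AllChem

m = Chem.MolFromSmiles('CCCl')       # illustrative molecule
patt = Chem.MolFromSmarts('Cl')      # illustrative pattern: replace the Cl
repl = Chem.MolFromSmiles('NCO')     # atom 0 is N

# Default: the bond is formed to atom 0 of the replacement (the N).
via_n = AllChem.ReplaceSubstructs(m, patt, repl)[0]

# Renumber so that the O becomes atom 0 (new order: O, C, N), then replace.
repl_o_first = Chem.RenumberAtoms(repl, (2, 1, 0))
via_o = AllChem.ReplaceSubstructs(m, patt, repl_o_first)[0]

smi_n = Chem.MolToSmiles(via_n)
smi_o = Chem.MolToSmiles(via_o)
print('attached via N:', smi_n)
print('attached via O:', smi_o)
```

The renumbered copy is a new molecule, so the original `repl` stays usable for other attachment points.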
Re: [Rdkit-discuss] AllChem.ReplaceSubstructs
Hello again,

I browsed through the sources and I found the answer to my question: the atom at index 0 from the replacement is used for the new bond. It would be nice to be able to specify the index of this bonding atom as a parameter in AllChem.ReplaceSubstructs.

Is it possible to reorder atoms in a molecule (i.e. to have a chosen atom at index 0)?

Best wishes,
Michal

On 7 November 2013 11:38, Michal Krompiec wrote:
> It is clear which atom from 'mol' is the joining atom, but which is
> the joining atom in 'replacement'? The atom with index=0?
> [...]
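For readers on newer RDKit releases: the Python signature of ReplaceSubstructs documents an optional `replacementConnectionPoint` argument, which appears to be the parameter requested here. The sketch below assumes an RDKit version that provides it (it is not confirmed anywhere in this 2013 thread), and the example molecules are illustrative only.

```python
from rdkit import Chem
from rdkit.Chem import AllChem

m = Chem.MolFromSmiles('CCCl')       # illustrative molecule
patt = Chem.MolFromSmarts('Cl')      # illustrative pattern
repl = Chem.MolFromSmiles('NCO')     # atoms: N(0), C(1), O(2)

# Assumption: this keyword exists in the installed RDKit version.
# Index 2 asks for the bond to be made to the O of the replacement.
rms = AllChem.ReplaceSubstructs(m, patt, repl, replacementConnectionPoint=2)
smi = Chem.MolToSmiles(rms[0])
print(smi)
```

Where the keyword is not available, reordering the replacement so the chosen atom is at index 0 achieves the same effect.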
[Rdkit-discuss] AllChem.ReplaceSubstructs
Hello,

I have a question about AllChem.ReplaceSubstructs(mol, query, replacement). As I understand it, it replaces the 'query' pattern in 'mol' with the 'replacement' fragment. It is clear which atom from 'mol' is the joining atom, but which is the joining atom in 'replacement'? The atom with index=0? Is it possible to specify which atom in the 'replacement' should be bonded to 'mol'? It would be lovely to be able to do so, because the only alternative (using reaction SMARTS) is much, much slower.

If not, is it possible to generate (from a given Mol object) a SMILES string starting from the specified atom index?

Thanks,
Michal
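On the last question, `Chem.MolToSmiles` accepts a `rootedAtAtom` argument that starts the SMILES at a given atom index. The sketch below assumes that re-parsing such a SMILES yields a molecule whose atom 0 is the root atom, which would make it usable as a replacement attached through that atom; the `NCO` example and variable names are illustrative.

```python
from rdkit import Chem

m = Chem.MolFromSmiles('NCO')            # atoms: N(0), C(1), O(2)

# Write the SMILES starting from atom 2 (the O).
smi = Chem.MolToSmiles(m, rootedAtAtom=2)
print(smi)

# Parsing keeps the atom order of the string, so the root atom is atom 0.
m_rooted = Chem.MolFromSmiles(smi)
first_symbol = m_rooted.GetAtomWithIdx(0).GetSymbol()
print('atom 0 after re-parsing:', first_symbol)
```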