Re: [Rdkit-discuss] Explicit hydrogens in substructure search
The question indeed made no sense, sorry for the confusion. Everything seems to be working now.

Thank you!

Andrew

15.01.2018 06:56, Greg Landrum wrote:
> I don't understand the question. The blog post I pointed you to earlier in
> the thread:
> http://rdkit.blogspot.ch/2016/07/tuning-substructure-queries-ii.html
> focuses on using this functionality with the cartridge.
>
> Did that not work for you or are you looking for something different?

___
Rdkit-discuss mailing list
Rdkit-discuss@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/rdkit-discuss
Re: [Rdkit-discuss] Explicit hydrogens in substructure search
On Thu, Jan 11, 2018 at 7:23 PM, Andrey wrote:
> I managed to get it working with the Python wrapper. Could you please give me
> an idea of how to implement it for the Postgres cartridge?

I don't understand the question. The blog post I pointed you to earlier in the thread:
http://rdkit.blogspot.ch/2016/07/tuning-substructure-queries-ii.html
focuses on using this functionality with the cartridge.

Did that not work for you or are you looking for something different?
Re: [Rdkit-discuss] Explicit hydrogens in substructure search
Hi Greg,

First of all, many thanks for all your help!
I managed to get it working with the Python wrapper. Could you please give me an idea of how to implement it for the Postgres cartridge?

Kind regards,

Andrew

13.12.2017 08:58, Greg Landrum wrote:
> The MergeQueryHs() functionality is primarily intended to be used for
> molecules where the Hs have been removed.
>
> -greg
Re: [Rdkit-discuss] Explicit hydrogens in substructure search
On Tue, Dec 12, 2017 at 7:28 PM, Andrey wrote:
> Does this depend on the removeHs() function? I mean, to make MergeQueryHs()
> work, should I set removeHs=False first for all compounds in my database, to
> preserve implicit/explicit hydrogens in their structure?

The MergeQueryHs() functionality is primarily intended to be used for molecules where the Hs have been removed.

-greg
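To illustrate the reply above, here is a minimal sketch of the alternative Andrey asks about: keeping explicit hydrogens on the database molecules. It assumes RDKit is installed, and it uses Chem.AddHs() to stand in for "database molecules stored with their Hs". That choice is an assumption made for illustration, not something stated in the thread. With explicit H atoms in the target, the unmerged query can match atom by atom, but every stored molecule carries extra atoms.

```python
from rdkit import Chem

# Query drawn with explicit hydrogens, parsed without removing them
# and NOT passed through MergeQueryHs().
params = Chem.SmilesParserParams()
params.removeHs = False
query = Chem.MolFromSmiles("C1=CC=NC(N([H])[H])=C1", params)

# Targets parsed the default way (Hs implicit), then given explicit Hs.
m1 = Chem.MolFromSmiles("C1=CC=NC(N)=C1")      # pyridin-2-amine
m2 = Chem.MolFromSmiles("C1=CC=NC(N(C))=C1")   # N-methyl analogue
m1_h = Chem.AddHs(m1)
m2_h = Chem.AddHs(m2)

# Explicit Hs enlarge the molecular graph that has to be stored and searched.
print("atoms without / with explicit Hs:", m1.GetNumAtoms(), m1_h.GetNumAtoms())

# The query's H atoms now have real H atoms to map onto.
print("m1 with Hs matches:", m1_h.HasSubstructMatch(query))
print("m2 with Hs matches:", m2_h.HasSubstructMatch(query))
```

This is why the merged-query route is the more natural fit for a database parsed with default settings: the hydrogen constraint lives in the query, and the stored molecules stay as they are.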
Re: [Rdkit-discuss] Explicit hydrogens in substructure search
Hi Greg,

Does this depend on the removeHs() function? I mean, to make MergeQueryHs() work, should I set removeHs=False first for all compounds in my database, to preserve implicit/explicit hydrogens in their structure?

Thank you!

Andrew

30.11.2017 22:26, Andrey wrote:
> That's awesome, many thanks for your help, Greg. The blog article is great too.
> I'll try this out and let you know if there's any success.
Re: [Rdkit-discuss] Explicit hydrogens in substructure search
That's awesome, many thanks for your help, Greg. The blog article is great too.
I'll try this out and let you know if there's any success.

Kind regards,

Andrew

30.11.2017 08:15, Greg Landrum wrote:
> The RDKit has a function called MergeQueryHs() that's intended to help out
> in cases like this.
> [...]
> You may also find this blog post and the links therein helpful:
> http://rdkit.blogspot.co.uk/2016/07/tuning-substructure-queries-ii.html
Re: [Rdkit-discuss] Explicit hydrogens in substructure search
Hi Andrey,

On Thu, Nov 30, 2017 at 1:17 AM, Andrey wrote:
> Dear RDKit community,
>
> I'm setting up a chemical search engine based on RDKit, and I have a
> question about accounting for explicit hydrogens.
> I'm using Ketcher and Marvin JS as molecular editors to draw structure
> queries for searching among ~100K compounds.
>
> Here are two example search queries:
>
> 1. C1=CC=NC(N)=C1
> 2. C1=CC=NC(N([H])[H])=C1
>
> Both queries are the same molecule (pyridin-2-amine), but query #2 has two
> explicitly indicated hydrogens in the NH2 group.
>
> In both cases, when I do a substructure search I get the same list of
> compounds with a substituted NH2 group, which is OK for query #1, but for
> query #2 the NH2 substitution should be avoided.
> It seems that the system (RDKit?) is not sensitive to explicitly indicated
> hydrogens, which makes the substructure search not efficient enough for my
> needs.

The RDKit has a function called MergeQueryHs() that's intended to help out in cases like this.
Here's a quick demo of how this works:

Start by building the query molecules:

In [5]: params = Chem.SmilesParserParams()

In [6]: params.removeHs=False

In [8]: p1 = Chem.MolFromSmiles('C1=CC=NC(N)=C1',params)

In [9]: p2 = Chem.MolFromSmiles('C1=CC=NC(N([H])[H])=C1',params)

In [10]: p3 = Chem.MergeQueryHs(p2)

Here are the two test molecules:

In [11]: m1 = Chem.MolFromSmiles('C1=CC=NC(N)=C1')

In [12]: m2 = Chem.MolFromSmiles('C1=CC=NC(N(C))=C1')

And the results:

In [13]: m1.HasSubstructMatch(p1)
Out[13]: True

In [14]: m1.HasSubstructMatch(p2)
Out[14]: False

In [15]: m1.HasSubstructMatch(p3)
Out[15]: True

In [16]: m2.HasSubstructMatch(p1)
Out[16]: True

In [17]: m2.HasSubstructMatch(p2)
Out[17]: False

In [18]: m2.HasSubstructMatch(p3)
Out[18]: False

You may also find this blog post and the links therein helpful:
http://rdkit.blogspot.co.uk/2016/07/tuning-substructure-queries-ii.html

I hope this helps,
-greg

> I'm new to RDKit and I'd very much appreciate any thoughts on how this problem
> could be solved. Are there any settings in RDKit related to this?
>
> Thank you in advance,
>
> Andrew
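The interactive session above can also be written as a standalone script. This is a minimal sketch, assuming RDKit is installed. The helper name parse_query and the results dictionary are illustrative names introduced here and are not part of RDKit; the RDKit calls themselves are the ones used in the session.

```python
from rdkit import Chem


def parse_query(smiles, merge_hs=False):
    """Parse a query SMILES, keeping any explicit [H] atoms it contains.

    With merge_hs=True the explicit hydrogens are folded into the heavy
    atoms as query constraints via Chem.MergeQueryHs().
    """
    params = Chem.SmilesParserParams()
    params.removeHs = False  # keep the hydrogens drawn in the editor
    query = Chem.MolFromSmiles(smiles, params)
    if query is None:
        raise ValueError(f"could not parse query SMILES: {smiles!r}")
    return Chem.MergeQueryHs(query) if merge_hs else query


# The three queries from the session: no explicit Hs, explicit Hs left
# as atoms, and explicit Hs merged into the heavy atoms.
queries = {
    "p1": parse_query("C1=CC=NC(N)=C1"),
    "p2": parse_query("C1=CC=NC(N([H])[H])=C1"),
    "p3": parse_query("C1=CC=NC(N([H])[H])=C1", merge_hs=True),
}

# Target molecules parsed with default settings (Hs removed).
targets = {
    "m1": Chem.MolFromSmiles("C1=CC=NC(N)=C1"),     # free NH2
    "m2": Chem.MolFromSmiles("C1=CC=NC(N(C))=C1"),  # N-methyl substituted
}

results = {
    (t_name, q_name): target.HasSubstructMatch(query)
    for t_name, target in targets.items()
    for q_name, query in queries.items()
}

for (t_name, q_name), hit in results.items():
    print(f"{t_name} vs {q_name}: {hit}")
```

Only the merged query p3 separates the two targets: it matches the free amine m1 and rejects the N-methyl compound m2, which is the behaviour asked for with query #2.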