That's awesome, many thanks for your help Greg. The blog article is great too. 
I'll try this out and let you know if there's any success.

Kind regards,

Andrew


30.11.2017 08:15, Greg Landrum <greg.land...@gmail.com>
>Hi Andrey,
> 
> 
> On Thu, Nov 30, 2017 at 1:17 AM, Andrey <pti...@ua.fm> wrote:
> 
> > Dear RDKit community,
> >
> > I'm setting up a chemical search engine based on RDKit, and I have a
> > question about accounting for explicit hydrogens.
> > I'm using Ketcher and Marvin JS as molecular editors to draw structure
> > queries for searching among ~100K compounds.
> >
> > Here are two example search queries:
> >
> > 1. C1=CC=NC(N)=C1
> > 2. C1=CC=NC(N([H])[H])=C1
> >
> > Both queries are the same molecule (pyridin-2-amine), but query #2 has two
> > explicitly indicated hydrogens in the NH2 group.
> >
> > In both cases, when I do a substructure search I get the same list of
> > compounds with a substituted NH2 group, which is OK for query #1, but for
> > query #2 NH2-substituted compounds should be excluded.
> > It seems that the system (RDKit?) is not sensitive to explicitly indicated
> > hydrogens which makes the substructure search not efficient enough for my
> > needs.
> >
> 
> The RDKit has a function called MergeQueryHs() that's intended to help out
> in cases like this. Here's a quick demo of how this works:
> 
> Start by building the query molecules (this assumes RDKit has already been
> imported with "from rdkit import Chem"):
> 
> In [5]: params = Chem.SmilesParserParams()
> 
> In [6]: params.removeHs=False
> 
> In [8]: p1 = Chem.MolFromSmiles('C1=CC=NC(N)=C1',params)
> 
> In [9]: p2 = Chem.MolFromSmiles('C1=CC=NC(N([H])[H])=C1',params)
> 
> In [10]: p3 = Chem.MergeQueryHs(p2)
> 
> 
> Here are the two test molecules:
> 
> In [11]: m1 = Chem.MolFromSmiles('C1=CC=NC(N)=C1')
> 
> In [12]: m2 = Chem.MolFromSmiles('C1=CC=NC(N(C))=C1')
> 
> 
> 
> And the results:
> 
> In [13]: m1.HasSubstructMatch(p1)
> Out[13]: True
> 
> In [14]: m1.HasSubstructMatch(p2)
> Out[14]: False
> 
> In [15]: m1.HasSubstructMatch(p3)
> Out[15]: True
> 
> In [16]: m2.HasSubstructMatch(p1)
> Out[16]: True
> 
> In [17]: m2.HasSubstructMatch(p2)
> Out[17]: False
> 
> In [18]: m2.HasSubstructMatch(p3)
> Out[18]: False
> 
> You may also find this blog post and the links therein helpful:
> http://rdkit.blogspot.co.uk/2016/07/tuning-substructure-queries-ii.html
> 
> I hope this helps,
> -greg
> 
> 
> 
> > I'm new to RDKit and I'd very much appreciate any thoughts on how this
> > problem could be solved. Are there any settings in RDKit related to this?
> >
> > Thank you in advance,
> >
> > Andrew
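
For convenience, the MergeQueryHs() demo quoted above can be collected into a single script. This is only a sketch of the same steps, not an official recipe: the helper names build_queries() and run_demo() are made up for illustration, and the results it prints should be compared against the Out[...] values in Greg's session rather than taken for granted on other RDKit versions.

```python
from rdkit import Chem


def build_queries():
    """Build the three query molecules used in the thread (hypothetical helper)."""
    # Keep explicit hydrogens written in the SMILES instead of stripping them.
    params = Chem.SmilesParserParams()
    params.removeHs = False

    p1 = Chem.MolFromSmiles('C1=CC=NC(N)=C1', params)          # query 1: no explicit Hs
    p2 = Chem.MolFromSmiles('C1=CC=NC(N([H])[H])=C1', params)  # query 2: explicit Hs as atoms
    # Fold the explicit H atoms into H-count constraints on the nitrogen.
    p3 = Chem.MergeQueryHs(p2)
    return {'p1': p1, 'p2': p2, 'p3': p3}


def run_demo():
    """Match both test molecules against each query (hypothetical helper)."""
    queries = build_queries()
    targets = {
        'm1': Chem.MolFromSmiles('C1=CC=NC(N)=C1'),     # free NH2 group
        'm2': Chem.MolFromSmiles('C1=CC=NC(N(C))=C1'),  # N-methyl substituted
    }
    results = {}
    for tname, target in targets.items():
        for qname, query in queries.items():
            results[(tname, qname)] = target.HasSubstructMatch(query)
    return results


if __name__ == '__main__':
    for (tname, qname), hit in sorted(run_demo().items()):
        print(f'{tname} matches {qname}: {hit}')
```

In a search engine, p3 is the query of interest: it is the one that, in the session above, matched the unsubstituted amine (m1) and rejected the N-methyl compound (m2).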

