Re: [Rdkit-discuss] HasSubstructureMatch using query atom list with hydrogen
Hi Susan,

adding explicit hydrogens is an expensive operation that you normally try to avoid unless you have to (e.g., because you need to deal with 3D molecule coordinates). I would rather change the query in order to use implicit H queries, e.g. something like:

q = Chem.MolFromSmarts('[$([#6;H0](-[#17])-[#17]),$([#6;H1]-[#17])]')
m = Chem.MolFromSmiles('ClC(C)(C)Cl')
print(m.HasSubstructMatch(q))
True
m = Chem.MolFromSmiles('CC(C)(C)Cl')
print(m.HasSubstructMatch(q))
False
m = Chem.MolFromSmiles('CC(C)Cl')
print(m.HasSubstructMatch(q))
True

Cheers,
p.

On Fri, Jul 22, 2022 at 12:58 PM Susan Leung wrote:
> Sorry I replied too soon, just another quick question.
>
> I am trying to use the postgres cartridge. Is there any function to add
> hydrogens to the molecules once in the database? Or should this be done as
> a preprocessing step beforehand? E.g., if I have a .smi or .sdf file, should
> all the molecules already have explicit hydrogens?
>
> Thanks,
>
> Susan

___
Rdkit-discuss mailing list
Rdkit-discuss@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/rdkit-discuss
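The implicit-H query above uses exact hydrogen counts (H0, H1), so as far as I can tell it would not match the CH2 and CH3 carbons from the original question (ClCCl and CCl). A minimal sketch of a looser variant follows. The query string q_any_h is an assumption for illustration and does not come from the thread: it replaces the exact counts with "two chlorines on the carbon" or "at least one hydrogen (!H0) and one chlorine". Note that it matches a single carbon atom rather than three atoms, so only the True/False result of HasSubstructMatch is comparable to the original [#17,#1]-[#6]-[#17] query.

```python
from rdkit import Chem

# Query as posted in the thread (exact H counts H0 / H1).
q_posted = Chem.MolFromSmarts('[$([#6;H0](-[#17])-[#17]),$([#6;H1]-[#17])]')

# Hypothetical variant (not from the thread): a carbon with two Cl neighbours,
# or a carbon with at least one hydrogen (implicit or explicit) and one Cl.
q_any_h = Chem.MolFromSmarts('[$([#6](-[#17])-[#17]),$([#6;!H0]-[#17])]')

smiles = ['ClCCl', 'CCl', 'CC(C)Cl', 'ClC(C)(C)Cl', 'CC(C)(C)Cl']

posted = {}
any_h = {}
for smi in smiles:
    mol = Chem.MolFromSmiles(smi)  # hydrogens stay implicit, no AddHs() needed
    posted[smi] = mol.HasSubstructMatch(q_posted)
    any_h[smi] = mol.HasSubstructMatch(q_any_h)
    print(smi, posted[smi], any_h[smi])
```

Only CC(C)(C)Cl, whose chlorinated carbon carries neither a hydrogen nor a second chlorine, should be rejected by the variant.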
Re: [Rdkit-discuss] HasSubstructureMatch using query atom list with hydrogen
Sorry I replied too soon, just another quick question.

I am trying to use the postgres cartridge. Is there any function to add hydrogens to the molecules once in the database? Or should this be done as a preprocessing step beforehand? E.g., if I have a .smi or .sdf file, should all the molecules already have explicit hydrogens?

Thanks,

Susan

On Fri, Jul 22, 2022 at 11:51 AM Susan Leung wrote:
> Ah, great thanks Paolo!
Re: [Rdkit-discuss] HasSubstructureMatch using query atom list with hydrogen
Ah, great thanks Paolo!

On Fri, Jul 22, 2022 at 11:44 AM Paolo Tosco wrote:
> Hi Susan,
>
> If you use [#1] in your SMARTS query, for your molecule to match there
> should be a real hydrogen atom in your molecule graph, while in your
> molecule you only have implicit hydrogens, unless you explicitly add them
> calling Chem.AddHs():
>
> print(Chem.AddHs(m).HasSubstructMatch(q))
> True
>
> Cheers,
> p.
Re: [Rdkit-discuss] HasSubstructureMatch using query atom list with hydrogen
Hi Susan,

If you use [#1] in your SMARTS query, for your molecule to match there should be a real hydrogen atom in your molecule graph, while in your molecule you only have implicit hydrogens, unless you explicitly add them calling Chem.AddHs():

print(Chem.AddHs(m).HasSubstructMatch(q))
True

Cheers,
p.

On Fri, Jul 22, 2022 at 12:13 PM Susan Leung wrote:
> ## having hydrogen in the query atom list doesn't work as expected
> q = Chem.MolFromSmarts('[#17,#1]-[#6]-[#17]')
> m = Chem.MolFromSmiles('ClCCl')
> print(m.HasSubstructMatch(q))
> m = Chem.MolFromSmiles('CCl')
> print(m.HasSubstructMatch(q))
>
> >>True
> >>False
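The difference between implicit and explicit hydrogens can be seen directly on the molecule graph. The following is a small sketch, assuming the CCl molecule and the [#17,#1]-[#6]-[#17] query from the original question; the variable name mh is only illustrative.

```python
from rdkit import Chem

q = Chem.MolFromSmarts('[#17,#1]-[#6]-[#17]')
m = Chem.MolFromSmiles('CCl')

# Parsed from SMILES, the graph holds only the heavy atoms (C and Cl);
# the three hydrogens exist as a count on the carbon, not as atoms.
n_heavy = m.GetNumAtoms()
n_h_on_carbon = m.GetAtomWithIdx(0).GetTotalNumHs()

# AddHs() returns a new molecule in which each hydrogen is a real atom,
# so the [#1] alternative in the query has something to match against.
mh = Chem.AddHs(m)
n_all = mh.GetNumAtoms()

match_implicit = m.HasSubstructMatch(q)
match_explicit = mh.HasSubstructMatch(q)

print(n_heavy, n_h_on_carbon, n_all)
print(match_implicit, match_explicit)
```

AddHs() does not modify m in place, so the original molecule can still be used for queries that rely on implicit hydrogen counts.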
[Rdkit-discuss] HasSubstructureMatch using query atom list with hydrogen
Hi all,

I am trying to do substructure search using query atom lists, but I am seeing unexpected behaviour when I have hydrogen in my query atom list…

import rdkit
print(rdkit.__version__)
from rdkit import Chem

>>2022.03.2

## having carbon in the query atom list seems to work as expected
q = Chem.MolFromSmarts('[#17,#6]-[#6]-[#17]')
m = Chem.MolFromSmiles('ClCCl')
print(m.HasSubstructMatch(q))
m = Chem.MolFromSmiles('CCCl')
print(m.HasSubstructMatch(q))

>>True
>>True

## having hydrogen in the query atom list doesn't work as expected
q = Chem.MolFromSmarts('[#17,#1]-[#6]-[#17]')
m = Chem.MolFromSmiles('ClCCl')
print(m.HasSubstructMatch(q))
m = Chem.MolFromSmiles('CCl')
print(m.HasSubstructMatch(q))

>>True
>>False

Best wishes,

Susan