Re: [Rdkit-discuss] Smarts conversion help
As Pat pointed out, what's happening here is that you have an explicit H in your SMARTS. The RDKit has a function, MergeQueryHs(), to merge this into the query of the atom it's connected to.

Here's the application to your example:

In [2]: s=Chem.MolFromSmarts('[#8]=[#6]-3-c1c2c(ccc1)2-[#6](-[#7]-3-[#1])=[#8]')
   ...: x=Chem.MolFromSmiles('O=C3C1=C2C(=CC=C1)C=CC=C2C(N3[H])=O')
   ...: x.HasSubstructMatch(s)
Out[2]: False

In [4]: s2 = Chem.MergeQueryHs(s)

In [5]: x.HasSubstructMatch(s2)
Out[5]: True

-greg

On Wed, Mar 27, 2019 at 1:02 AM Li, Xiaobo [xiaoboli] <xiaobo...@liverpool.ac.uk> wrote:

> Dear all,
>
> I have a molecule in Smiles
>
> O=C(C1=C2C(C=CC=C23)=CC=C1)N([H])C3=O (Copied from Chemdraw)
>
> then, using online converter to get Smarts (https://pubchem.ncbi.nlm.nih.gov/edit2/index.html)
>
> [#8]=[#6]-3-c1c2c(ccc1)2-[#6](-[#7]-3-[#1])=[#8]
>
> But I got 'false' with following code
>
> s=Chem.MolFromSmarts('[#8]=[#6]-3-c1c2c(ccc1)2-[#6](-[#7]-3-[#1])=[#8]')
> x=Chem.MolFromSmiles('O=C3C1=C2C(=CC=C1)C=CC=C2C(N3[H])=O')
> x.HasSubstructMatch(s)
>
> Then I tried this:
>
> s=Chem.MolFromSmarts('[#8]=[#6]-3-c1c2c(ccc1)2-[#6](-[#7]-3-[#1])=[#8]')
> s
>
> Any suggestion?
>
> Thanks.
>
> Best regards,
>
> Xiaobo Li

___
Rdkit-discuss mailing list
Rdkit-discuss@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/rdkit-discuss
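
The same effect can be reproduced on a smaller case. This is only a sketch: N-methylacetamide and the amide SMARTS below are stand-ins chosen for illustration and do not come from the thread. The query spells out the N-H hydrogen as a separate [#1] atom, which cannot match a molecule parsed with default settings because its hydrogens are implicit. After Chem.MergeQueryHs() the hydrogen becomes a condition on the nitrogen instead.

```python
from rdkit import Chem

# Illustrative stand-ins (assumptions, not from the thread):
# N-methylacetamide and an amide query with an explicit [#1] atom.
mol = Chem.MolFromSmiles("CC(=O)NC")  # default parsing keeps hydrogens implicit
query = Chem.MolFromSmarts("[#6](=[#8])-[#7]-[#1]")

# The molecule graph has no hydrogen atoms, so [#1] has nothing to match.
match_before = mol.HasSubstructMatch(query)

# MergeQueryHs folds the explicit H into a hydrogen condition on the nitrogen
# and removes the separate [#1] query atom.
merged = Chem.MergeQueryHs(query)
match_after = mol.HasSubstructMatch(merged)

print(match_before, match_after)
print(query.GetNumAtoms(), merged.GetNumAtoms())
```

The atom counts show that the merged query has one atom fewer than the original, since the hydrogen is no longer a vertex of the pattern.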
Re: [Rdkit-discuss] Smarts conversion help
On Tue, Mar 26, 2019 at 8:22 PM Patrick Walters wrote:

> Hi Xiaobo,
>
> There's an explicit hydrogen in the SMARTS that shouldn't be there. I
> also wouldn't include the single bonds around the ring closures.

To be fair, that explicit hydrogen was in the original SMILES string, so it is reasonable to find it in the SMARTS string if the conversion program didn't make the same choice as RDKit to remove all hydrogens on parsing.

If you disable hydrogen removal in RDKit, you do find a match:

    from rdkit import Chem

    smi = "O=C(C1=C2C(C=CC=C23)=CC=C1)N([H])C3=O"
    params = Chem.SmilesParserParams()
    params.removeHs = False  # keep the [H] from the SMILES as a real atom
    mol = Chem.MolFromSmiles(smi, params)
    s = Chem.MolFromSmarts("[#8]=[#6]-3-c1c2c(ccc1)2-[#6](-[#7]-3-[#1])=[#8]")
    mol.HasSubstructMatch(s)  # True

Whether you remove hydrogens when parsing SMILES strings or represent those hydrogens as explicit vertices in the pattern is up to you.

> '[#8]=[#6]-3-c1c2c(ccc1)2-[#6](-[#7]-3-*[#1]*)=[#8]')
>
> from rdkit import Chem
> from rdkit.Chem import Draw
>
> smi = "O=C(C1=C2C(C=CC=C23)=CC=C1)N([H])C3=O"
> mol = Chem.MolFromSmiles(smi)
> mol_list = [mol]
> core = Chem.MolFromSmarts("[#8]=[#6]3-c1c2c(ccc1)2-[#6](-[#7H]3)=[#8]")
> Draw.MolsToGridImage(mol_list,highlightAtomLists=[x.GetSubstructMatch(core)
> for x in mol_list])
>
> [image: image.png]
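
To see what removeHs changes, here is a rough sketch that parses one SMILES string both ways and compares the results. The molecule (N-methylacetamide written with an explicit [H]) and the amide query are illustrative assumptions, not structures from the thread.

```python
from rdkit import Chem

# Illustrative stand-in (assumption): N-methylacetamide with an explicit [H].
smi = "CC(=O)N([H])C"
query = Chem.MolFromSmarts("[#6](=[#8])-[#7]-[#1]")

# Default parsing: the explicit hydrogen is removed and becomes implicit.
mol_default = Chem.MolFromSmiles(smi)

# Parsing with removeHs disabled: the hydrogen stays as an atom in the graph.
params = Chem.SmilesParserParams()
params.removeHs = False
mol_with_h = Chem.MolFromSmiles(smi, params)

print(mol_default.GetNumAtoms(), mol_default.HasSubstructMatch(query))
print(mol_with_h.GetNumAtoms(), mol_with_h.HasSubstructMatch(query))
```

Only the second molecule contains a hydrogen atom for the [#1] query atom to land on, so only it matches.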
Re: [Rdkit-discuss] Smarts conversion help
Hi Xiaobo,

There's an explicit hydrogen in the SMARTS that shouldn't be there. I also wouldn't include the single bonds around the ring closures.

'[#8]=[#6]-3-c1c2c(ccc1)2-[#6](-[#7]-3-*[#1]*)=[#8]')

from rdkit import Chem
from rdkit.Chem import Draw

smi = "O=C(C1=C2C(C=CC=C23)=CC=C1)N([H])C3=O"
mol = Chem.MolFromSmiles(smi)
mol_list = [mol]
core = Chem.MolFromSmarts("[#8]=[#6]3-c1c2c(ccc1)2-[#6](-[#7H]3)=[#8]")
Draw.MolsToGridImage(mol_list,highlightAtomLists=[x.GetSubstructMatch(core) for x in mol_list])

[image: image.png]
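
The rewrite above moves the hydrogen from a separate [#1] atom into an H count on the nitrogen ([#7H]). A minimal sketch of the difference, using small amides that are assumptions for illustration and not structures from the thread:

```python
from rdkit import Chem

# Illustrative stand-ins (assumptions): a secondary and a tertiary amide.
secondary = Chem.MolFromSmiles("CC(=O)NC")    # N carries one implicit H
tertiary = Chem.MolFromSmiles("CC(=O)N(C)C")  # N carries no H

# Hydrogen written as its own query atom versus as an H count on nitrogen.
q_h_atom = Chem.MolFromSmarts("[#6](=[#8])-[#7]-[#1]")
q_h_count = Chem.MolFromSmarts("[#6](=[#8])-[#7H]")

results = {
    "secondary, H as atom": secondary.HasSubstructMatch(q_h_atom),
    "secondary, H as count": secondary.HasSubstructMatch(q_h_count),
    "tertiary, H as count": tertiary.HasSubstructMatch(q_h_count),
}
for label, matched in results.items():
    print(label, matched)
```

The H-count form matches molecules parsed with default settings and still rejects the tertiary amide, so the N-H requirement is preserved.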