Re: [Rdkit-discuss] Smarts conversion help

2019-03-27 Thread Greg Landrum
As Pat pointed out, the problem is that your SMARTS contains an explicit H
atom.
The RDKit has a function, MergeQueryHs(), that merges such a hydrogen into
the query of the atom it's connected to. Here's the application to your
example:

In [2]:
s=Chem.MolFromSmarts('[#8]=[#6]-3-c1c2c(ccc1)2-[#6](-[#7]-3-[#1])=[#8]')
   ...: x=Chem.MolFromSmiles('O=C3C1=C2C(=CC=C1)C=CC=C2C(N3[H])=O')
   ...: x.HasSubstructMatch(s)
Out[2]: False

In [4]: s2 = Chem.MergeQueryHs(s)

In [5]: x.HasSubstructMatch(s2)
Out[5]: True
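
A smaller, self-contained illustration of the same effect may be useful. This
is only a sketch: the amide query and the two acetamides below are hypothetical
toy inputs chosen for brevity, not the structures from this thread.

```python
from rdkit import Chem

# Hypothetical toy example (not the molecule from this thread):
# an amide query with the N-H hydrogen written as its own atom, [#1].
query = Chem.MolFromSmarts('[#6](=[#8])-[#7]-[#1]')

# N-methylacetamide; the default SMILES parser keeps hydrogens implicit.
mol = Chem.MolFromSmiles('CC(=O)NC')
# N,N-dimethylacetamide has no N-H at all.
tertiary = Chem.MolFromSmiles('CC(=O)N(C)C')

# There are no hydrogen atoms in the molecule graph, so [#1] cannot match.
print(mol.HasSubstructMatch(query))        # False

# Fold the [#1] atom into an H-count requirement on the nitrogen.
merged = Chem.MergeQueryHs(query)
print(merged.GetNumAtoms())                # 3 (the [#1] atom is gone)
print(mol.HasSubstructMatch(merged))       # True
print(tertiary.HasSubstructMatch(merged))  # False: this N carries no H
```

The merged query still requires a hydrogen on the nitrogen, so it does not
start matching N-substituted analogues.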


-greg


On Wed, Mar 27, 2019 at 1:02 AM Li, Xiaobo [xiaoboli] <
xiaobo...@liverpool.ac.uk> wrote:

> Dear all,
>
> I have a molecule as SMILES:
>
> O=C(C1=C2C(C=CC=C23)=CC=C1)N([H])C3=O (copied from ChemDraw)
>
> I then used an online converter to get the SMARTS (
> https://pubchem.ncbi.nlm.nih.gov/edit2/index.html):
>
> [#8]=[#6]-3-c1c2c(ccc1)2-[#6](-[#7]-3-[#1])=[#8]
>
> But I got 'False' with the following code:
>
> s=Chem.MolFromSmarts('[#8]=[#6]-3-c1c2c(ccc1)2-[#6](-[#7]-3-[#1])=[#8]')
> x=Chem.MolFromSmiles('O=C3C1=C2C(=CC=C1)C=CC=C2C(N3[H])=O')
> x.HasSubstructMatch(s)
>
> Then I tried this:
>
> s=Chem.MolFromSmarts('[#8]=[#6]-3-c1c2c(ccc1)2-[#6](-[#7]-3-[#1])=[#8]')
> s
>
> Any suggestions?
>
> Thanks.
>
> Best regards,
>
> Xiaobo Li
>
> ___
> Rdkit-discuss mailing list
> Rdkit-discuss@lists.sourceforge.net
> https://lists.sourceforge.net/lists/listinfo/rdkit-discuss
>


Re: [Rdkit-discuss] Smarts conversion help

2019-03-27 Thread Jason Biggs
On Tue, Mar 26, 2019 at 8:22 PM Patrick Walters wrote:

> Hi Xiaobo,
>
> There's an explicit hydrogen in the SMARTS that shouldn't be there.  I
> also wouldn't include the single bonds around the ring closures.
>

To be fair, that explicit hydrogen was in the original SMILES string, so it
is reasonable to find it in the SMARTS string if the conversion program
didn't make the same choice as RDKit to remove all hydrogens on parsing.
If you disable hydrogen removal in RDKit you do find a match,

from rdkit import Chem

smi = "O=C(C1=C2C(C=CC=C23)=CC=C1)N([H])C3=O"
params = Chem.SmilesParserParams()
params.removeHs = False  # keep the explicit [H] as a real atom
mol = Chem.MolFromSmiles(smi, params)

s = Chem.MolFromSmarts("[#8]=[#6]-3-c1c2c(ccc1)2-[#6](-[#7]-3-[#1])=[#8]")
mol.HasSubstructMatch(s)
# True


Whether you remove hydrogens when parsing SMILES strings or represent those
hydrogens as explicit vertices in the pattern is up to you.
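
To make the two choices concrete, here is a minimal sketch on a hypothetical
toy SMILES (N-methylacetamide with one hydrogen written out), not the molecule
from the thread. It compares the default parse, a parse with removeHs=False,
and Chem.AddHs(), which is one further way to get hydrogen atoms into the
graph.

```python
from rdkit import Chem

smi = 'CC(=O)N([H])C'  # hypothetical toy SMILES with one explicit [H]
query = Chem.MolFromSmarts('[#7]-[#1]')  # explicit-hydrogen query

# Default parsing: the written [H] is folded into the nitrogen's H count.
default = Chem.MolFromSmiles(smi)

# Keep hydrogens that were written in the SMILES as real atoms.
params = Chem.SmilesParserParams()
params.removeHs = False
kept = Chem.MolFromSmiles(smi, params)

# Alternative: add every hydrogen as an explicit atom after parsing.
with_hs = Chem.AddHs(default)

for label, m in (('default', default),
                 ('removeHs=False', kept),
                 ('AddHs', with_hs)):
    print(label, m.GetNumAtoms(), m.HasSubstructMatch(query))
# default 5 False
# removeHs=False 6 True
# AddHs 12 True
```

Note that removeHs=False only keeps the hydrogens that were actually written in
the SMILES, while AddHs() makes all of them explicit.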



> '[#8]=[#6]-3-c1c2c(ccc1)2-[#6](-[#7]-3-[#1])=[#8]'
> (the explicit hydrogen is the [#1] atom)
>
> from rdkit import Chem
> from rdkit.Chem import Draw
>
> smi = "O=C(C1=C2C(C=CC=C23)=CC=C1)N([H])C3=O"
> mol = Chem.MolFromSmiles(smi)
> mol_list = [mol]
> core = Chem.MolFromSmarts("[#8]=[#6]3-c1c2c(ccc1)2-[#6](-[#7H]3)=[#8]")
> Draw.MolsToGridImage(mol_list,highlightAtomLists=[x.GetSubstructMatch(core)
> for x in mol_list])
>
> [image: image.png]


Re: [Rdkit-discuss] Smarts conversion help

2019-03-26 Thread Patrick Walters
Hi Xiaobo,

There's an explicit hydrogen in the SMARTS that shouldn't be there.  I also
wouldn't include the single bonds around the ring closures.
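
As general background on why explicit single bonds in SMARTS deserve some care:
a bond that is left out of a SMARTS matches "single or aromatic", while an
explicit "-" matches only single bonds. The sketch below shows this on
hypothetical toy molecules (benzene and biphenyl), not on the pattern from this
thread.

```python
from rdkit import Chem

benzene = Chem.MolFromSmiles('c1ccccc1')
biphenyl = Chem.MolFromSmiles('c1ccccc1-c1ccccc1')

implicit = Chem.MolFromSmarts('cc')   # unspecified bond: single OR aromatic
explicit = Chem.MolFromSmarts('c-c')  # explicit single bond only

print(benzene.HasSubstructMatch(implicit))   # True: aromatic ring bond
print(benzene.HasSubstructMatch(explicit))   # False: no single c-c bond
print(biphenyl.HasSubstructMatch(explicit))  # True: the inter-ring bond
```

So an explicit "-" is never more permissive than leaving the bond out, and it
can silently exclude aromatic bonds.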

'[#8]=[#6]-3-c1c2c(ccc1)2-[#6](-[#7]-3-[#1])=[#8]'
(the explicit hydrogen is the [#1] atom)

from rdkit import Chem
from rdkit.Chem import Draw

smi = "O=C(C1=C2C(C=CC=C23)=CC=C1)N([H])C3=O"
mol = Chem.MolFromSmiles(smi)
mol_list = [mol]
core = Chem.MolFromSmarts("[#8]=[#6]3-c1c2c(ccc1)2-[#6](-[#7H]3)=[#8]")
Draw.MolsToGridImage(mol_list,highlightAtomLists=[x.GetSubstructMatch(core)
for x in mol_list])

[image: image.png]
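
The idea behind the [#7H] atom in the core above, stating the hydrogen count on
the heavy atom instead of writing a separate [#1] atom, can be sketched without
the drawing step. The acetamides here are hypothetical toy inputs, not the
molecule from this thread.

```python
from rdkit import Chem

mol = Chem.MolFromSmiles('CC(=O)NC')          # N-methylacetamide
tertiary = Chem.MolFromSmiles('CC(=O)N(C)C')  # N,N-dimethylacetamide

# [#7H] means "nitrogen with exactly one attached hydrogen";
# the hydrogen is a property of the N, not a separate query atom.
core = Chem.MolFromSmarts('[#6](=[#8])-[#7H]')

# GetSubstructMatch returns the matched atom indices in query-atom order,
# or an empty tuple when there is no match.
print(mol.GetSubstructMatch(core))       # (1, 2, 3)
print(tertiary.GetSubstructMatch(core))  # ()
```

The returned index tuple is what gets passed to highlightAtomLists in the
MolsToGridImage call above.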