[Rdkit-discuss] Substructure matching misbehaving with bridged atoms?

2023-03-15 Thread James Wallace
I've been using substructure matching with query molecules to do an
R-group decomposition. This works well, except when the query molecule
contains a bridged atom in a ring. Take this example (I've replaced the
irrelevant part with a Y atom for confidentiality):

FC1=CC=C(N2CC3CC2CN3S(=O)(=O)C2=CC=C([Y])C=C2)C2=CC=CC=C12

[image: image.png]

Using the following as a query, you get the usual result you'd expect:

*-N1CC2CC1CN2S(-*)(=O)=O

[image: image.png]

However, when I do the match I also see this:

[image: image.png]

Even switching back to the R-group code from before the latest refactor
seems to have this issue. It is as if the query molecule is perceived as
having both the actual bridged ring and the smaller ring bounded by the
bridge atom.

Am I missing an obvious setting to exclude those latter matches? The
groups generated obviously do not match reality in this case.

For what it's worth, I'm doing everything via a version of Pat Walters'
R-group method from the older RDKit, with the match list used to generate
SMILES coming from:

match_list = test_mol.GetSubstructMatches(self.query_mol, False)
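
For reference, the second positional argument of GetSubstructMatches in the RDKit Python API is uniquify, so passing False asks for every atom mapping rather than one mapping per distinct set of matched atoms. The sketch below compares the two settings on the molecule and query quoted in this message. It assumes the query is parsed as SMARTS (the message does not say how query_mol was built), and it is not established here that this setting explains the extra matches in the images.

```python
from rdkit import Chem

# Target and query copied from the message; [Y] stands in for the redacted part.
mol = Chem.MolFromSmiles(
    "FC1=CC=C(N2CC3CC2CN3S(=O)(=O)C2=CC=C([Y])C=C2)C2=CC=CC=C12"
)
# Assumption: the query is built from SMARTS.
query = Chem.MolFromSmarts("*-N1CC2CC1CN2S(-*)(=O)=O")

# uniquify=True keeps one mapping per distinct set of matched atoms.
unique_matches = mol.GetSubstructMatches(query, uniquify=True)
# uniquify=False also returns symmetry-equivalent mappings,
# e.g. the two sulfonyl oxygens swapped.
all_matches = mol.GetSubstructMatches(query, uniquify=False)

print("unique:", len(unique_matches), "all:", len(all_matches))
for match in all_matches:
    # Each tuple gives the target atom index for each query atom, in query order.
    print(match)
```

Comparing the printed tuples shows whether the additional matches cover the same atoms in a different order or genuinely different atoms.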
___
Rdkit-discuss mailing list
Rdkit-discuss@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/rdkit-discuss


Re: [Rdkit-discuss] substructure matching

2020-07-21 Thread Jan Halborg Jensen
I get both to be True using version 2020.03.04

On 21 Jul 2020, at 14.08, Quoc-Tuan DO <quoctuan...@greenpharma.com> wrote:

Hello,

I am not very familiar with SMILES/SMARTS and find the following results quite
puzzling:

>>> patt = Chem.MolFromSmiles('c1ccc(cc1)C~C2NC~Cc3c23.c1ccc(cc1)C~C2NC~Cc3c23')
>>> mol = Chem.MolFromSmiles('COc1ccc2cc1Oc1ccc(cc1)CC1N(C)CCc3c1c1Oc4cc5C(C2)NCCc5cc4Oc1c(c3)OC')
>>> print mol.HasSubstructMatch(patt)
False

>>> mol = Chem.MolFromSmiles('COc1ccc7cc1Oc2ccc(cc2)CC3N(C)CCc4c3cc(c(c4)OC)Oc5ccc6c(c5)CCNC6C7')
>>> print mol.HasSubstructMatch(patt)
True

It seems that the presence of an extra Ph - O - Ph makes the difference, but I am
not sure why. What should the SMARTS be to get positive results for both SMILES?

Thank you in advance for your help.
Best regards,
Quoc-Tuan



Re: [Rdkit-discuss] substructure matching

2020-07-21 Thread Ivan Tubert-Brohman
Hi Quoc-Tuan,

I can't reproduce your observations; I get True in both cases. Which
version of RDKit are you using?

One thing to note is that you are parsing a SMARTS with MolFromSmiles. I
wouldn't recommend that in general, although it appears that in this case
RDKit is lenient enough to accept "~"  in a SMILES and turn it into a
QueryBond. What happens if you use MolFromSmarts instead?
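
As a small illustration of that suggestion, here is a sketch that parses a query containing "~" with MolFromSmarts and matches it against two simple molecules. The pattern and the two molecules are made up for illustration and are not the ones from the original post.

```python
from rdkit import Chem

# Illustrative query: a benzene ring carrying a two-carbon chain
# whose C-C bond may be of any order ("~" is the SMARTS any-bond).
patt = Chem.MolFromSmarts("c1ccccc1C~C")

# Illustrative targets, parsed as ordinary SMILES.
ethylbenzene = Chem.MolFromSmiles("CCc1ccccc1")  # single C-C bond
styrene = Chem.MolFromSmiles("C=Cc1ccccc1")      # double C=C bond

results = {
    "ethylbenzene": ethylbenzene.HasSubstructMatch(patt),
    "styrene": styrene.HasSubstructMatch(patt),
}
print(results)
```

Both targets match because "~" places no constraint on the bond order.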

Ivan

On Tue, Jul 21, 2020 at 8:12 AM Quoc-Tuan DO wrote:

> Hello,
>
> I am not very familiar with SMILES/SMARTS and find the following results
> quite puzzling:
>
> >>> patt = Chem.MolFromSmiles('c1ccc(cc1)C~C2NC~Cc3c23.c1ccc(cc1)C~C2NC~Cc3c23')
> >>> mol = Chem.MolFromSmiles('COc1ccc2cc1Oc1ccc(cc1)CC1N(C)CCc3c1c1Oc4cc5C(C2)NCCc5cc4Oc1c(c3)OC')
> >>> print mol.HasSubstructMatch(patt)
> False
>
> >>> mol = Chem.MolFromSmiles('COc1ccc7cc1Oc2ccc(cc2)CC3N(C)CCc4c3cc(c(c4)OC)Oc5ccc6c(c5)CCNC6C7')
> >>> print mol.HasSubstructMatch(patt)
> True
>
> It seems that the presence of an extra Ph - O - Ph makes the difference,
> but I am not sure why. What should the SMARTS be to get positive results
> for both SMILES?
>
> Thank you in advance for your help.
>
> Best regards,
>
> Quoc-Tuan
>


[Rdkit-discuss] substructure matching

2020-07-21 Thread Quoc-Tuan DO

  
  
Hello,
  I am not very familiar with SMILES/SMARTS and find the following results quite puzzling:

  

  >>> patt = Chem.MolFromSmiles('c1ccc(cc1)C~C2NC~Cc3c23.c1ccc(cc1)C~C2NC~Cc3c23')
  >>> mol = Chem.MolFromSmiles('COc1ccc2cc1Oc1ccc(cc1)CC1N(C)CCc3c1c1Oc4cc5C(C2)NCCc5cc4Oc1c(c3)OC')
  >>> print mol.HasSubstructMatch(patt)

  False
  

  >>> mol = Chem.MolFromSmiles('COc1ccc7cc1Oc2ccc(cc2)CC3N(C)CCc4c3cc(c(c4)OC)Oc5ccc6c(c5)CCNC6C7')
  >>> print mol.HasSubstructMatch(patt)
  True


  It seems that the presence of an extra Ph - O - Ph makes the difference, but I am not sure why. What should the SMARTS be to get positive results for both SMILES?
  

  Thank you in advance for your help.
  Best regards,
  Quoc-Tuan
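
One property of the pattern above worth keeping in mind: it consists of the same fragment written twice, separated by ".". In a substructure search each query atom must map to a different target atom, so a dot-separated query of this kind only matches when the target contains two non-overlapping copies of the fragment. The sketch below shows this with a deliberately simple query and two illustrative molecules, none of which come from the post. Whether this accounts for the False result reported here is not established.

```python
from rdkit import Chem

# Illustrative query: two benzene rings as separate query fragments.
patt = Chem.MolFromSmarts("c1ccccc1.c1ccccc1")

# Illustrative targets.
toluene = Chem.MolFromSmiles("Cc1ccccc1")                  # one ring
diphenyl_ether = Chem.MolFromSmiles("c1ccccc1Oc1ccccc1")   # two rings

# The query has 12 aromatic atoms that must all map to distinct target atoms.
one_ring_hit = toluene.HasSubstructMatch(patt)
two_ring_hit = diphenyl_ether.HasSubstructMatch(patt)
print(one_ring_hit, two_ring_hit)
```

Toluene offers only six aromatic atoms, so the twelve-atom query cannot be placed on it, while diphenyl ether has two separate rings available.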
  

  

  

  




Re: [Rdkit-discuss] substructure matching

2020-07-10 Thread Quoc-Tuan DO

  
  
Hello,
I am not very familiar with SMILES/SMARTS and find the following results quite puzzling:



>>> patt = Chem.MolFromSmiles('c1ccc(cc1)C~C2NC~Cc3c23.c1ccc(cc1)C~C2NC~Cc3c23')
>>> mol = Chem.MolFromSmiles('COc1ccc2cc1Oc1ccc(cc1)CC1N(C)CCc3c1c1Oc4cc5C(C2)NCCc5cc4Oc1c(c3)OC')
>>> print mol.HasSubstructMatch(patt)

False


>>> mol = Chem.MolFromSmiles('COc1ccc7cc1Oc2ccc(cc2)CC3N(C)CCc4c3cc(c(c4)OC)Oc5ccc6c(c5)CCNC6C7')
>>> print mol.HasSubstructMatch(patt)
True


It seems that the presence of an extra Ph - O - Ph makes the difference, but I am not sure why. What should the SMARTS be to get positive results for both SMILES?


Thank you in advance for your help.
Best regards,
Quoc-Tuan





  

