On Oct 3, 2019, at 20:34, Ondrej Gutten via Rdkit-discuss 
<rdkit-discuss@lists.sourceforge.net> wrote:
> # MCS is a benzene
> my_mcs = Chem.MolFromSmiles(res.smarts)

The res.smarts attribute (or res.smartsString if you use the rdFMCS module) is a 
SMARTS string, not a SMILES string. The code I quoted should use 
Chem.MolFromSmarts() instead.
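
As a minimal sketch of that fix, assuming the rdFMCS module and two made-up input molecules (benzyl alcohol and toluene, which are my own placeholders and not the structures from the original question):

```python
from rdkit import Chem
from rdkit.Chem import rdFMCS

# Hypothetical inputs, chosen only for illustration.
mols = [Chem.MolFromSmiles(smi) for smi in ("OCc1ccccc1", "Cc1ccccc1")]

res = rdFMCS.FindMCS(mols)

# The result is a SMARTS pattern, so parse it with MolFromSmarts().
my_mcs = Chem.MolFromSmarts(res.smartsString)

# A query built from the MCS SMARTS should match every input molecule.
matches = [mol.HasSubstructMatch(my_mcs) for mol in mols]
print(res.smartsString, matches)
```

With the older rdkit.Chem.MCS module the attribute would be res.smarts rather than res.smartsString.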

More specifically, the MCS SMARTS pattern is:  [#6]1:[#6]:[#6]:[#6]:[#6]:[#6]:1

This can be interpreted as a SMILES, because RDKit supports "#6" as part of its 
extensions to the SMILES grammar. The "#6" means an element with atomic number 
6.

  >>> Chem.CanonSmiles("[#6]1:[#6]:[#6]:[#6]:[#6]:[#6]:1")
  '[c]1[c][c][c][c][c]1'

However, benzene has a different SMILES:

  >>> Chem.CanonSmiles("c1ccccc1")
  'c1ccccc1'
  >>> Chem.MolToSmiles(Chem.MolFromSmiles("c1ccccc1"), allBondsExplicit=True,
  ...                  allHsExplicit=True)
  '[cH]1:[cH]:[cH]:[cH]:[cH]:[cH]:1'

Each of the benzene atoms has a hydrogen on it, whereas the bracket atoms in 
the SMARTS pattern, when read as a SMILES, have none.
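
One way to check the hydrogen counts directly is to ask each atom for its total number of hydrogens. This is only a sketch; the helper name total_hs is my own, and it returns None if a given RDKit version refuses to parse the string as a SMILES:

```python
from rdkit import Chem

def total_hs(smiles):
    """Return the total hydrogen count of each atom, or None if parsing fails."""
    mol = Chem.MolFromSmiles(smiles)
    if mol is None:
        return None
    return [atom.GetTotalNumHs() for atom in mol.GetAtoms()]

# Ordinary benzene SMILES: hydrogens are implied on each aromatic carbon.
benzene_hs = total_hs("c1ccccc1")

# The MCS SMARTS pattern read as a SMILES: bracket atoms carry no hydrogens.
pattern_hs = total_hs("[#6]1:[#6]:[#6]:[#6]:[#6]:[#6]:1")

print(benzene_hs)
print(pattern_hs)
```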

The difference appears when you call:

   mol1.HasSubstructMatch(my_mcs)

The result depends on whether my_mcs was made from a SMILES or from a SMARTS. 
The two have different definitions of what it means to match. One of the 
differences is that a SMILES-made molecule considers the hydrogen counts:

>>> mol = Chem.MolFromSmiles("OCc1ccccc1")
>>> query = Chem.MolFromSmiles("OC");mol.GetSubstructMatches(query)
((0, 1),)
>>> query = Chem.MolFromSmiles("[O]C");mol.GetSubstructMatches(query)
()
>>> query = Chem.MolFromSmiles("[OH]C");mol.GetSubstructMatches(query)
((0, 1),)

whereas the same "[O]C" pattern matches if I make the query from a SMARTS:

>>> query = Chem.MolFromSmarts("[O]C");mol.GetSubstructMatches(query)
((0, 1),)


Cheers,


                                Andrew
                                da...@dalkescientific.com



