On Oct 3, 2019, at 20:34, Ondrej Gutten via Rdkit-discuss
<[email protected]> wrote:
> # MCS is a benzene
> my_mcs = Chem.MolFromSmiles(res.smarts)
The res.smarts attribute (or res.smartsString if you use the rdFMCS module) is
a SMARTS string, not a SMILES string. You should be using Chem.MolFromSmarts()
in the code I quoted.
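A minimal sketch of the corrected workflow, assuming the rdFMCS module. The
two input molecules (phenol and aniline) are my own illustrative choice, not
the ones from your script:

```python
from rdkit import Chem
from rdkit.Chem import rdFMCS

# Illustrative inputs (an assumption): phenol and aniline share a benzene ring.
mols = [Chem.MolFromSmiles(smi) for smi in ("Oc1ccccc1", "Nc1ccccc1")]

# FindMCS returns a result object; smartsString holds the MCS as SMARTS.
res = rdFMCS.FindMCS(mols)
mcs_smarts = res.smartsString

# Parse the pattern as SMARTS so that it becomes a query molecule.
my_mcs = Chem.MolFromSmarts(mcs_smarts)

# Every input molecule should contain the common substructure.
matches = [mol.HasSubstructMatch(my_mcs) for mol in mols]
print(mcs_smarts, matches)
```

The only change from the quoted code is the parser: MolFromSmarts() instead of
MolFromSmiles().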
More specifically, the MCS SMARTS pattern is: [#6]1:[#6]:[#6]:[#6]:[#6]:[#6]:1
RDKit will also parse this as a SMILES, because it supports "#6" as part of its
extensions to the SMILES grammar. The "#6" means an element with atomic number
6.
>>> Chem.CanonSmiles("[#6]1:[#6]:[#6]:[#6]:[#6]:[#6]:1")
'[c]1[c][c][c][c][c]1'
However, benzene has a different SMILES:
>>> Chem.CanonSmiles("c1ccccc1")
'c1ccccc1'
>>> Chem.MolToSmiles(Chem.MolFromSmiles("c1ccccc1"), allBondsExplicit=True,
...                  allHsExplicit=True)
'[cH]1:[cH]:[cH]:[cH]:[cH]:[cH]:1'
Each of the benzene atoms has a hydrogen on it, while the atoms parsed from the
MCS pattern have none.
The difference shows up when you call:
mol1.HasSubstructMatch(my_mcs)
The result depends on whether my_mcs was made from a SMILES or from a SMARTS,
because the two have different definitions of what it means to match. One of
the differences is that a SMILES-made molecule considers the hydrogen counts:
>>> mol = Chem.MolFromSmiles("OCc1ccccc1")
>>> query = Chem.MolFromSmiles("OC");mol.GetSubstructMatches(query)
((0, 1),)
>>> query = Chem.MolFromSmiles("[O]C");mol.GetSubstructMatches(query)
()
>>> query = Chem.MolFromSmiles("[OH]C");mol.GetSubstructMatches(query)
((0, 1),)
while if I make the query from a SMARTS, the same pattern matches:
>>> query = Chem.MolFromSmarts("[O]C");mol.GetSubstructMatches(query)
((0, 1),)
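To check a pattern both ways in one go, here is a small sketch. The helper name
compare_query_parsers is hypothetical (it is not part of RDKit); it only wraps
the calls shown above:

```python
from rdkit import Chem


def compare_query_parsers(target_smiles, pattern):
    """Match `pattern` against the target, parsed once as SMILES and once as SMARTS.

    Hypothetical helper for illustration only; returns a pair of match tuples
    (hits_with_smiles_query, hits_with_smarts_query).
    """
    mol = Chem.MolFromSmiles(target_smiles)
    as_smiles = Chem.MolFromSmiles(pattern)   # ordinary molecule used as a query
    as_smarts = Chem.MolFromSmarts(pattern)   # true query molecule
    return (mol.GetSubstructMatches(as_smiles),
            mol.GetSubstructMatches(as_smarts))


smiles_hits, smarts_hits = compare_query_parsers("OCc1ccccc1", "[O]C")
print("as SMILES:", smiles_hits)
print("as SMARTS:", smarts_hits)
```

If the two results differ for your MCS pattern, that is the SMILES-vs-SMARTS
matching difference described above.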
Cheers,
Andrew
[email protected]