Hi Hannes,
This is an interesting one.
First a general point: the Python MCS code (rdkit.Chem.MCS) isn't really
actively supported any more. The C++ code (rdkit.Chem.rdFMCS) should be used
instead.
Having said that: in this case, I think what the Python code is doing,
though counterintuitive, is technically correct.
Using a slightly simpler pair of example molecules (attached), I get the
same type of result you do.
The C++ version returns: [#6]1:[#6]:[#6]2:[#6]:[#6]:[#6]:[#6]:[#6]:2:[#7]:1
while the Python version returns:
[*]~@1~@[*]~@[*]~@[*](~!@[*]~@2~@[*]~@[*]~@[*]~@[*]~@[*]~@[*]~@[*]~@[*]~@2)~@[*]~@1
The larger (nine-membered) ring that the Python version returns is the
envelope of the fused ring system, i.e. its outer perimeter. This is certainly
a complete ring, just not an SSSR (smallest set of smallest rings) ring.
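To make the "envelope" idea concrete, here is a minimal pure-Python sketch. It makes no RDKit calls and is not how either FMCS implementation works internally; the atom numbering and the helper name ring_bonds are my own assumptions for a 5/6 fused system like the one in the example molecules. The envelope is what remains when the bond shared by the two SSSR rings is dropped:

```python
def ring_bonds(ring):
    """Return the bonds around a ring as a set of frozenset atom-index pairs."""
    n = len(ring)
    return {frozenset((ring[i], ring[(i + 1) % n])) for i in range(n)}


# Assumed numbering for an indole-like fused system: atoms 2 and 7 are the
# fusion atoms shared by both rings.
five_ring = [0, 1, 2, 7, 8]
six_ring = [2, 3, 4, 5, 6, 7]

# Symmetric difference: bonds that belong to exactly one of the two rings.
# The shared fusion bond (2-7) cancels out, leaving the outer perimeter.
envelope_bonds = ring_bonds(five_ring) ^ ring_bonds(six_ring)
envelope_atoms = set().union(*envelope_bonds)

print(len(envelope_bonds), "bonds,", len(envelope_atoms), "atoms in the envelope")
```

The result is a single closed nine-membered cycle: every atom in it has exactly two envelope bonds, so it is a complete ring even though it is not one of the two SSSR rings.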
I'm going to have to stew on this to decide whether I think this is a bug
in the C++ implementation or whether it's better for it to return the more
intuitive (visually obvious) answer. Any feedback from the community on
this is welcome.
-greg
On Fri, Jan 22, 2016 at 9:24 AM, Hannes Loeffler <[email protected]> wrote:
> Hi,
>
> I have attached some hackish code making use of FMCS which shows some
> strange behaviour in the Python implementation of FMCS. When I run with
>
> python mcs.py c++ 60
>
> I get the expected result (16 atoms in the match) regardless of the
> timeout I set (last argument; I guess 1s is the smallest that can be
> done). The 'c++' triggers the C++ implementation.
>
> But with a large timeout, that is such that it won't get exceeded, and
> the Python implementation, e.g.
>
> python mcs.py python 120
>
> I get a totally weird result with 27 atoms in the match (the example
> has 29 atoms)!
>
> If I set the timeout such that the search exceeds it (1 second works on
> my machine) the Python implementation too gives the correct result.
>
>
> Many thanks,
> Hannes.
>
>
>
>
>
> ------------------------------------------------------------------------------
> Site24x7 APM Insight: Get Deep Visibility into Application Performance
> APM + Mobile APM + RUM: Monitor 3 App instances at just $35/Month
> Monitor end-to-end web transactions and take corrective actions now
> Troubleshoot faster and improve end-user experience. Signup Now!
> http://pubads.g.doubleclick.net/gampad/clk?id=267308311&iu=/4140
> _______________________________________________
> Rdkit-discuss mailing list
> [email protected]
> https://lists.sourceforge.net/lists/listinfo/rdkit-discuss
>
>
import sys

import rdkit.Chem as rd

# Usage: python mcs.py [c++|python] [timeout in seconds]
# The first argument selects the FMCS implementation (default: C++).
if len(sys.argv) > 1:
    fmcs_impl = sys.argv[1]
else:
    fmcs_impl = 'c++'

# The second argument is the timeout; the fallback of 60 s is an arbitrary
# default so that the script also runs without arguments.
timeout = int(sys.argv[2]) if len(sys.argv) > 2 else 60

if fmcs_impl == 'c++':
    from rdkit.Chem.rdFMCS import FindMCS, AtomCompare, BondCompare

    _params = dict(maximizeBonds = False, threshold = 1.0,
                   verbose = False, matchValences = False,
                   ringMatchesRingOnly = True, completeRingsOnly = True,
                   atomCompare = AtomCompare.CompareAny,
                   bondCompare = BondCompare.CompareAny)
else:
    from rdkit.Chem.MCS import FindMCS

    _params = dict(minNumAtoms = 2, maximize = 'atoms', atomCompare = 'any',
                   bondCompare = 'any', matchValences = False,
                   ringMatchesRingOnly = True, completeRingsOnly = True,
                   threshold = None)

_params.update(timeout = timeout)

mol1 = rd.MolFromSmiles('C1CCC(C1)c1cc2ccccc2[nH]1')
mol2 = rd.MolFromSmiles('C1CCC(C1)c1cccc2cc[nH]c12')

print(rd.MolToSmiles(mol1))
print(rd.MolToSmiles(mol2))

mcs = FindMCS((mol1, mol2), **_params)

# The two implementations expose the result under different attribute names.
if fmcs_impl == 'c++':
    smarts = mcs.smartsString
    canceled = mcs.canceled
else:
    smarts = mcs.smarts
    canceled = not mcs.completed

print('cancelled =', canceled, '; nAtoms =', mcs.numAtoms, '; nBonds =',
      mcs.numBonds, '; SMARTS =', smarts)

# Map the MCS back onto both molecules to see which atoms were matched.
p = rd.MolFromSmarts(smarts)
m1 = mol1.GetSubstructMatch(p, useChirality=False)
m2 = mol2.GetSubstructMatch(p, useChirality=False)

print('match1:', m1)
print('match2:', m2)