Hi Luca,
On Fri, Jun 17, 2016 at 5:28 PM, Luca Fenu - Network <
fenu_l...@network.lilly.com> wrote:
>
>
> I’m back again with what seems to me like puzzling behaviour from the
> rdfMCS.findMCS function when dealing with fused ring systems:
>
>
>
> > mol3RA = Chem.MolFromSmiles('c1ccc2cc3ccccc3cc2c1')
>
> > mol3RB = Chem.MolFromSmiles('c1ccc2c(c1)ccc1ccccc21')
>
> > rdFMCS.FindMCS([mol3RA, mol3RB], matchValences=True,
>
> matchChiralTag=True,
>
> ringMatchesRingOnly=True,
>
> completeRingsOnly=True # False will return the same result
>
> ).smartsString
>
> '[#6]-[#6]1:[#6]:[#6]:[#6]:[#6]:[#6]:1'
>
> Or, in pictures:
>
> The MCS shown above is the same whether or not I turn on the
> ‘completeRingsOnly flag.
>
> So, the atoms are all there, but the MCS ring can’t be closed because the
> attachment point doesn’t match.
>
> Since we’re requesting completeRingsOnly, wouldn’t make sense for the MCS
> be reduced to just a naphthalene?
>
> What’s the rationale behind the current implementation? Is there a way to
> force it to drop the open circle?
>
>
That's a bug in the current implementation. I'll take a look at it and see
if I can track it down.
> And another, related, question: is there a way to make the MCS computation
> even more restrictive,
>
> only returning a ring as part of the MCS *if the whole fused system is
> matched*, not just a portion (albeit a complete ring by itself?). In the
> example described above, there would be no MCS.
>
>
>
> I’d be fine with some post-processing to the same aim. Another example:
>
>
>
>
That's an interesting question. I think you can get at least part of what
you're looking for using the "atom labeling" trick from here:
http://rdkit.org/docs/Cookbook.html#using-custom-mcs-atom-types
Here's a pass at the first step of that:
In [70]: def label(a):
return 100*a.GetOwningMol().GetRingInfo().NumAtomRings(a.GetIdx()) +
a.GetAtomicNum()
....:
In [71]: bz = Chem.MolFromSmiles('c1ccccc1')
In [72]: napth = Chem.MolFromSmiles('c1ccc2ccccc2c1')
In [73]: rdFMCS.FindMCS([bz,napth]).smartsString
Out[73]: '[#6]1:[#6]:[#6]:[#6]:[#6]:[#6]:1'
In [74]: nms = [Chem.Mol(x) for x in (bz,napth)]
In [75]: for nm in nms:
....: for at in nm.GetAtoms():
....: at.SetIsotope(label(at))
....:
In [76]: mcs =
rdFMCS.FindMCS(nms,atomCompare=rdFMCS.AtomCompare.CompareIsotopes)
In [77]: mcs.smartsString
Out[77]: '[106*](:[106*]:[106*]):[106*]'
In [78]: mcs =
rdFMCS.FindMCS(nms,atomCompare=rdFMCS.AtomCompare.CompareIsotopes,completeRingsOnly=True)
In [79]: mcs.smartsString
Out[79]: ''
Does that help?
-greg
------------------------------------------------------------------------------
What NetFlow Analyzer can do for you? Monitors network bandwidth and traffic
patterns at an interface-level. Reveals which users, apps, and protocols are
consuming the most bandwidth. Provides multi-vendor support for NetFlow,
J-Flow, sFlow and other flows. Make informed decisions using capacity planning
reports. http://sdm.link/zohomanageengine
_______________________________________________
Rdkit-discuss mailing list
Rdkit-discuss@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/rdkit-discuss