I meant that the SMARTS not matching the atom comparisons might be a bug. The
"or" approach for combining atoms and bonds is much cleaner and makes a lot
more sense. I can actually see use cases for both.
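
To make the difference concrete, here is a minimal pure-Python sketch of the
two ways one MCS position could be written out as SMARTS. It is only an
illustration, not RDKit's actual SMARTS writer: the helper names
atom_or_smarts and atom_wildcard_smarts are made up for this example, and the
input is simply the list of atomic numbers found at that position in each
molecule.

```python
def atom_or_smarts(atomic_nums):
    """'Or' style: list each distinct atomic number seen at this position."""
    seen = []
    for z in atomic_nums:
        if z not in seen:  # keep first-seen order, drop repeats
            seen.append(z)
    return "[" + ",".join(f"#{z}" for z in seen) + "]"


def atom_wildcard_smarts(atomic_nums):
    """Wildcard style: any atom matches, whatever was actually seen."""
    return "[*]"


# Third MCS position in Greg's example below:
# N (7) in Cc1nc(O)ccc1 and C (6) in Cc1cc(O)ccc1.
third_position = [7, 6]
print(atom_or_smarts(third_position))        # [#7,#6]
print(atom_wildcard_smarts(third_position))  # [*]
print(atom_or_smarts([6, 6]))                # [#6]
```

The "or" form records what the input molecules actually contain at each
position, while the wildcard form mirrors the comparison rule used during the
search and so matches more molecules than the inputs.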

----
Brian Kelley

> On Feb 25, 2016, at 1:31 AM, Greg Landrum <[email protected]> wrote:
> 
>> On Thu, Feb 25, 2016 at 2:16 AM, Brian Kelley <[email protected]> wrote:
>> It is hard to tell if this is a bug or not, however:
>> 
>>                      atomCompare=rdFMCS.AtomCompare.CompareAny,
>>                      bondCompare=rdFMCS.BondCompare.CompareAny,
>> 
>> means that any atom matches any other atom and any bond matches any other 
>> bond.  The SMARTS being returned does not have the appropriate wildcards.
>> 
>> '[#6](:[#6](-[#6]):[#7]:[#7]:[#6]=[#8]):[#6]'
>> 
>> The MCS you actually computed should have been something like:
>> 
>> mcs = Chem.MolFromSmarts('[*](~[*](~[*])~[*]~[*]~[*]~[*])~[*]')
>> print(moli_noh.GetSubstructMatch(mcs))
>> print(molj_noh.GetSubstructMatch(mcs))
>> 
>> (0, 1, 7, 9, 10, 2, 11, 12)
>> (0, 1, 6, 8, 9, 5, 3, 2)
>> 
>> 
>> So it looks like the SMARTS generation portion of the MCS code doesn't apply 
>> the rules of the MCS matcher.  Bug?  Maybe :)
> 
> Certainly not. It's a feature, and one I really like. There is definitely a 
> bug somewhere that's causing the problems with Gaetano's example, but here's 
> a small example that shows how it's supposed to work:
> 
> In [21]: mol1 = Chem.MolFromSmiles('Cc1nc(O)ccc1')
> 
> In [22]: mol2 = Chem.MolFromSmiles('Cc1cc(O)ccc1')
> 
> In [23]: mcs = rdFMCS.FindMCS([mol1,mol2],
>                      timeout=20,
>                      atomCompare=rdFMCS.AtomCompare.CompareAny,
>                      bondCompare=rdFMCS.BondCompare.CompareAny,
>                      matchValences=False,
>                      ringMatchesRingOnly=True,
>                      completeRingsOnly=False,
>                      matchChiralTag=False)
> 
> In [25]: mcs.smartsString
> Out[25]: '[#6]-[#6]1:[#7,#6]:[#6](-[#8]):[#6]:[#6]:[#6]:1'
> 
> The interesting atom here is the third one. When the MCS is found, the 
> identity of the atoms is ignored while determining whether or not they match 
> each other, but the actual maximum common substructure of those two molecules 
> has either an N or a C as the third atom. This is what the SMARTS tells you.
> 
> Gaetano, you have found a bug. We'll look into it. 
> 
> -greg
> 
> 
_______________________________________________
Rdkit-discuss mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/rdkit-discuss
