Hi,

This is probably related to the above, so I thought I'd post it on this
thread. I am seeing inconsistent behaviour when HasSubstructMatch is called
on a molecule built from a SMARTS that contains an 'or' expression, as
opposed to that molecule being passed as the argument to HasSubstructMatch.
A simple example follows:

> from rdkit import Chem
> O_or_C = Chem.MolFromSmarts('[O,C]')
> O = Chem.MolFromSmiles('O')
> C = Chem.MolFromSmiles('C')
> O_or_C.HasSubstructMatch(O)
True
> O_or_C.HasSubstructMatch(C)
False
> O.HasSubstructMatch(O_or_C)
True
> C.HasSubstructMatch(O_or_C)
True

We also see:
> C_or_O = Chem.MolFromSmarts('[C,O]')
> C_or_O.HasSubstructMatch(O)
False
> C_or_O.HasSubstructMatch(C)
True

so when the SMARTS-built molecule is the one being searched, the order of
the elements in the 'or' expression changes the result, which is unexpected.
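
To make the asymmetry concrete, here is a small pure-Python toy model (not
RDKit code) of the matching rules Greg describes in the message quoted
below. The class names ToyAtom and ToyQueryAtom and the helper or_query are
made up for illustration. The one thing I am assuming rather than taking from
Greg's mail is that a query atom also carries a plain element, taken from the
first alternative in the 'or' list; I have not checked this against the RDKit
source, it is simply the simplest assumption that reproduces the outputs above.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class ToyAtom:
    """Stand-in for an atom in a molecule built from SMILES."""
    symbol: str

    def match(self, other: "ToyAtom") -> bool:
        # Plain atom as the pattern: only the stored element is compared.
        # Any query attached to `other` is ignored.
        return self.symbol == other.symbol


@dataclass(frozen=True)
class ToyQueryAtom(ToyAtom):
    """Stand-in for an atom in a molecule built from SMARTS."""
    alternatives: Tuple[str, ...] = ()

    def match(self, other: ToyAtom) -> bool:
        if isinstance(other, ToyQueryAtom):
            # Query vs query: compare the queries themselves (crudely,
            # in this toy, by comparing the alternatives in order).
            return self.alternatives == other.alternatives
        # Query vs atom: does the atom satisfy the query?
        return other.symbol in self.alternatives


def or_query(*symbols: str) -> ToyQueryAtom:
    # ASSUMPTION (hypothetical): the stored element is the first alternative.
    return ToyQueryAtom(symbol=symbols[0], alternatives=tuple(symbols))


O, C = ToyAtom("O"), ToyAtom("C")
O_or_C = or_query("O", "C")
C_or_O = or_query("C", "O")

# SMARTS as the pattern, like O.HasSubstructMatch(O_or_C)
print(O_or_C.match(O), O_or_C.match(C))
# SMILES atom as the pattern, like O_or_C.HasSubstructMatch(O) and (C)
print(O.match(O_or_C), C.match(O_or_C))
# Same again with the alternatives swapped
print(O.match(C_or_O), C.match(C_or_O))
```

In this toy, the pattern atom is always the one whose match() is called, so
O_or_C.HasSubstructMatch(C) corresponds to C.match(O_or_C). If the assumption
holds, the practical advice is to keep the SMARTS-built molecule as the
argument to HasSubstructMatch and the SMILES-built molecule as the one being
searched.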

Yours,

Toby Wright

--
InhibOx Ltd


On 5 March 2014 10:10, Christos Kannas <chriskan...@gmail.com> wrote:

> Hi Greg,
>
> Thanks a lot for the explanation.
> It makes things clearer now.
> The reason I'm doing a SMARTS-SMARTS match is that I would like to
> match functional groups against the reactants in reactions.
>
> Regards,
>
> Christos
>
> Christos Kannas
>
> Researcher
> Ph.D Student
>
> Mob (UK): +44 (0) 7447700937
> Mob (Cyprus): +357 99530608
>
>
>
> On 5 March 2014 04:44, Greg Landrum <greg.land...@gmail.com> wrote:
>
>> Hi Christos,
>>
>>
>> On Tue, Mar 4, 2014 at 3:46 PM, Christos Kannas <chriskan...@gmail.com> wrote:
>>
>>> Hi all,
>>>
>>> Why does the following happen?
>>>
>>> In [1]: from rdkit import Chem
>>> In [2]: from rdkit.Chem import AllChem
>>> In [3]: from rdkit.Chem import Draw
>>>
>>> In [4]: patt = Chem.MolFromSmarts("[CH;D2;!$(C-[!#6;!#1])]=O")
>>>
>>> In [5]: z2 = Chem.MolFromSmarts("[*]-C-C([H])(=O)", 1)
>>> In [6]: print Chem.MolToSmiles(z2)
>>> [*]CC=O
>>> In [7]: print Chem.MolToSmarts(z2)
>>> *-C-[C&!H0]=O
>>> In [9]: z2.HasSubstructMatch(patt)
>>> Out[9]: False
>>>
>>> In [10]: z3 = Chem.MolFromSmiles(Chem.MolToSmiles(z2))
>>> In [11]: print Chem.MolToSmiles(z3)
>>> [*]CC=O
>>> In [12]: print Chem.MolToSmarts(z3)
>>> [*]-[#6]-[#6]=[#8]
>>> In [13]: z3.HasSubstructMatch(patt)
>>> Out[13]: True
>>>
>>> Shouldn't z2 and z3 contain the same information?
>>>
>>
>> SMARTS/SMARTS matches are handled differently from SMARTS/SMILES
>> matches.
>> The short answer is that when doing a SMARTS/SMARTS match, the RDKit
>> compares the queries to each other; when doing a SMARTS/SMILES match, on
>> the other hand, it checks to see if the atoms in the SMILES molecule match
>> the queries in the SMARTS molecule.
>>
>> A slightly longer answer:
>> Molecules built using MolFromSmiles contain Atoms; molecules built using
>> MolFromSmarts contain QueryAtoms. Both Atoms and QueryAtoms have a Match()
>> method that takes another Atom or QueryAtom as an argument and returns
>> whether or not the two match.
>> The substructure matching code makes heavy use of this Match() method.
>> QueryAtom.Match(Atom) checks to see if the Atom satisfies the query.
>> QueryAtom.Match(QueryAtom) checks to see if the queries on the atoms are
>> the same. This uses a crude approach that is easy to fool, but I assume
>> that a SMARTS-SMARTS match is not something people want to do frequently.
>> Query-query matching is also not a particularly easy problem to solve in a
>> general way.
>>
>> -greg
>>
>>
>>
>
>
>
> _______________________________________________
> Rdkit-discuss mailing list
> Rdkit-discuss@lists.sourceforge.net
> https://lists.sourceforge.net/lists/listinfo/rdkit-discuss
>
>