Thanks for the help, Greg - my reaction SMARTS are now behaving themselves! I
must confess, I had not actually realised that the documentation from install
(ie the 'Book') was different to the 'Getting Started' one that I had linked
from the website.
Kind regards,
James
-----Original Message-----
From: Greg Landrum [mailto:greg.land...@gmail.com]
Sent: Fri 04/06/2010 05:13
To: James Davidson
Cc: rdkit-discuss@lists.sourceforge.net
Subject: Re: [Rdkit-discuss] A couple of questions regarding ReactionFromSmarts
Dear James,
On Thu, Jun 3, 2010 at 7:51 PM, James Davidson <j.david...@vernalis.com> wrote:
>
> First of all, I'd like to start by saying how much I've been enjoying
> exploring the functionality of RDKit - great job, Greg!
Thanks!
> I have a couple of questions regarding
> 'rdkit.Chem.AllChem.ReactionFromSmarts':
>
> (1) I see that the reaction objects can be created from MDL Reaction
> Files/Blocks - is there a way to do the reverse, and save a reaction object
> in MDL .rxn format? I tried using investigating the rxn.ToBinary()
> attribute, but didn't get very far... The reason I wanted to do this, is
> that I was trying to figure-out how to generate a form of the reaction
> object (generated from reaction SMARTS) that was suitable for converting
> into a 2D depiction of the transformation.
At the moment the reactions are essentially input-only. There's really
no way to get them out in any format that could be used elsewhere.
This is a sadly missing feature: it would be really nice to be able to
generate either .rxn files (or at least reaction smarts) from
reactions. I will add a feature request for this, but it may take a
while to happen.[1]
A workaround that kind of works is to paste the reaction smarts into
something like Marvin Sketch. It will normally display something that
at least gives some idea of what the reaction is.
> (2) I know that reaction SMARTS isn't SMIRKS, but I have noticed some
> behaviour that I would not expect - however, this could be down to my
> SMARTS-naivety; my SMIRKS-naivety; or both!
Anytime reactions behave in ways you don't expect, it's probably best
to just blame me for coming up with yet another way of expressing them
that is slightly incompatible with the existing ones. :-)
> I initially tried the
> following:
>
> from rdkit import Chem
> from rdkit.Chem import AllChem
> rxn_smarts =
> '[!#1:1]-[NH:2]-[C:3](=[O:4])-[C,c:5]>>[!#1:1]-[C:3](=[O:4])-[NH:2]-[C,c:5]'
> sm = Chem.MolFromSmiles('CC(=O)NC')
> rxn = AllChem.ReactionFromSmarts(rxn_smarts)
> prods = rxn.RunReactants((sm,))
> prod = Chem.MolToSmiles(prod[0][0])
>
>
> This gives me prod = '[H]C(=O)NC'
There's a discussion of this kind of case in the "RDKit Book"
($RDBASE/Docs/Book/RDKit_Book.pdf) starting on page 3. The short
answer is that if you have a query feature (atom list, recursive
smarts, etc.) in the reactants and you would like the matching atom to
be copied into the products you should include a dummy for that atom
in the products. A working form of your example is then:
[11]>>> rxn_smarts =
'[!#1:1]-[NH:2]-[C:3](=[O:4])-[C,c:5]>>[*:1]-[C:3](=[O:4])-[NH:2]-[*:5]'
[12]>>> rxn = AllChem.ReactionFromSmarts(rxn_smarts)
[13]>>> prods = rxn.RunReactants((Chem.MolFromSmiles('c1ccccc1C(=O)NCC1CC1'),))
[14]>>> Chem.MolToSmiles(prods[0][0])
Out[14] 'O=C(CC1CC1)Nc1ccccc1'
As an aside, in SMARTS it's shorter (and I think clearer) to write
[C,c] as [#6]. It also produces a query that runs a bit quicker, but
you probably won't notice that difference in most cases.
> If I replace with <<rxn_smarts =
> '[!H:1]-[NH:2]-[C:3](=[O:4])-[C,c:5]>>[!H:1]-[C:3](=[O:4])-[NH:2]-[C,c:5]'>>,
> I get the behaviour I want - with prod = 'CNC(=O)C'. So I think I can get
> the behaviour I want, but was curious if I am using the SMARTS ! operator
> incorrectly in conjunction with atomic numbers, or whether this may be a
> bug?
Not really a bug. The behavior when you have queries in the products
is undocumented: depending on the details of the query it will
sometimes do the right thing, sometimes not. It's much safer to just
use "*". What I probably should do is add a warning message if the
reaction contains a query in the products, I will think about this.
Best Regards,
-greg
[1] The underlying problem isn't actually generating the rxn files
themselves, they are just a collection of mol blocks with a bit of
extra verbiage sprinkled around. The problem is generating reasonable
mol blocks for molecules with query features. I already have a feature
request in for that one
(http://sourceforge.net/tracker/?group_id=160139&atid=814653), but it
turns out to not be quite as easy as it sounds.
______________________________________________________________________
PLEASE READ: This email is confidential and may be privileged. It is intended
for the named addressee(s) only and access to it by anyone else is
unauthorised. If you are not an addressee, any disclosure or copying of the
contents of this email or any action taken (or not taken) in reliance on it is
unauthorised and may be unlawful. If you have received this email in error,
please notify the sender or postmas...@vernalis.com. Email is not a secure
method of communication and the Company cannot accept responsibility for the
accuracy or completeness of this message or any attachment(s). Please check
this email for virus infection for which the Company accepts no responsibility.
If verification of this email is sought then please request a hard copy. Unless
otherwise stated, any views or opinions presented are solely those of the
author and do not represent those of the Company.
The Vernalis Group of Companies
Oakdene Court
613 Reading Road
Winnersh, Berkshire
RG41 5UA.
Tel: +44 118 977 3133
To access trading company registration and address details, please go to the
Vernalis website at www.vernalis.com and click on the "Company address and
registration details" link at the bottom of the page..
______________________________________________________________________
------------------------------------------------------------------------------
ThinkGeek and WIRED's GeekDad team up for the Ultimate
GeekDad Father's Day Giveaway. ONE MASSIVE PRIZE to the
lucky parental unit. See the prize list and enter to win:
http://p.sf.net/sfu/thinkgeek-promo
_______________________________________________
Rdkit-discuss mailing list
Rdkit-discuss@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/rdkit-discuss