Interesting question. I'm not sure whether it's relevant, but ECBlast does 
provide canonicalised reaction labels. I agree with Greg that atom-atom 
mapping (AAM) is important.

https://github.com/asad/ReactionDecoder

http://www.ebi.ac.uk/thornton-srv/software/rbl/

Regards,
Asad

> On 16 Dec 2016, at 14:42, Stephen Pickett <stephen.d.pick...@gsk.com> wrote:
> 
> Thanks Greg, that’s clear.
>  
> Stephen
>  
> From: Greg Landrum [mailto:greg.land...@gmail.com] 
> Sent: 16 December 2016 14:33
> To: Stephen Pickett
> Cc: rdkit-discuss@lists.sourceforge.net
> Subject: Re: [Rdkit-discuss] Canonicalisation with reaction labels
>  
> Hi Stephen,
>  
> The new canonicalization algorithm intentionally takes the atom-mapping 
> information into account. The logic is that the entire SMILES provided should 
> be canonical, so if the SMILES includes atom maps, those atom maps should be 
> considered while canonicalizing.
>  
> If you have a molecule with atom maps and you would like the canonical SMILES 
> without the maps, you can do this (with the most recent version of the code):
>  
> In [18]: mol = Chem.MolFromSmiles('C1CC([*:1])CCN1')
>  
> In [19]: nmol = Chem.Mol(mol)
>  
> In [20]: for at in nmol.GetAtoms(): at.SetAtomMapNum(0)
>  
> In [21]: Chem.MolToSmiles(mol,True)
> Out[21]: 'C1CC([*:1])CCN1'
>  
> In [22]: Chem.MolToSmiles(nmol,True)
> Out[22]: '[*]C1CCNCC1'
>  
> A somewhat less clear (IMO) way of doing this that works in all versions is:
>  
> In [27]: nmol = Chem.Mol(mol)
>  
> In [28]: for at in nmol.GetAtoms(): at.ClearProp('molAtomMapNumber')
>  
> In [29]: Chem.MolToSmiles(nmol,True)
> Out[29]: '[*]C1CCNCC1'
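
When only the SMILES text is at hand and there is no molecule object yet, one possible shortcut is to strip the map labels textually before parsing. The sketch below is not part of RDKit and is not from the thread: `strip_atom_maps` is a hypothetical helper built on the standard `re` module. It assumes plain SMILES input (not SMARTS with recursive queries), and its output is not canonical, so it would still need to go through Chem.MolFromSmiles and Chem.MolToSmiles as shown above.

```python
import re

# Matches a bracket atom ending in an atom-map label, e.g. "[*:1]" or "[CH3:12]".
# Group 1 keeps everything before the ":<digits>" label.
_ATOM_MAP = re.compile(r"(\[[^\[\]]*?):\d+\]")


def strip_atom_maps(smiles: str) -> str:
    """Return the SMILES text with atom-map labels removed (hypothetical helper)."""
    # Brackets are kept, so "[C:1]" becomes "[C]" and hydrogen counts are unchanged.
    return _ATOM_MAP.sub(r"\1]", smiles)


print(strip_atom_maps("C1CC([*:1])CCN1"))  # C1CC([*])CCN1
```

With this, the labelled input from the original question reduces to the same text as the unlabelled one before any canonicalization is attempted. Aromatic bond symbols such as `c:c` sit outside brackets and are left alone.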
>  
>  
> I hope this helps,
> -greg
>  
>  
>  
> On Fri, Dec 16, 2016 at 1:55 PM, Stephen Pickett <stephen.d.pick...@gsk.com> 
> wrote:
> Hi
>  
> With a 2013 RDKit install we get consistent canonicalization between 
> reaction-labelled and unlabelled atoms:
> >>> mol = Chem.MolFromSmiles('C1CC([*])CCN1')
> >>> Chem.MolToSmiles(mol)
> '[*]C1CCNCC1'
> >>> mol = Chem.MolFromSmiles('C1CC([*:1])CCN1')
> >>> Chem.MolToSmiles(mol)
> '[*:1]C1CCNCC1'
>  
> With the 2015-09 release we are seeing differences:
> >>> mol = Chem.MolFromSmiles('C1CC([*])CCN1')
> >>> Chem.MolToSmiles(mol)
> '[*]C1CCNCC1'
> >>> mol = Chem.MolFromSmiles('C1CC([*:1])CCN1')
> >>> Chem.MolToSmiles(mol)
> 'C1CC([*:1])CCN1'
>  
> I can understand why canonicalization might differ between versions, but is 
> this change in behaviour expected? I’m afraid I don’t have ready access to a 
> more recent install to test this.
>  
> Thanks
>  
> Stephen
>  
> 
> This e-mail was sent by GlaxoSmithKline Services Unlimited
> (registered in England and Wales No. 1047315), which is a
> member of the GlaxoSmithKline group of companies. The
> registered address of GlaxoSmithKline Services Unlimited
> is 980 Great West Road, Brentford, Middlesex TW8 9GS.
> GSK monitors email communications sent to and from GSK in order to protect 
> GSK, our employees, customers, suppliers and business partners, from cyber 
> threats and loss of GSK Information. GSK monitoring is conducted with 
> appropriate confidentiality controls and in accordance with local laws and 
> after appropriate consultation.
> 
> 
> ------------------------------------------------------------------------------
> Check out the vibrant tech community on one of the world's most
> engaging tech sites, SlashDot.org! http://sdm.link/slashdot
> _______________________________________________
> Rdkit-discuss mailing list
> Rdkit-discuss@lists.sourceforge.net
> https://lists.sourceforge.net/lists/listinfo/rdkit-discuss