Thanks, Greg.
I’m hoping we can get to 17-03.
Stephen
From: Greg Landrum [mailto:greg.land...@gmail.com]
Sent: 16 May 2017 06:22
To: Stephen Pickett
Cc: rdkit-discuss@lists.sourceforge.net
Subject: Re: [Rdkit-discuss] Differences in chirality with BRICS fragmentation
Hi Stephen,
You're perfectly correct: what you're seeing there is a bug. However, you're
using a two-year-old version of the RDKit, and a number of bugs in this area
have been fixed in the intervening time. Still, since there's potentially a lot
going on here, and I'm always nervous about chirality, I will walk through the
steps I took to figure out whether or not things work properly now for this
case.
Let's start with making sure that the fragmentation works correctly:
In [18]: mol=Chem.MolFromSmiles('C1CCOC[C@H]1NC')
In [19]: frags=BRICS.BRICSDecompose(mol,returnMols=True)
In [20]: [Chem.MolToSmiles(x,isomericSmiles=True) for x in frags]
Out[20]: ['[5*]NC', '[15*][C@H]1CCCOC1']
In [21]: mol=Chem.MolFromSmiles('C1CCOC[C@@H]1NC')
In [22]: frags=BRICS.BRICSDecompose(mol,returnMols=True)
In [23]: [Chem.MolToSmiles(x,isomericSmiles=True) for x in frags]
Out[23]: ['[5*]NC', '[15*][C@@H]1CCCOC1']
Those both look ok, but we should try another input SMILES for the same
molecule to make sure it's still ok:
In [24]: mol=Chem.MolFromSmiles('CN[C@H]1CCCOC1')
In [25]: frags=BRICS.BRICSDecompose(mol,returnMols=True)
In [26]: [Chem.MolToSmiles(x,isomericSmiles=True) for x in frags]
Out[26]: ['[5*]NC', '[15*][C@H]1CCCOC1']
Just to be really sure, let's reorder the bonds at the chiral center again,
making sure to keep the same stereochemistry:
In [27]: mol=Chem.MolFromSmiles('CN[C@@H]1COCCC1')
In [28]: frags=BRICS.BRICSDecompose(mol,returnMols=True)
In [29]: [Chem.MolToSmiles(x,isomericSmiles=True) for x in frags]
Out[29]: ['[5*]NC', '[15*][C@H]1CCCOC1']
That also looks good, so we can have some reasonable confidence that
BRICSDecompose() is doing the right thing.
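The manual checks above can also be wrapped in a small loop. This is only a sketch, assuming a reasonably recent RDKit release: fragment_smiles is a hypothetical helper name (not part of the RDKit), and the three input SMILES are the spellings of the same enantiomer used in the session above.

```python
from rdkit import Chem
from rdkit.Chem import BRICS


def fragment_smiles(smiles):
    """Sorted isomeric SMILES of the BRICS fragments (hypothetical helper)."""
    mol = Chem.MolFromSmiles(smiles)
    if mol is None:
        raise ValueError("could not parse SMILES: %r" % smiles)
    frags = BRICS.BRICSDecompose(mol, returnMols=True)
    return sorted(Chem.MolToSmiles(x, isomericSmiles=True) for x in frags)


# Three spellings of the same enantiomer, taken from the session above.
same_enantiomer = ['C1CCOC[C@H]1NC', 'CN[C@H]1CCCOC1', 'CN[C@@H]1COCCC1']
reference = fragment_smiles(same_enantiomer[0])
for smi in same_enantiomer:
    # Prints True when the fragments do not depend on the input atom order.
    print(smi, fragment_smiles(smi) == reference)
```

If any line prints False, the fragmentation depends on how the input SMILES was written, which is the kind of problem reported in the original message.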
BreakBRICSBonds() is used by BRICSDecompose(), so we'd expect that to work too:
In [31]: Chem.MolToSmiles(BRICS.BreakBRICSBonds(mol),isomericSmiles=True)
Out[31]: '[15*][C@H]1CCCOC1.[5*]NC'
and it does.
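The same kind of consistency check can be sketched for BreakBRICSBonds(), using the non-canonical input from the original report alongside the other spellings. Again, broken_bond_smiles is a hypothetical helper name, and this assumes a recent RDKit release rather than the 2015 one.

```python
from rdkit import Chem
from rdkit.Chem import BRICS


def broken_bond_smiles(smiles):
    """Isomeric SMILES after cutting all BRICS bonds (hypothetical helper)."""
    mol = Chem.MolFromSmiles(smiles)
    if mol is None:
        raise ValueError("could not parse SMILES: %r" % smiles)
    # BreakBRICSBonds returns one molecule containing disconnected fragments,
    # so the SMILES has the fragments separated by '.'.
    return Chem.MolToSmiles(BRICS.BreakBRICSBonds(mol), isomericSmiles=True)


spellings = ['C1CCOC[C@H]1NC', 'CN[C@H]1CCCOC1', 'CN[C@@H]1COCCC1']
outputs = {smi: broken_bond_smiles(smi) for smi in spellings}
for smi, out in outputs.items():
    print(smi, '->', out)
# A single distinct output means the chiral centre survived in every case.
print('all identical:', len(set(outputs.values())) == 1)
```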
Now let's try putting molecules back together:
In [37]: mol=Chem.MolFromSmiles('CN[C@H]1CCCOC1')
In [38]: frags=BRICS.BRICSDecompose(mol,returnMols=True)
In [39]: [Chem.MolToSmiles(x,isomericSmiles=True) for x in BRICS.BRICSBuild(frags)]
Out[39]: ['CN[C@H]1CCCOC1']
That looks ok. What about the other way of writing the SMILES?
In [40]: mol=Chem.MolFromSmiles('CN[C@@H]1COCCC1')
In [41]: frags=BRICS.BRICSDecompose(mol,returnMols=True)
In [42]: [Chem.MolToSmiles(x,isomericSmiles=True) for x in BRICS.BRICSBuild(frags)]
Out[42]: ['CN[C@H]1CCCOC1']
Those also look ok; the bug that was in the older RDKit version has been fixed.
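For anyone who wants to rerun this round trip on their own installation, here is a minimal sketch. brics_round_trip is a hypothetical helper name, not an RDKit function; it re-parses each rebuilt product so that the comparison is between canonical SMILES generated the same way on both sides.

```python
from rdkit import Chem
from rdkit.Chem import BRICS


def brics_round_trip(smiles):
    """Return (canonical input SMILES, set of rebuilt SMILES).

    Hypothetical helper: decompose with BRICSDecompose, rebuild with
    BRICSBuild, and canonicalize everything for comparison.
    """
    mol = Chem.MolFromSmiles(smiles)
    if mol is None:
        raise ValueError("could not parse SMILES: %r" % smiles)
    canonical = Chem.MolToSmiles(mol, isomericSmiles=True)
    frags = list(BRICS.BRICSDecompose(mol, returnMols=True))
    rebuilt = set()
    for prod in BRICS.BRICSBuild(frags):
        # Re-parse each product so it is sanitized and canonicalized
        # like the input molecule before the strings are compared.
        smi = Chem.MolToSmiles(prod, isomericSmiles=True)
        reparsed = Chem.MolFromSmiles(smi)
        if reparsed is not None:
            rebuilt.add(Chem.MolToSmiles(reparsed, isomericSmiles=True))
    return canonical, rebuilt


for smi in ['CN[C@H]1CCCOC1', 'CN[C@@H]1COCCC1', 'C1CCOC[C@H]1NC']:
    canonical, rebuilt = brics_round_trip(smi)
    # True means the starting stereoisomer was recovered.
    print(smi, canonical in rebuilt)
```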
I'd really suggest either updating to a newer version of the RDKit yourself or
talking to your IT group and asking them to do the update. We can provide help
on that here on the mailing list. If you'd rather do it less publicly,
commercial support is available for the RDKit; please contact me at
greg.land...@t5informatics.com to talk about that.
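A quick way to see which release is installed is sketched below. release_tuple and REPORTED_RELEASE are hypothetical names; the only version taken from this thread is 2015_03, the release in which the problem was reported. Which later release first contains the fix is not stated here, so the check only flags installations that are no newer than the reported one.

```python
from rdkit import rdBase


def release_tuple(version):
    """Turn a version string such as '2015.03.1' into (2015, 3)."""
    year, month = version.split('.')[:2]
    return int(year), int(month)


# ASSUMPTION: 2015.03 is used only because the report below names it as the
# release showing the inverted centre; it is not a documented fix boundary.
REPORTED_RELEASE = (2015, 3)

installed = release_tuple(rdBase.rdkitVersion)
if installed <= REPORTED_RELEASE:
    print('RDKit', rdBase.rdkitVersion, 'is no newer than the reported release')
else:
    print('RDKit', rdBase.rdkitVersion, 'is newer than the reported release')
```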
Best,
-greg
On Fri, May 12, 2017 at 10:37 AM, Stephen Pickett
<stephen.d.pick...@gsk.com> wrote:
Hi
I have come across a difference in behaviour with the BRICS algorithms
depending on how the molecule is fragmented when using non-canonical SMILES
input.
RDKit 2015_03, Python 2.7.10
BRICSDecompose gives back the starting chirality:
>>> smi='C1CCOC[C@H]1NC'
>>> mol=Chem.MolFromSmiles(smi)
>>> cansmi=Chem.MolToSmiles(mol,1)
>>> cansmi
'CN[C@H]1CCCOC1'
>>> frags=BRICS.BRICSDecompose(mol,returnMols=True)
>>> bm=list(BRICS.BRICSBuild(frags))
>>> [Chem.MolToSmiles(m,1) for m in bm]
['CN[C@H]1CCCOC1']
BreakBRICSBonds inverts the centre.
>>> frags=Chem.GetMolFrags(BRICS.BreakBRICSBonds(mol),asMols=True)
>>> bm=list(BRICS.BRICSBuild(frags))
>>> [Chem.MolToSmiles(m,1) for m in bm]
['CN[C@@H]1CCCOC1']
Starting from the canonical SMILES works fine:
>>> smi='CN[C@H]1CCCOC1'
>>> mol=Chem.MolFromSmiles(smi)
>>> frags=Chem.GetMolFrags(BRICS.BreakBRICSBonds(mol),asMols=True)
>>> bm=list(BRICS.BRICSBuild(frags))
>>> [Chem.MolToSmiles(m,1) for m in bm]
['CN[C@H]1CCCOC1']
The inversion happens in BreakBRICSBonds
>>> smi='C1CCOC[C@H]1NC'
>>> mol=Chem.MolFromSmiles(smi)
>>> Chem.MolToSmiles(BRICS.BreakBRICSBonds(mol),1)
'[15*][C@@H]1CCCOC1.[5*]NC'
Using the pre-canonicalised SMILES is clearly the way to go, but I thought that
this might be indicative of an issue somewhere.
Regards
Stephen
________________________________
This e-mail was sent by GlaxoSmithKline Services Unlimited
(registered in England and Wales No. 1047315), which is a
member of the GlaxoSmithKline group of companies. The
registered address of GlaxoSmithKline Services Unlimited
is 980 Great West Road, Brentford, Middlesex TW8 9GS.
GSK monitors email communications sent to and from GSK in order to protect GSK,
our employees, customers, suppliers and business partners, from cyber threats
and loss of GSK Information. GSK monitoring is conducted with appropriate
confidentiality controls and in accordance with local laws and after
appropriate consultation.
------------------------------------------------------------------------------
Check out the vibrant tech community on one of the world's most
engaging tech sites, Slashdot.org! http://sdm.link/slashdot
_______________________________________________
Rdkit-discuss mailing list
Rdkit-discuss@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/rdkit-discuss