Hi Stephen,

You're quite right: what you're seeing there is a bug. However, you're
using a two-year-old version of the RDKit, and a number of bugs in this area
have been fixed in the intervening time. Still, since there's potentially a
lot going on here, and I'm always nervous about chirality, I'll walk
through the steps I took to check whether things now work properly
for this case.

Let's start by making sure that the fragmentation works correctly:

In [18]: mol=Chem.MolFromSmiles('C1CCOC[C@H]1NC')

In [19]: frags=BRICS.BRICSDecompose(mol,returnMols=True)

In [20]: [Chem.MolToSmiles(x,isomericSmiles=True) for x in frags]
Out[20]: ['[5*]NC', '[15*][C@H]1CCCOC1']

In [21]: mol=Chem.MolFromSmiles('C1CCOC[C@@H]1NC')

In [22]: frags=BRICS.BRICSDecompose(mol,returnMols=True)

In [23]: [Chem.MolToSmiles(x,isomericSmiles=True) for x in frags]
Out[23]: ['[5*]NC', '[15*][C@@H]1CCCOC1']


Those both look ok, but we should try another input SMILES for the same
molecule to make sure it's still ok:

In [24]: mol=Chem.MolFromSmiles('CN[C@H]1CCCOC1')

In [25]: frags=BRICS.BRICSDecompose(mol,returnMols=True)

In [26]: [Chem.MolToSmiles(x,isomericSmiles=True) for x in frags]
Out[26]: ['[5*]NC', '[15*][C@H]1CCCOC1']


Just to be really sure, let's reorder the bonds at the chiral center again,
making sure to keep the same stereochemistry:

In [27]: mol=Chem.MolFromSmiles('CN[C@@H]1COCCC1')

In [28]: frags=BRICS.BRICSDecompose(mol,returnMols=True)

In [29]: [Chem.MolToSmiles(x,isomericSmiles=True) for x in frags]
Out[29]: ['[5*]NC', '[15*][C@H]1CCCOC1']


That also looks good, so we can have some reasonable confidence that
BRICSDecompose() is doing the right thing.
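
As a cross-check on the hand reordering above, here is a minimal sketch of the
parity rule behind it: swapping two neighbors of a stereocenter in the SMILES
flips @ and @@, so an odd permutation of the neighbor order needs the opposite
tag to describe the same center. This is plain Python, not RDKit. The function
names and the labels C2/C4 (the two ring carbons next to the chiral center,
numbering the ring oxygen as 1) are my own, and the neighbor orders were read
off the SMILES by hand, so treat it as an illustration rather than a parser.

```python
def permutation_is_even(reference, reordered):
    """Return True if `reordered` is an even permutation of `reference`."""
    if len(set(reference)) != len(reference) or sorted(reference) != sorted(reordered):
        raise ValueError("both sequences must contain the same distinct labels")
    position = {label: i for i, label in enumerate(reference)}
    perm = [position[label] for label in reordered]
    # Count inversions; the parity of that count is the parity of the permutation.
    inversions = sum(
        1
        for i in range(len(perm))
        for j in range(i + 1, len(perm))
        if perm[i] > perm[j]
    )
    return inversions % 2 == 0


def same_stereo(order_a, tag_a, order_b, tag_b):
    """Two (neighbor order, tag) descriptions agree if an even permutation
    keeps the tag or an odd permutation flips it."""
    return (tag_a == tag_b) == permutation_is_even(order_a, order_b)


# Neighbor orders as written in each SMILES (hand-derived, see the note above).
# For [C@H] the implicit H comes right after the preceding atom, and a
# ring-closure neighbor is counted where its digit appears.
canonical = (["N", "H", "C2", "C4"], "@")   # CN[C@H]1CCCOC1
reordered = (["N", "H", "C4", "C2"], "@@")  # CN[C@@H]1COCCC1
original = (["C2", "H", "C4", "N"], "@")    # C1CCOC[C@H]1NC

print(same_stereo(*canonical, *reordered))
print(same_stereo(*canonical, *original))
```

Both comparisons should agree with what the RDKit reports above: all three
spellings describe the same stereocenter.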

BreakBRICSBonds() is used by BRICSDecompose(), so we'd expect that to work
too:


In [31]: Chem.MolToSmiles(BRICS.BreakBRICSBonds(mol),isomericSmiles=True)
Out[31]: '[15*][C@H]1CCCOC1.[5*]NC'


and it does.

Now let's try putting molecules back together:

In [37]: mol=Chem.MolFromSmiles('CN[C@H]1CCCOC1')

In [38]: frags=BRICS.BRICSDecompose(mol,returnMols=True)

In [39]: [Chem.MolToSmiles(x,isomericSmiles=True) for x in BRICS.BRICSBuild(frags)]
Out[39]: ['CN[C@H]1CCCOC1']


That looks ok. What about the other way of writing the SMILES?

In [40]: mol=Chem.MolFromSmiles('CN[C@@H]1COCCC1')

In [41]: frags=BRICS.BRICSDecompose(mol,returnMols=True)

In [42]: [Chem.MolToSmiles(x,isomericSmiles=True) for x in BRICS.BRICSBuild(frags)]
Out[42]: ['CN[C@H]1CCCOC1']


Those also look ok; the bug that was in the older RDKit version has been
fixed. I'd really suggest either updating to a newer version of the RDKit
yourself or asking your IT group to do the update. We can provide help with
that here on the mailing list. If you'd rather do it less publicly,
commercial support is available for the RDKit; please contact me at
[email protected] to talk about that.
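
If it helps when re-checking after an upgrade, the comparisons above can be
bundled into a small script. This is only a sketch: the helper names are
made up for this mail, it assumes Python 3 and an installed RDKit, and it
uses only the calls that already appear in this thread. It also runs the
BreakBRICSBonds() + GetMolFrags() route from your original report. I'm not
quoting expected output here, since that depends on the RDKit version you
run it with.

```python
from collections import defaultdict


def group_by_output(smiles_variants, transform):
    """Map each transform result to the input spellings that produced it."""
    groups = defaultdict(list)
    for smi in smiles_variants:
        groups[transform(smi)].append(smi)
    return dict(groups)


def _product_smiles(Chem, products):
    """Sorted, de-duplicated isomeric SMILES for the rebuilt molecules."""
    return tuple(sorted({Chem.MolToSmiles(p, isomericSmiles=True) for p in products}))


def decompose_round_trip(smi):
    """BRICSDecompose() followed by BRICSBuild()."""
    # Imported lazily so group_by_output() stays usable without RDKit.
    from rdkit import Chem
    from rdkit.Chem import BRICS

    mol = Chem.MolFromSmiles(smi)
    if mol is None:
        raise ValueError("could not parse SMILES: %r" % smi)
    frags = BRICS.BRICSDecompose(mol, returnMols=True)
    return _product_smiles(Chem, BRICS.BRICSBuild(frags))


def break_bonds_round_trip(smi):
    """BreakBRICSBonds() + GetMolFrags() followed by BRICSBuild()."""
    from rdkit import Chem
    from rdkit.Chem import BRICS

    mol = Chem.MolFromSmiles(smi)
    if mol is None:
        raise ValueError("could not parse SMILES: %r" % smi)
    frags = Chem.GetMolFrags(BRICS.BreakBRICSBonds(mol), asMols=True)
    return _product_smiles(Chem, BRICS.BRICSBuild(frags))


def check_variants():
    """Print which input spellings lead to which rebuilt SMILES."""
    variants = ["C1CCOC[C@H]1NC", "CN[C@H]1CCCOC1", "CN[C@@H]1COCCC1"]
    for transform in (decompose_round_trip, break_bonds_round_trip):
        print(transform.__name__)
        for output, inputs in group_by_output(variants, transform).items():
            print("  ", output, "<-", inputs)
```

Calling check_variants() prints one group per distinct result. If all three
spellings land in a single group for both routes, the rebuilt molecule does
not depend on how the input SMILES was written.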

Best,
-greg






On Fri, May 12, 2017 at 10:37 AM, Stephen Pickett <[email protected]> wrote:

> Hi
>
>
>
> I have come across a difference in behaviour with the BRICS algorithms
> depending on how the molecule is fragmented when using non-canonical smiles
> input.
>
> RDKIT 2015_03, Python 2.7.10
>
>
>
> BRICSDecompose gives back the starting chirality
>
>
>
> >>> smi='C1CCOC[C@H]1NC'
>
> >>> mol=Chem.MolFromSmiles(smi)
>
> >>> cansmi=Chem.MolToSmiles(mol,1)
>
> >>> cansmi
>
> 'CN[C@H]1CCCOC1'
>
> >>> frags=BRICS.BRICSDecompose(mol,returnMols=True)
>
> >>> bm=list(BRICS.BRICSBuild(frags))
>
> >>> [Chem.MolToSmiles(m,1) for m in bm]
>
> ['CN[C@H]1CCCOC1']
>
>
>
> BreakBRICSBonds inverts the centre.
>
> >>> frags=Chem.GetMolFrags(BRICS.BreakBRICSBonds(mol),asMols=True)
>
> >>> bm=list(BRICS.BRICSBuild(frags))
>
> >>> [Chem.MolToSmiles(m,1) for m in bm]
>
> ['CN[C@@H]1CCCOC1']
>
>
>
> Starting from the canonical smiles works fine
>
> >>> smi='CN[C@H]1CCCOC1'
>
> >>> mol=Chem.MolFromSmiles(smi)
>
> >>> frags=Chem.GetMolFrags(BRICS.BreakBRICSBonds(mol),asMols=True)
>
> >>> bm=list(BRICS.BRICSBuild(frags))
>
> >>> [Chem.MolToSmiles(m,1) for m in bm]
>
> ['CN[C@H]1CCCOC1']
>
>
>
> The inversion happens in BreakBRICSBonds
>
> >>> smi='C1CCOC[C@H]1NC'
>
> >>> mol=Chem.MolFromSmiles(smi)
>
> >>> Chem.MolToSmiles(BRICS.BreakBRICSBonds(mol),1)
>
> '[15*][C@@H]1CCCOC1.[5*]NC'
>
>
>
> Using the pre canonicalised SMILES is clearly the way to go, but thought
> that this might be indicative of an issue somewhere.
>
>
>
> Regards
>
>
>
> Stephen
>
> ------------------------------
>
> This e-mail was sent by GlaxoSmithKline Services Unlimited
> (registered in England and Wales No. 1047315), which is a
> member of the GlaxoSmithKline group of companies. The
> registered address of GlaxoSmithKline Services Unlimited
> is 980 Great West Road, Brentford, Middlesex TW8 9GS.
>
> *GSK monitors email communications sent to and from GSK in order to
> protect GSK, our employees, customers, suppliers and business partners,
> from cyber threats and loss of GSK Information. GSK monitoring is conducted
> with appropriate confidentiality controls and in accordance with local laws
> and after appropriate consultation.*
>
> ------------------------------------------------------------
> ------------------
> Check out the vibrant tech community on one of the world's most
> engaging tech sites, Slashdot.org! http://sdm.link/slashdot
> _______________________________________________
> Rdkit-discuss mailing list
> [email protected]
> https://lists.sourceforge.net/lists/listinfo/rdkit-discuss
>
>