It's no fair reviving old items on difficult topics like stereochemistry!
;-)

This is due to a bug in BRICS.BreakBRICSBonds(): stereochemistry isn't
handled correctly.
I have to admit that I'm surprised by this: I expected that this code would
behave properly, but it clearly doesn't. That's a bug for me to look into.

Your other approach, using BRICS.BRICSDecompose(), uses different code, the
ChemicalReaction machinery, to fragment the molecules. This does a better
job of handling stereochemistry.
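
A quick way to check a given input against both routes is to rebuild from
each set of fragments and see whether the canonical isomeric SMILES of the
input comes back. The following is only a sketch: the helper name
brics_roundtrip and the explicit SanitizeMol() call on the BRICSBuild
products are assumptions of mine, not something from the session quoted
below, and the answer for BreakBRICSBonds() will depend on the RDKit version.

```python
from rdkit import Chem
from rdkit.Chem import BRICS


def brics_roundtrip(smiles):
    """Report, per fragmentation route, whether BRICSBuild regenerates the input.

    Hypothetical helper for illustration; not part of the RDKit API.
    """
    mol = Chem.MolFromSmiles(smiles)
    if mol is None:
        raise ValueError(f"could not parse SMILES: {smiles!r}")
    # Canonical isomeric SMILES of the input is the reference for comparison.
    target = Chem.MolToSmiles(mol, isomericSmiles=True)

    routes = {
        # Route 1: fragments produced by BRICSDecompose().
        "BRICSDecompose": list(BRICS.BRICSDecompose(mol, returnMols=True)),
        # Route 2: break the bonds in place, then split into fragments.
        "BreakBRICSBonds": list(
            Chem.GetMolFrags(BRICS.BreakBRICSBonds(mol), asMols=True)
        ),
    }

    result = {}
    for name, frags in routes.items():
        rebuilt = set()
        for product in BRICS.BRICSBuild(frags):
            # Assumption: sanitize the products before writing SMILES.
            Chem.SanitizeMol(product)
            rebuilt.add(Chem.MolToSmiles(product, isomericSmiles=True))
        # True if the input molecule, with its stereo, is among the products.
        result[name] = target in rebuilt
    return result
```

Calling brics_roundtrip('CNc1ccccc1[C@H](C)NC') returns a dict with one
True/False entry per route; a False for BreakBRICSBonds on a chiral input
would be consistent with the inversion reported below.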

Thanks for pointing this out and sorry for the quite-delayed reply.

-greg
p.s. In my reply when this thread originally came up I said that
BRICSDecompose() uses BreakBRICSBonds(). This is incorrect; I wrote that
email too quickly.



On Wed, Jan 10, 2018 at 3:15 PM, Stephen Pickett <stephen.d.pick...@gsk.com>
wrote:

> Hi
>
>
>
> Coming back to this thread as I have found a similar issue with rdkit
> 17-03/09.
>
>
>
> BRICS.BreakBRICSBonds is inverting stereochemistry for some inputs.
>
>
>
> >>> smi='CNc1ccccc1[C@H](C)NC'
>
> >>> mol=Chem.MolFromSmiles(smi)
>
> # we are using rdkit canonicalized smiles
>
> >>> Chem.MolToSmiles(mol,1)
>
> 'CNc1ccccc1[C@H](C)NC'
>
> >>> frags=BRICS.BRICSDecompose(mol,returnMols=True)
>
> >>> bm=list(BRICS.BRICSBuild(frags))
>
> # input is the first molecule in the list
>
> >>> [Chem.MolToSmiles(m,1) for m in bm]
>
> ['CNc1ccccc1[C@H](C)NC', 'CN[C@@H](C)c1ccccc1[C@H](C)NC',
> 'CNc1ccccc1-c1ccccc1[C@H](C)NC', 'CNc1ccccc1NC', 'CNc1ccccc1-c1ccccc1NC',
> 'CNc1ccccc1-c1ccccc1-c1ccccc1NC']
>
> >>> frags=Chem.GetMolFrags(BRICS.BreakBRICSBonds(mol),asMols=True)
>
> >>> bm=list(BRICS.BRICSBuild(frags))
>
> # input is the second in the list with inverted stereochem
>
> >>> [Chem.MolToSmiles(m,1) for m in bm]
>
> ['CNc1ccccc1NC', 'CNc1ccccc1[C@@H](C)NC', 'CNc1ccccc1-c1ccccc1NC',
> 'CNc1ccccc1-c1ccccc1-c1ccccc1NC', 'CNc1ccccc1-c1ccccc1[C@@H](C)NC',
> 'CN[C@H](C)c1ccccc1[C@@H](C)NC']
>
>
>
> Interestingly, if I make a small change to the molecule
>
> 'COc1ccccc1[C@H](C)NC'
>
> Using the smiles as written gives the same issue.
>
>
>
> >>> mol=Chem.MolFromSmiles('COc1ccccc1[C@H](C)NC')
>
> >>> frags=Chem.GetMolFrags(BRICS.BreakBRICSBonds(mol),asMols=True)
>
> >>> bm=list(BRICS.BRICSBuild(frags))
>
> >>> [Chem.MolToSmiles(m,1) for m in bm]
>
> […, 'CN[C@H](C)c1ccccc1OC', …]
>
>
>
> However, this is not the RDKit canonical atom ordering for this molecule.
>
> If I use the RDKit canonical smiles to build the molecule 
> ('CN[C@@H](C)c1ccccc1OC'),
> BreakBRICSBonds works fine and I can regenerate the initial molecule with
> BRICSBuild.
>
>
>
> >>> mol=Chem.MolFromSmiles('CN[C@@H](C)c1ccccc1OC')
>
> >>> frags=Chem.GetMolFrags(BRICS.BreakBRICSBonds(mol),asMols=True)
>
> >>> bm=list(BRICS.BRICSBuild(frags))
>
> >>> [Chem.MolToSmiles(m,1) for m in bm]
>
> […., 'CN[C@@H](C)c1ccccc1OC', ….]
>
>
>
> Regards
>
>
>
> Stephen
>
>
>
> *From:* Stephen Pickett
> *Sent:* 16 May 2017 09:01
> *To:* Greg Landrum <greg.land...@gmail.com>
> *Cc:* rdkit-discuss@lists.sourceforge.net
> *Subject:* RE: [Rdkit-discuss] Differences in chirality with BRICS
> fragmentation
>
>
>
> Thanks Greg
>
>
>
> I’m hoping we can get to 17-03
>
>
>
> Stephen
>
>
>
> *From:* Greg Landrum [mailto:greg.land...@gmail.com
> <greg.land...@gmail.com>]
> *Sent:* 16 May 2017 06:22
> *To:* Stephen Pickett
> *Cc:* rdkit-discuss@lists.sourceforge.net
> *Subject:* Re: [Rdkit-discuss] Differences in chirality with BRICS
> fragmentation
>
>
>
>
> Hi Stephen,
>
>
>
> You're perfectly correct, what you're seeing there is a bug. However,
> you're using a two-year-old version of the RDKit and a number of bugs in
> this area have been fixed in the intervening time. Still, since there's
> potentially a lot going on here, and I'm always nervous about chirality, I
> will walk through the steps I took to figure out whether or not things work
> properly now for this case.
>
>
>
> Let's start with making sure that the fragmentation works correctly:
>
> In [18]: mol=Chem.MolFromSmiles('C1CCOC[C@H]1NC')
>
>
>
> In [19]: frags=BRICS.BRICSDecompose(mol,returnMols=True)
>
>
>
> In [20]: [Chem.MolToSmiles(x,isomericSmiles=True) for x in frags]
>
> Out[20]: ['[5*]NC', '[15*][C@H]1CCCOC1']
>
>
>
> In [21]: mol=Chem.MolFromSmiles('C1CCOC[C@@H]1NC')
>
>
>
> In [22]: frags=BRICS.BRICSDecompose(mol,returnMols=True)
>
>
>
> In [23]: [Chem.MolToSmiles(x,isomericSmiles=True) for x in frags]
>
> Out[23]: ['[5*]NC', '[15*][C@@H]1CCCOC1']
>
>
>
> Those both look ok, but we should try another input SMILES for the same
> molecule to make sure it's still ok:
>
>
>
> In [24]: mol=Chem.MolFromSmiles('CN[C@H]1CCCOC1')
>
>
>
> In [25]: frags=BRICS.BRICSDecompose(mol,returnMols=True)
>
>
>
> In [26]: [Chem.MolToSmiles(x,isomericSmiles=True) for x in frags]
>
> Out[26]: ['[5*]NC', '[15*][C@H]1CCCOC1']
>
>
>
> Just to be really sure, let's reorder the bonds at the chiral center
> again, making sure to keep the same stereochemistry:
>
> In [27]: mol=Chem.MolFromSmiles('CN[C@@H]1COCCC1')
>
>
>
> In [28]: frags=BRICS.BRICSDecompose(mol,returnMols=True)
>
>
>
> In [29]: [Chem.MolToSmiles(x,isomericSmiles=True) for x in frags]
>
> Out[29]: ['[5*]NC', '[15*][C@H]1CCCOC1']
>
>
>
> That also looks good, so we can have some reasonable confidence that
> BRICSDecompose() is doing the right thing.
>
>
>
> BreakBRICSBonds() is used by BRICSDecompose(), so we'd expect that to work
> too:
>
>
>
> In [31]: Chem.MolToSmiles(BRICS.BreakBRICSBonds(mol),isomericSmiles=True)
>
> Out[31]: '[15*][C@H]1CCCOC1.[5*]NC'
>
>
>
> and it does.
>
>
>
> Now let's try putting molecules back together:
>
>
>
> In [37]: mol=Chem.MolFromSmiles('CN[C@H]1CCCOC1')
>
>
>
> In [38]: frags=BRICS.BRICSDecompose(mol,returnMols=True)
>
>
>
> In [39]: [Chem.MolToSmiles(x,isomericSmiles=True) for x in
> BRICS.BRICSBuild(frags)]
>
> Out[39]: ['CN[C@H]1CCCOC1']
>
>
>
> That looks ok, what about the other way of writing the SMILES?
>
>
>
> In [40]: mol=Chem.MolFromSmiles('CN[C@@H]1COCCC1')
>
>
>
> In [41]: frags=BRICS.BRICSDecompose(mol,returnMols=True)
>
>
>
> In [42]: [Chem.MolToSmiles(x,isomericSmiles=True) for x in
> BRICS.BRICSBuild(frags)]
>
> Out[42]: ['CN[C@H]1CCCOC1']
>
>
>
> Those also look ok; the bug that was in the older RDKit version has been
> fixed. I'd really suggest either updating to a newer version of the RDKit
> yourself or talking to your IT group and asking them to do the update. We
> can provide help on that here on the mailing list, or if you'd rather do it
> less publicly, commercial support is available for the RDKit, please
> contact me at greg.land...@t5informatics.com to talk about that.
>
>
>
> Best,
>
> -greg
>
>
>
>
>
>
>
>
>
>
>
>
>
> On Fri, May 12, 2017 at 10:37 AM, Stephen Pickett <
> stephen.d.pick...@gsk.com> wrote:
>
> Hi
>
>
>
> I have come across a difference in behaviour with the BRICS algorithms
> depending on how the molecule is fragmented when using non-canonical smiles
> input.
>
> RDKIT 2015_03, Python 2.7.10
>
>
>
> BRICSDecompose gives back the starting chirality
>
>
>
> >>> smi='C1CCOC[C@H]1NC'
>
> >>> mol=Chem.MolFromSmiles(smi)
>
> >>> cansmi=Chem.MolToSmiles(mol,1)
>
> >>> cansmi
>
> 'CN[C@H]1CCCOC1'
>
> >>> frags=BRICS.BRICSDecompose(mol,returnMols=True)
>
> >>> bm=list(BRICS.BRICSBuild(frags))
>
> >>> [Chem.MolToSmiles(m,1) for m in bm]
>
> ['CN[C@H]1CCCOC1']
>
>
>
> BreakBRICSBonds inverts the centre.
>
> >>> frags=Chem.GetMolFrags(BRICS.BreakBRICSBonds(mol),asMols=True)
>
> >>> bm=list(BRICS.BRICSBuild(frags))
>
> >>> [Chem.MolToSmiles(m,1) for m in bm]
>
> ['CN[C@@H]1CCCOC1']
>
>
>
> Starting from the canonical smiles works fine
>
> >>> smi='CN[C@H]1CCCOC1'
>
> >>> mol=Chem.MolFromSmiles(smi)
>
> >>> frags=Chem.GetMolFrags(BRICS.BreakBRICSBonds(mol),asMols=True)
>
> >>> bm=list(BRICS.BRICSBuild(frags))
>
> >>> [Chem.MolToSmiles(m,1) for m in bm]
>
> ['CN[C@H]1CCCOC1']
>
>
>
> The inversion happens in BreakBRICSBonds
>
> >>> smi='C1CCOC[C@H]1NC'
>
> >>> mol=Chem.MolFromSmiles(smi)
>
> >>> Chem.MolToSmiles(BRICS.BreakBRICSBonds(mol),1)
>
> '[15*][C@@H]1CCCOC1.[5*]NC'
>
>
>
> Using the pre canonicalised SMILES is clearly the way to go, but thought
> that this might be indicative of an issue somewhere.
>
>
>
> Regards
>
>
>
> Stephen
>
>
> ------------------------------
>
>
> This e-mail was sent by GlaxoSmithKline Services Unlimited
> (registered in England and Wales No. 1047315), which is a
> member of the GlaxoSmithKline group of companies. The
> registered address of GlaxoSmithKline Services Unlimited
> is 980 Great West Road, Brentford, Middlesex TW8 9GS.
>
> *GSK monitors email communications sent to and from GSK in order to
> protect GSK, our employees, customers, suppliers and business partners,
> from cyber threats and loss of GSK Information. GSK monitoring is conducted
> with appropriate confidentiality controls and in accordance with local laws
> and after appropriate consultation.*
>
>
> ------------------------------------------------------------
> ------------------
> Check out the vibrant tech community on one of the world's most
> engaging tech sites, Slashdot.org! http://sdm.link/slashdot
> _______________________________________________
> Rdkit-discuss mailing list
> Rdkit-discuss@lists.sourceforge.net
> https://lists.sourceforge.net/lists/listinfo/rdkit-discuss
>
>
>