Re: [Rdkit-discuss] Differences in chirality with BRICS fragmentation
On Wed, Feb 7, 2018 at 4:36 PM, Stephen Pickett wrote:
> Thanks for taking a look.

If you want to keep an eye on what's going on, here's the bug: https://github.com/rdkit/rdkit/issues/1734

> FYI, I hope to include a section about how we are using this algorithm at
> the UK QSAR meeting in Cardiff in April.

It should all work as long as you stick to the reactions... It would be great if you could share the slides when you've got that presentation put together!

-greg
Re: [Rdkit-discuss] Differences in chirality with BRICS fragmentation
Hi Greg

Thanks for taking a look. FYI, I hope to include a section about how we are using this algorithm at the UK QSAR meeting in Cardiff in April.

Stephen

From: Greg Landrum [mailto:greg.land...@gmail.com]
Sent: 07 February 2018 15:27
To: Stephen Pickett <stephen.d.pick...@gsk.com>
Cc: rdkit-discuss@lists.sourceforge.net
Subject: Re: [Rdkit-discuss] Differences in chirality with BRICS fragmentation

EXTERNAL

It's no fair reviving old items on difficult topics like stereochemistry! ;-)

This is due to a bug in BRICS.BreakBRICSBonds(): stereochemistry isn't handled correctly. I have to admit that I'm surprised by this: I expected that this code would behave properly, but it clearly doesn't. That's a bug for me to look into.

Your other approach, using BRICS.BRICSDecompose(), uses a different approach, the ChemicalReaction machinery, to fragment the molecules. This does a better job of handling stereochemistry.

Thanks for pointing this out and sorry for the quite-delayed reply.

-greg

p.s. In my reply when this thread originally came up I said that BRICSDecompose() uses BreakBRICSBonds(); this is incorrect... I wrote that email too quickly.

On Wed, Jan 10, 2018 at 3:15 PM, Stephen Pickett <stephen.d.pick...@gsk.com> wrote:

Hi

Coming back to this thread as I have found a similar issue with rdkit 17-03/09.

BRICS.BreakBRICSBonds is inverting stereochemistry for some inputs.
>>> smi='CNc1c1[C@H](C)NC'
>>> mol=Chem.MolFromSmiles(smi)
# we are using rdkit canonicalized smiles
>>> Chem.MolToSmiles(mol,1)
'CNc1c1[C@H](C)NC'
>>> frags=BRICS.BRICSDecompose(mol,returnMols=True)
>>> bm=list(BRICS.BRICSBuild(frags))
# input is the first molecule in the list
>>> [Chem.MolToSmiles(m,1) for m in bm]
['CNc1c1[C@H](C)NC', 'CN[C@@H](C)c1c1[C@H](C)NC', 'CNc1c1-c1c1[C@H](C)NC', 'CNc1c1NC', 'CNc1c1-c1c1NC', 'CNc1c1-c1c1-c1c1NC']
>>> frags=Chem.GetMolFrags(BRICS.BreakBRICSBonds(mol),asMols=True)
>>> bm=list(BRICS.BRICSBuild(frags))
# input is the second in the list with inverted stereochem
>>> [Chem.MolToSmiles(m,1) for m in bm]
['CNc1c1NC', 'CNc1c1[C@@H](C)NC', 'CNc1c1-c1c1NC', 'CNc1c1-c1c1-c1c1NC', 'CNc1c1-c1c1[C@@H](C)NC', 'CN[C@H](C)c1c1[C@@H](C)NC']

Interestingly, if I make a small change to the molecule

'COc1c1[C@H](C)NC'

Using the smiles as written gives the same issue.

>>> mol=Chem.MolFromSmiles('COc1c1[C@H](C)NC')
>>> frags=Chem.GetMolFrags(BRICS.BreakBRICSBonds(mol),asMols=True)
>>> bm=list(BRICS.BRICSBuild(frags))
>>> [Chem.MolToSmiles(m,1) for m in bm]
[…, 'CN[C@H](C)c1c1OC', …]

However, this is not the RDKit canonical atom ordering for this molecule.
If I use the RDKit canonical smiles to build the molecule ('CN[C@@H](C)c1c1OC'), BreakBRICSBonds works fine and I can regenerate the initial molecule with BRICSBuild.

>>> mol=Chem.MolFromSmiles('CN[C@@H](C)c1c1OC')
>>> frags=Chem.GetMolFrags(BRICS.BreakBRICSBonds(mol),asMols=True)
>>> bm=list(BRICS.BRICSBuild(frags))
>>> [Chem.MolToSmiles(m,1) for m in bm]
[…., 'CN[C@@H](C)c1c1OC', ….]
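The by-eye check in the session above (is the input SMILES in the rebuilt list, or only its mirror image?) can be written down as a small helper. This is a string-level sketch, not an RDKit call: `mirror_smiles` and `classify_round_trip` are hypothetical names, and swapping `@` and `@@` in a canonical SMILES is assumed to give the enantiomer's canonical SMILES. That holds for the single-stereocentre outputs in the original May 2017 report, which are used as data here, but it is not guaranteed in general.

```python
import re


def mirror_smiles(smiles):
    """Swap every '@' and '@@' tetrahedral tag in a SMILES string.

    Assumption: only the plain '@'/'@@' tags are present (no '@TH1' etc.).
    """
    return re.sub(r"@@?", lambda m: "@" if m.group(0) == "@@" else "@@", smiles)


def classify_round_trip(input_smiles, rebuilt_smiles):
    """Say whether the input came back unchanged, mirrored, or not at all."""
    rebuilt = set(rebuilt_smiles)
    if input_smiles in rebuilt:
        return "retained"
    if mirror_smiles(input_smiles) in rebuilt:
        return "inverted"
    return "missing"


# Outputs copied from the May 2017 report (RDKit 2015_03), as plain strings.
canonical_input = "CN[C@H]1CCCOC1"
via_decompose = ["CN[C@H]1CCCOC1"]     # BRICSDecompose + BRICSBuild
via_break_bonds = ["CN[C@@H]1CCCOC1"]  # BreakBRICSBonds + BRICSBuild

print(classify_round_trip(canonical_input, via_decompose))
print(classify_round_trip(canonical_input, via_break_bonds))
```

With real molecules the two lists would come from `Chem.MolToSmiles` on the `BRICSBuild` products; the helper only does the comparison step.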
Regards

Stephen

From: Stephen Pickett
Sent: 16 May 2017 09:01
To: Greg Landrum <greg.land...@gmail.com>
Cc: rdkit-discuss@lists.sourceforge.net
Subject: RE: [Rdkit-discuss] Differences in chirality with BRICS fragmentation

Thanks Greg

I’m hoping we can get to 17-03

Stephen

From: Greg Landrum [mailto:greg.land...@gmail.com]
Sent: 16 May 2017 06:22
To: Stephen Pickett
Cc: rdkit-discuss@lists.sourceforge.net
Subject: Re: [Rdkit-discuss] Differences in chirality with BRICS fragmentation

EXTERNAL

Hi Stephen,

You're perfectly correct, what you're seeing there is a bug. However, you're using a two-year-old version of the RDKit and a number of bugs in this area have been fixed in the intervening time. Still, since there's potentially a lot going on here, and I'm always nervous about chirality, I will walk through the steps I took to figure out whether or not things work properly now for this case.

Let's start with making sure that the fragmentation works correctly:

In [18]: mol=Chem.MolFromSmiles('C1CCOC[C@H]1NC')
In [19]: frags=BRICS.BRICSDecompose(mol,returnMols=True)
In [20]: [Chem.MolToSmiles(x,isomericSmiles=True) for x in frags]
Out[20]: ['[5*]NC', '[15*][C@H]1CCCOC1']

In [21]: mol=Chem.MolFromSmiles('C1CCOC[C@@H]1NC')
In [22]: frags=BRICS.BRICSDecompose(mol,returnMols=True)
In [23]: [Chem.MolToSmiles(x,isomericSmiles=True) for x in frags]
Out[23]: ['[5*]NC', '[15*][C@@H]1CCCOC1']

Those both look ok, but we should try another input SMILES for the same molecule to make sure it's still ok:

In [24]: m
Re: [Rdkit-discuss] Differences in chirality with BRICS fragmentation
It's no fair reviving old items on difficult topics like stereochemistry! ;-)

This is due to a bug in BRICS.BreakBRICSBonds(): stereochemistry isn't handled correctly. I have to admit that I'm surprised by this: I expected that this code would behave properly, but it clearly doesn't. That's a bug for me to look into.

Your other approach, using BRICS.BRICSDecompose(), uses a different approach, the ChemicalReaction machinery, to fragment the molecules. This does a better job of handling stereochemistry.

Thanks for pointing this out and sorry for the quite-delayed reply.

-greg

p.s. In my reply when this thread originally came up I said that BRICSDecompose() uses BreakBRICSBonds(); this is incorrect... I wrote that email too quickly.

On Wed, Jan 10, 2018 at 3:15 PM, Stephen Pickett <stephen.d.pick...@gsk.com> wrote:
> Hi
>
> Coming back to this thread as I have found a similar issue with rdkit
> 17-03/09.
>
> BRICS.BreakBRICSBonds is inverting stereochemistry for some inputs.
>
> >>> smi='CNc1c1[C@H](C)NC'
> >>> mol=Chem.MolFromSmiles(smi)
> # we are using rdkit canonicalized smiles
> >>> Chem.MolToSmiles(mol,1)
> 'CNc1c1[C@H](C)NC'
> >>> frags=BRICS.BRICSDecompose(mol,returnMols=True)
> >>> bm=list(BRICS.BRICSBuild(frags))
> # input is the first molecule in the list
> >>> [Chem.MolToSmiles(m,1) for m in bm]
> ['CNc1c1[C@H](C)NC', 'CN[C@@H](C)c1c1[C@H](C)NC',
> 'CNc1c1-c1c1[C@H](C)NC', 'CNc1c1NC', 'CNc1c1-c1c1NC',
> 'CNc1c1-c1c1-c1c1NC']
> >>> frags=Chem.GetMolFrags(BRICS.BreakBRICSBonds(mol),asMols=True)
> >>> bm=list(BRICS.BRICSBuild(frags))
> # input is the second in the list with inverted stereochem
> >>> [Chem.MolToSmiles(m,1) for m in bm]
> ['CNc1c1NC', 'CNc1c1[C@@H](C)NC', 'CNc1c1-c1c1NC',
> 'CNc1c1-c1c1-c1c1NC', 'CNc1c1-c1c1[C@@H](C)NC',
> 'CN[C@H](C)c1c1[C@@H](C)NC']
>
> Interestingly, if I make a small change to the molecule
>
> 'COc1c1[C@H](C)NC'
>
> Using the smiles as written gives the same issue.
> > > > >>> mol=Chem.MolFromSmiles('COc1c1[C@H](C)NC') > > >>> frags=Chem.GetMolFrags(BRICS.BreakBRICSBonds(mol),asMols=True) > > >>> bm=list(BRICS.BRICSBuild(frags)) > > >>> [Chem.MolToSmiles(m,1) for m in bm] > > […, 'CN[C@H](C)c1c1OC', …] > > > > However, this is not the RDKit canonical atom ordering for this molecule. > > If I use the RDKit canonical smiles to build the molecule > ('CN[C@@H](C)c1c1OC'), > BreakBRICSBonds works fine and I can regenerate the initial molecule with > BRICSBuild. > > > > >>> mol=Chem.MolFromSmiles('CN[C@@H](C)c1c1OC') > > >>> frags=Chem.GetMolFrags(BRICS.BreakBRICSBonds(mol),asMols=True) > > >>> bm=list(BRICS.BRICSBuild(frags)) > > >>> [Chem.MolToSmiles(m,1) for m in bm] > > […., 'CN[C@@H](C)c1c1OC', ….] > > > > Regards > > > > Stephen > > > > *From:* Stephen Pickett > *Sent:* 16 May 2017 09:01 > *To:* Greg Landrum <greg.land...@gmail.com> > *Cc:* rdkit-discuss@lists.sourceforge.net > *Subject:* RE: [Rdkit-discuss] Differences in chirality with BRICS > fragmentation > > > > Thanks Greg > > > > I’m hoping we can get to 17-03 > > > > Stephen > > > > *From:* Greg Landrum [mailto:greg.land...@gmail.com > <greg.land...@gmail.com>] > *Sent:* 16 May 2017 06:22 > *To:* Stephen Pickett > *Cc:* rdkit-discuss@lists.sourceforge.net > *Subject:* Re: [Rdkit-discuss] Differences in chirality with BRICS > fragmentation > > > > *EXTERNAL* > > Hi Stephen, > > > > You're perfectly correct, what you're seeing there is a bug. However > you're using a two-year old version of the RDKit and a number of bugs in > this area have been fixed in the intervening time. Still, since there's > potentially a lot going on here, and I'm always nervous about chirality, I > will walk through the steps I took to figure out whether or not things work > properly now for this case. 
> > > > Let's start with making sure that the fragmentation work correctly: > > In [18]: mol=Chem.MolFromSmiles('C1CCOC[C@H]1NC') > > > > In [19]: frags=BRICS.BRICSDecompose(mol,returnMols=True) > > > > In [20]: [Chem.MolToSmiles(x,isomericSmiles=True) for x in frags] > > Out[20]: ['[5*]NC', '[15*][C@H]1CCCOC1'] > > > > In [21]: mol=Chem.MolFromSmiles('C1CCOC[C@@H]1NC') > > > > In [22]: frags=BRICS.BRICSDecompose(mol,returnMo
Re: [Rdkit-discuss] Differences in chirality with BRICS fragmentation
Hi Coming back to this thread as I have found a similar issue with rdkit 17-03/09. BRICS.BreakBRICSBonds is inverting stereochemistry for some inputs. >>> smi='CNc1c1[C@H](C)NC' >>> mol=Chem.MolFromSmiles(smi) # we are using rdkit canonicalized smiles >>> Chem.MolToSmiles(mol,1) 'CNc1c1[C@H](C)NC' >>> frags=BRICS.BRICSDecompose(mol,returnMols=True) >>> bm=list(BRICS.BRICSBuild(frags)) # input is the first molecule in the list >>> [Chem.MolToSmiles(m,1) for m in bm] ['CNc1c1[C@H](C)NC', 'CN[C@@H](C)c1c1[C@H](C)NC', 'CNc1c1-c1c1[C@H](C)NC', 'CNc1c1NC', 'CNc1c1-c1c1NC', 'CNc1c1-c1c1-c1c1NC'] >>> frags=Chem.GetMolFrags(BRICS.BreakBRICSBonds(mol),asMols=True) >>> bm=list(BRICS.BRICSBuild(frags)) # input is the second in the list with inverted stereochem >>> [Chem.MolToSmiles(m,1) for m in bm] ['CNc1c1NC', 'CNc1c1[C@@H](C)NC', 'CNc1c1-c1c1NC', 'CNc1c1-c1c1-c1c1NC', 'CNc1c1-c1c1[C@@H](C)NC', 'CN[C@H](C)c1c1[C@@H](C)NC'] Interestingly, if I make a small change to the molecule 'COc1c1[C@H](C)NC' Using the smiles as written gives the same issue. >>> mol=Chem.MolFromSmiles('COc1c1[C@H](C)NC') >>> frags=Chem.GetMolFrags(BRICS.BreakBRICSBonds(mol),asMols=True) >>> bm=list(BRICS.BRICSBuild(frags)) >>> [Chem.MolToSmiles(m,1) for m in bm] […, 'CN[C@H](C)c1c1OC', …] However, this is not the RDKit canonical atom ordering for this molecule. If I use the RDKit canonical smiles to build the molecule ('CN[C@@H](C)c1c1OC'), BreakBRICSBonds works fine and I can regenerate the initial molecule with BRICSBuild. >>> mol=Chem.MolFromSmiles('CN[C@@H](C)c1c1OC') >>> frags=Chem.GetMolFrags(BRICS.BreakBRICSBonds(mol),asMols=True) >>> bm=list(BRICS.BRICSBuild(frags)) >>> [Chem.MolToSmiles(m,1) for m in bm] […., 'CN[C@@H](C)c1c1OC', ….] 
Regards Stephen From: Stephen Pickett Sent: 16 May 2017 09:01 To: Greg Landrum <greg.land...@gmail.com> Cc: rdkit-discuss@lists.sourceforge.net Subject: RE: [Rdkit-discuss] Differences in chirality with BRICS fragmentation Thanks Greg I’m hoping we can get to 17-03 Stephen From: Greg Landrum [mailto:greg.land...@gmail.com] Sent: 16 May 2017 06:22 To: Stephen Pickett Cc: rdkit-discuss@lists.sourceforge.net<mailto:rdkit-discuss@lists.sourceforge.net> Subject: Re: [Rdkit-discuss] Differences in chirality with BRICS fragmentation EXTERNAL Hi Stephen, You're perfectly correct, what you're seeing there is a bug. However you're using a two-year old version of the RDKit and a number of bugs in this area have been fixed in the intervening time. Still, since there's potentially a lot going on here, and I'm always nervous about chirality, I will walk through the steps I took to figure out whether or not things work properly now for this case. Let's start with making sure that the fragmentation work correctly: In [18]: mol=Chem.MolFromSmiles('C1CCOC[C@H]1NC') In [19]: frags=BRICS.BRICSDecompose(mol,returnMols=True) In [20]: [Chem.MolToSmiles(x,isomericSmiles=True) for x in frags] Out[20]: ['[5*]NC', '[15*][C@H]1CCCOC1'] In [21]: mol=Chem.MolFromSmiles('C1CCOC[C@@H]1NC') In [22]: frags=BRICS.BRICSDecompose(mol,returnMols=True) In [23]: [Chem.MolToSmiles(x,isomericSmiles=True) for x in frags] Out[23]: ['[5*]NC', '[15*][C@@H]1CCCOC1'] Those both look ok, but we should try another input SMILES for the same molecule to make sure it's still ok: In [24]: mol=Chem.MolFromSmiles('CN[C@H]1CCCOC1') In [25]: frags=BRICS.BRICSDecompose(mol,returnMols=True) In [26]: [Chem.MolToSmiles(x,isomericSmiles=True) for x in frags] Out[26]: ['[5*]NC', '[15*][C@H]1CCCOC1'] Just to be really sure, let's reorder the bonds at the chiral center again, making sure to keep the same stereochemistry: In [27]: mol=Chem.MolFromSmiles('CN[C@@H]1COCCC1') In [28]: frags=BRICS.BRICSDecompose(mol,returnMols=True) In 
[29]: [Chem.MolToSmiles(x,isomericSmiles=True) for x in frags] Out[29]: ['[5*]NC', '[15*][C@H]1CCCOC1'] That also looks good, so we can have some reasonable confidence that BRICSDecompose() is doing the right thing. BreakBRICSBonds() is used by BRICSDecompose(), so we'd expect that to work too: In [31]: Chem.MolToSmiles(BRICS.BreakBRICSBonds(mol),isomericSmiles=True) Out[31]: '[15*][C@H]1CCCOC1.[5*]NC' and it does. Now let's try putting molecules back together: In [37]: mol=Chem.MolFromSmiles('CN[C@H]1CCCOC1') In [38]: frags=BRICS.BRICSDecompose(mol,returnMols=True) In [39]: [Chem.MolToSmiles(x,isomericSmiles=True) for x in BRICS.BRICSBuild(frags)] Out[39]: ['CN[C@H]1CCCOC1'] That looks ok, what about the other way of writing the SMILES? In [40]: mol=Chem.MolFromSmiles('CN[C@@H]1COCCC1') In [41]: frags=BRICS.BRICSDecompose(mol,returnMols=True) In [42]: [Chem.MolToSmiles(x,isomericSmiles=True) for x in BRIC
Re: [Rdkit-discuss] Differences in chirality with BRICS fragmentation
Thanks Greg I’m hoping we can get to 17-03 Stephen From: Greg Landrum [mailto:greg.land...@gmail.com] Sent: 16 May 2017 06:22 To: Stephen Pickett Cc: rdkit-discuss@lists.sourceforge.net Subject: Re: [Rdkit-discuss] Differences in chirality with BRICS fragmentation EXTERNAL Hi Stephen, You're perfectly correct, what you're seeing there is a bug. However you're using a two-year old version of the RDKit and a number of bugs in this area have been fixed in the intervening time. Still, since there's potentially a lot going on here, and I'm always nervous about chirality, I will walk through the steps I took to figure out whether or not things work properly now for this case. Let's start with making sure that the fragmentation work correctly: In [18]: mol=Chem.MolFromSmiles('C1CCOC[C@H]1NC') In [19]: frags=BRICS.BRICSDecompose(mol,returnMols=True) In [20]: [Chem.MolToSmiles(x,isomericSmiles=True) for x in frags] Out[20]: ['[5*]NC', '[15*][C@H]1CCCOC1'] In [21]: mol=Chem.MolFromSmiles('C1CCOC[C@@H]1NC') In [22]: frags=BRICS.BRICSDecompose(mol,returnMols=True) In [23]: [Chem.MolToSmiles(x,isomericSmiles=True) for x in frags] Out[23]: ['[5*]NC', '[15*][C@@H]1CCCOC1'] Those both look ok, but we should try another input SMILES for the same molecule to make sure it's still ok: In [24]: mol=Chem.MolFromSmiles('CN[C@H]1CCCOC1') In [25]: frags=BRICS.BRICSDecompose(mol,returnMols=True) In [26]: [Chem.MolToSmiles(x,isomericSmiles=True) for x in frags] Out[26]: ['[5*]NC', '[15*][C@H]1CCCOC1'] Just to be really sure, let's reorder the bonds at the chiral center again, making sure to keep the same stereochemistry: In [27]: mol=Chem.MolFromSmiles('CN[C@@H]1COCCC1') In [28]: frags=BRICS.BRICSDecompose(mol,returnMols=True) In [29]: [Chem.MolToSmiles(x,isomericSmiles=True) for x in frags] Out[29]: ['[5*]NC', '[15*][C@H]1CCCOC1'] That also looks good, so we can have some reasonable confidence that BRICSDecompose() is doing the right thing. 
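The four fragment lists above can also be compared mechanically rather than by eye. The sketch below uses only the outputs quoted in this message, as plain strings; `group_by_fragments` and `dummy_labels` are hypothetical helpers, and reading the number in `[5*]` or `[15*]` as the BRICS environment label carried by the dummy atom is my understanding of the BRICS module rather than something stated in the thread.

```python
import re
from collections import defaultdict

# Fragment SMILES as printed in Out[20], Out[23], Out[26] and Out[29].
observed = {
    "C1CCOC[C@H]1NC": ["[5*]NC", "[15*][C@H]1CCCOC1"],
    "C1CCOC[C@@H]1NC": ["[5*]NC", "[15*][C@@H]1CCCOC1"],
    "CN[C@H]1CCCOC1": ["[5*]NC", "[15*][C@H]1CCCOC1"],
    "CN[C@@H]1COCCC1": ["[5*]NC", "[15*][C@H]1CCCOC1"],
}


def group_by_fragments(results):
    """Group input SMILES that produced exactly the same set of fragments."""
    groups = defaultdict(list)
    for smiles, frags in results.items():
        groups[frozenset(frags)].append(smiles)
    return list(groups.values())


def dummy_labels(fragment_smiles):
    """Extract the numeric labels of dummy atoms written as [N*]."""
    return [int(n) for n in re.findall(r"\[(\d+)\*\]", fragment_smiles)]


groups = group_by_fragments(observed)
for members in groups:
    print(members)
print(dummy_labels("[15*][C@H]1CCCOC1"))
```

The three spellings of the same enantiomer end up in one group and the `[C@@H]` variant of the first input, which is the other enantiomer, sits alone, which is the outcome described in the text.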
BreakBRICSBonds() is used by BRICSDecompose(), so we'd expect that to work too: In [31]: Chem.MolToSmiles(BRICS.BreakBRICSBonds(mol),isomericSmiles=True) Out[31]: '[15*][C@H]1CCCOC1.[5*]NC' and it does. Now let's try putting molecules back together: In [37]: mol=Chem.MolFromSmiles('CN[C@H]1CCCOC1') In [38]: frags=BRICS.BRICSDecompose(mol,returnMols=True) In [39]: [Chem.MolToSmiles(x,isomericSmiles=True) for x in BRICS.BRICSBuild(frags)] Out[39]: ['CN[C@H]1CCCOC1'] That looks ok, what about the other way of writing the SMILES? In [40]: mol=Chem.MolFromSmiles('CN[C@@H]1COCCC1') In [41]: frags=BRICS.BRICSDecompose(mol,returnMols=True) In [42]: [Chem.MolToSmiles(x,isomericSmiles=True) for x in BRICS.BRICSBuild(frags)] Out[42]: ['CN[C@H]1CCCOC1'] Those also look ok; the bug that was in the older RDKit version has been fixed. I'd really suggest either updating to a newer version of the RDKit yourself or talking to your IT group and asking them to do the update. We can provide help on that here on the mailing list, or if you'd rather do it less publicly, commercial support is available for the RDKit, please contact me at greg.land...@t5informatics.com<mailto:greg.land...@t5informatics.com> to talk about that. Best, -greg On Fri, May 12, 2017 at 10:37 AM, Stephen Pickett <stephen.d.pick...@gsk.com<mailto:stephen.d.pick...@gsk.com>> wrote: Hi I have come across a difference in behaviour with the BRICS algorithms depending on how the molecule is fragmented when using non-canonical smiles input. RDKIT 2015_03, Python 2.7.10 BRICSDecompose gives back the starting chirality >>> smi='C1CCOC[C@H]1NC' >>> mol=Chem.MolFromSmiles(smi) >>> cansmi=Chem.MolToSmiles(mol,1) >>> cansmi 'CN[C@H]1CCCOC1' >>> frags=BRICS.BRICSDecompose(mol,returnMols=True) >>> bm=list(BRICS.BRICSBuild(frags)) >>> [Chem.MolToSmiles(m,1) for m in bm] ['CN[C@H]1CCCOC1'] BreakBRICSBonds inverts the centre. 
>>> frags=Chem.GetMolFrags(BRICS.BreakBRICSBonds(mol),asMols=True) >>> bm=list(BRICS.BRICSBuild(frags)) >>> [Chem.MolToSmiles(m,1) for m in bm] ['CN[C@@H]1CCCOC1'] Starting from the canonical smiles works fine >>> smi='CN[C@H]1CCCOC1' >>> mol=Chem.MolFromSmiles(smi) >>> frags=Chem.GetMolFrags(BRICS.BreakBRICSBonds(mol),asMols=True) >>> bm=list(BRICS.BRICSBuild(frags)) >>> [Chem.MolToSmiles(m,1) for m in bm] ['CN[C@H]1CCCOC1'] The inversion happens in BreakBRICSBonds >>> smi='C1CCOC[C@H]1NC' >>> mol=Chem.MolFromSmiles(smi) >>> Chem.MolToSmiles(BRICS.BreakBRICSBonds(mol),1) '[15*][C@@H]1CCCOC1.[5*]NC' Using the pre canonicalised SMILES is clearly the way to go, but thought that this might be indicative of an issue somewhere. Regards Stephen This e-mail was sent by GlaxoSmithKline Services Unlimited (registered in England and Wales No. 1047315), which is a memb
Re: [Rdkit-discuss] Differences in chirality with BRICS fragmentation
Hi Stephen,

You're perfectly correct, what you're seeing there is a bug. However, you're using a two-year-old version of the RDKit and a number of bugs in this area have been fixed in the intervening time. Still, since there's potentially a lot going on here, and I'm always nervous about chirality, I will walk through the steps I took to figure out whether or not things work properly now for this case.

Let's start with making sure that the fragmentation works correctly:

In [18]: mol=Chem.MolFromSmiles('C1CCOC[C@H]1NC')
In [19]: frags=BRICS.BRICSDecompose(mol,returnMols=True)
In [20]: [Chem.MolToSmiles(x,isomericSmiles=True) for x in frags]
Out[20]: ['[5*]NC', '[15*][C@H]1CCCOC1']

In [21]: mol=Chem.MolFromSmiles('C1CCOC[C@@H]1NC')
In [22]: frags=BRICS.BRICSDecompose(mol,returnMols=True)
In [23]: [Chem.MolToSmiles(x,isomericSmiles=True) for x in frags]
Out[23]: ['[5*]NC', '[15*][C@@H]1CCCOC1']

Those both look ok, but we should try another input SMILES for the same molecule to make sure it's still ok:

In [24]: mol=Chem.MolFromSmiles('CN[C@H]1CCCOC1')
In [25]: frags=BRICS.BRICSDecompose(mol,returnMols=True)
In [26]: [Chem.MolToSmiles(x,isomericSmiles=True) for x in frags]
Out[26]: ['[5*]NC', '[15*][C@H]1CCCOC1']

Just to be really sure, let's reorder the bonds at the chiral center again, making sure to keep the same stereochemistry:

In [27]: mol=Chem.MolFromSmiles('CN[C@@H]1COCCC1')
In [28]: frags=BRICS.BRICSDecompose(mol,returnMols=True)
In [29]: [Chem.MolToSmiles(x,isomericSmiles=True) for x in frags]
Out[29]: ['[5*]NC', '[15*][C@H]1CCCOC1']

That also looks good, so we can have some reasonable confidence that BRICSDecompose() is doing the right thing.

BreakBRICSBonds() is used by BRICSDecompose(), so we'd expect that to work too:

In [31]: Chem.MolToSmiles(BRICS.BreakBRICSBonds(mol),isomericSmiles=True)
Out[31]: '[15*][C@H]1CCCOC1.[5*]NC'

and it does.
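The step where the bonds at the chiral center are reordered while the stereochemistry is kept rests on a SMILES rule: the `@`/`@@` tag is defined relative to the order in which the neighbours are written, so an odd permutation of that order flips the tag and an even one leaves it alone. A minimal pure-Python sketch of that bookkeeping follows. No RDKit is involved; the helper names and the neighbour labels `C_O` and `C_C` are made up for illustration, and the neighbour orders were read off the three SMILES by hand.

```python
def permutation_is_odd(reference, reordered):
    """Return True if `reordered` is an odd permutation of `reference`."""
    if len(set(reference)) != len(reference) or sorted(reference) != sorted(reordered):
        raise ValueError("both sequences must hold the same distinct labels")
    index = {label: i for i, label in enumerate(reference)}
    perm = [index[label] for label in reordered]
    # Parity of a permutation equals the parity of its inversion count.
    inversions = sum(
        1
        for i in range(len(perm))
        for j in range(i + 1, len(perm))
        if perm[i] > perm[j]
    )
    return inversions % 2 == 1


def equivalent_tag(tag, reference, reordered):
    """Tag ('@' or '@@') describing the same geometry after reordering."""
    if tag not in ("@", "@@"):
        raise ValueError("tag must be '@' or '@@'")
    if permutation_is_odd(reference, reordered):
        return "@@" if tag == "@" else "@"
    return tag


# Neighbour order around the stereocentre as written in each SMILES.
# C_O = ring CH2 next to the oxygen, C_C = ring CH2 next to another CH2.
# The implicit H is placed right after the atom that precedes the centre.
order_a = ["C_O", "H", "C_C", "N"]  # C1CCOC[C@H]1NC
order_b = ["N", "H", "C_O", "C_C"]  # CN[C@H]1CCCOC1
order_c = ["N", "H", "C_C", "C_O"]  # CN[C@@H]1COCCC1

tag_b = equivalent_tag("@", order_a, order_b)
tag_c = equivalent_tag("@", order_a, order_c)
print(tag_b, tag_c)
```

The tags this produces for the second and third orderings agree with the ones written in the SMILES used for In [24] and In [27], which is why all three inputs are expected to give the same `[15*][C@H]1CCCOC1` fragment.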
Now let's try putting molecules back together:

In [37]: mol=Chem.MolFromSmiles('CN[C@H]1CCCOC1')
In [38]: frags=BRICS.BRICSDecompose(mol,returnMols=True)
In [39]: [Chem.MolToSmiles(x,isomericSmiles=True) for x in BRICS.BRICSBuild(frags)]
Out[39]: ['CN[C@H]1CCCOC1']

That looks ok, what about the other way of writing the SMILES?

In [40]: mol=Chem.MolFromSmiles('CN[C@@H]1COCCC1')
In [41]: frags=BRICS.BRICSDecompose(mol,returnMols=True)
In [42]: [Chem.MolToSmiles(x,isomericSmiles=True) for x in BRICS.BRICSBuild(frags)]
Out[42]: ['CN[C@H]1CCCOC1']

Those also look ok; the bug that was in the older RDKit version has been fixed.

I'd really suggest either updating to a newer version of the RDKit yourself or talking to your IT group and asking them to do the update. We can provide help on that here on the mailing list, or if you'd rather do it less publicly, commercial support is available for the RDKit; please contact me at greg.land...@t5informatics.com to talk about that.

Best,
-greg

On Fri, May 12, 2017 at 10:37 AM, Stephen Pickett wrote:
> Hi
>
> I have come across a difference in behaviour with the BRICS algorithms
> depending on how the molecule is fragmented when using non-canonical smiles
> input.
>
> RDKIT 2015_03, Python 2.7.10
>
> BRICSDecompose gives back the starting chirality
>
> >>> smi='C1CCOC[C@H]1NC'
> >>> mol=Chem.MolFromSmiles(smi)
> >>> cansmi=Chem.MolToSmiles(mol,1)
> >>> cansmi
> 'CN[C@H]1CCCOC1'
> >>> frags=BRICS.BRICSDecompose(mol,returnMols=True)
> >>> bm=list(BRICS.BRICSBuild(frags))
> >>> [Chem.MolToSmiles(m,1) for m in bm]
> ['CN[C@H]1CCCOC1']
>
> BreakBRICSBonds inverts the centre.
>
> >>> frags=Chem.GetMolFrags(BRICS.BreakBRICSBonds(mol),asMols=True)
> >>> bm=list(BRICS.BRICSBuild(frags))
> >>> [Chem.MolToSmiles(m,1) for m in bm]
> ['CN[C@@H]1CCCOC1']
>
> Starting from the canonical smiles works fine
>
> >>> smi='CN[C@H]1CCCOC1'
> >>> mol=Chem.MolFromSmiles(smi)
> >>> frags=Chem.GetMolFrags(BRICS.BreakBRICSBonds(mol),asMols=True)
> >>> bm=list(BRICS.BRICSBuild(frags))
> >>> [Chem.MolToSmiles(m,1) for m in bm]
> ['CN[C@H]1CCCOC1']
>
> The inversion happens in BreakBRICSBonds
>
> >>> smi='C1CCOC[C@H]1NC'
> >>> mol=Chem.MolFromSmiles(smi)
> >>> Chem.MolToSmiles(BRICS.BreakBRICSBonds(mol),1)
> '[15*][C@@H]1CCCOC1.[5*]NC'
>
> Using the pre canonicalised SMILES is clearly the way to go, but thought
> that this might be indicative of an issue somewhere.
>
> Regards
>
> Stephen