Re: [Rdkit-discuss] Rdkit-discuss Digest, Vol 124, Issue 10

2018-02-07 Thread JW Feng via Rdkit-discuss
How about setting up a donation fund on rdkit.org to pay for summer
students to document code?  For companies that benefited from using RDKit,
it is a worthy cause to pay it forward.

___
JW Feng, Ph.D.
Denali Therapeutics Inc.
151 Oyster Point Blvd, 2nd Floor, South San Francisco, CA 94080 | (650)
270-0628

On Wed, Feb 7, 2018 at 12:24 PM, <
rdkit-discuss-requ...@lists.sourceforge.net> wrote:

> Send Rdkit-discuss mailing list submissions to
> rdkit-discuss@lists.sourceforge.net
>
> To subscribe or unsubscribe via the World Wide Web, visit
> https://lists.sourceforge.net/lists/listinfo/rdkit-discuss
> or, via email, send a message with subject or body 'help' to
> rdkit-discuss-requ...@lists.sourceforge.net
>
> You can reach the person managing the list at
> rdkit-discuss-ow...@lists.sourceforge.net
>
> When replying, please edit your Subject line so it is more specific
> than "Re: Contents of Rdkit-discuss digest..."
>
>
> Today's Topics:
>
>1. Re: RDKit and Google Summer of Code 2018 (Greg Landrum)
>
>
> --
>
> Message: 1
> Date: Wed, 7 Feb 2018 21:23:46 +0100
> From: Greg Landrum 
> To: Cameron Pye 
> Cc: RDKit Discuss 
> Subject: Re: [Rdkit-discuss] RDKit and Google Summer of Code 2018
> Message-ID:
>  o...@mail.gmail.com>
> Content-Type: text/plain; charset="utf-8"
>
> A quick one on this as part of me digging out from the pile of email I
> should have replied to.
>
> Cameron's suggestion is a really good one, but unfortunately GSoC is really
> about coding projects, so it doesn't work here.
>
> But we should still talk about ways to improve the docs.
>
> I agree that this is a really important task but it's also a bit
> overwhelming and difficult to know where to start. This is too bad since
> it's something you don't need to be a coder to approach; more or less any
> RDKit user could contribute. Believe it or not, just having people point
> out pieces of code that could be (better) documented is already useful -
> I'm sure I'm not the only developer who has forgotten which bits of code
> they've left un(der)documented. I often have 10-15 minute slots of time
> that I could use for writing docs, but it really helps to know which pieces
> should be done first.
>
> I would love to hear suggestions for ways that we can make it easier for
> people to submit improved documentation or pointers to pieces of code that
> could use better documentation and then to let people know that these
> options exist. It needs to be something other than "send email to the list"
> though.
>
> It's currently pretty easy to submit bug reports/feature requests using the
> GitHub interface. These could either provide suggested docs/doc changes or
> point to functions/methods/classes that could be better documented. The
> GitHub guys just added the ability to specify different types of issue
> templates; I could look into setting one up for documentation requests.
>
> -greg
>
>
>
> On Wed, Jan 24, 2018 at 7:38 PM, Cameron Pye 
> wrote:
>
> > I know this isn't a particularly sexy job for a budding cheminformatician
> > but...
> >
> > Work on the Python documentation!!!
> >
> > I love RDKit and occasionally think I'm pretty savvy, but I can't tell you
> > how often I'm scrolling through the documentation (or source) and either:
> >
> > a) discover something that exists but doesn't have any documentation
> > beyond the function signature
> > or
> > b) discover some functionality that exists (and I've wanted) but
> > didn't know it was there!
> >
> > I think this mailing list and Greg do a superb job of keeping the
> > community informed and creating and maintaining the codebase, but I think
> > having some more "Pythonic" API documentation would be great.
> >
> > One shining example is the scikit-learn documentation, which has a quick
> > start, tutorials, etc., and then, in the well-categorized and explanatory
> > API ref, has links to examples in the User Guide (akin to the "Getting
> > Started with the RDKit in Python" doc).
> >
> > Just my 2 cents!
> >
> > Thanks for all the hard work as always,
> > Cam
> >
> >

Re: [Rdkit-discuss] RDKit and Google Summer of Code 2018

2018-02-07 Thread Greg Landrum
[continuing to dig out from the email pile]

On Mon, Jan 15, 2018 at 4:20 PM, Jason Biggs  wrote:

>
>- I've had this on my to-do list for a few months now, implementing
>the algorithm described in this paper.  I think the force-field energy
>minimization routines already present in the RDKit can be utilized for this
>pretty easily.  The only part that I don't think is set up already would be
>applying a constant force to all atoms to force them into the xy plane.
>
> Frączek, T., "Simulation-Based Algorithm for Two-Dimensional Chemical
> Structure Diagram Generation of Complex Molecules and Ligand–Protein
> Interactions." J. Chem. Inf. Model. 2016, 56, 2320-2335, DOI:
> 10.1021/acs.jcim.6b00391.
>
>
Forcing things into the xy plane is pretty simple: you just need to add a
force that tries to drive the z coordinate to zero. The embedding code
already does something analogous when you embed chiral molecules: those are
initially embedded in 4D and then refined by adding a forcefield term that
drives the fourth coordinate to zero.
Here's the bit of code that adds that term:
https://github.com/rdkit/rdkit/blob/master/Code/DistGeom/DistGeomUtils.cpp#L230
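
To make the idea concrete, here is a minimal sketch of such a flattening term
in isolation. It assumes a simple harmonic penalty E = k * sum(z_i^2); the
function names, the force constant k, and the step size are illustrative
assumptions, not the RDKit API and not the DistGeom code linked above. In a
real implementation this term would be added to the existing force-field
terms rather than minimized on its own.

```python
def flattening_energy(coords, k=1.0):
    """Harmonic penalty that is zero only when every atom lies in the xy plane."""
    return k * sum(z * z for _, _, z in coords)


def flattening_step(coords, k=1.0, step=0.1):
    """Take one steepest-descent step on the penalty; x and y are left untouched."""
    updated = []
    for x, y, z in coords:
        grad_z = 2.0 * k * z  # dE/dz for E = k * z**2
        updated.append((x, y, z - step * grad_z))
    return updated


# Hypothetical 3D coordinates for three atoms (arbitrary units).
coords = [(0.0, 0.0, 1.0), (1.2, 0.3, -0.5), (2.1, -0.4, 0.25)]
start_energy = flattening_energy(coords)
for _ in range(50):
    coords = flattening_step(coords)
```

With these settings each step shrinks every z coordinate by a constant factor,
so the atoms settle towards the plane while their x and y positions stay put.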


-greg


>- Another idea would be to add in point-group symmetry detection.  I'm
>using the Symmetrizer Java library (described here:
>https://www.ncbi.nlm.nih.gov/pubmed/22549414) and am pretty happy with
>it overall.  One could re-implement it in C++, or include the jar in the
>External folder and write Python wrappers.
>
>
--
Check out the vibrant tech community on one of the world's most
engaging tech sites, Slashdot.org! http://sdm.link/slashdot___
Rdkit-discuss mailing list
Rdkit-discuss@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/rdkit-discuss


Re: [Rdkit-discuss] RDKit and Google Summer of Code 2018

2018-02-07 Thread Marco Stenta
 +1 to the MolVS project as well.

cheers
marco

2018-01-16 18:19 GMT+01:00 JP :

> Joining the fray, +1 for MolVS
>
> On 16 January 2018 at 16:00, Brian Cole  wrote:
>
>> +1 to the MolVS project as well.
>>
>> Perhaps an easy bite-size project is to incorporate the open source mae
>> parser code into core RDKit: https://github.com/schrodinger/maeparser
>>
>>
>> On Mon, Jan 15, 2018 at 9:08 PM, Francois BERENGER <
>> beren...@bioreg.kyushu-u.ac.jp> wrote:
>>
>>> On 01/16/2018 05:51 AM, Tim Dudgeon wrote:
>>> > Incorporating and "industrialising" Matt's MolVS tautomer and
>>> > standardizer code?
>>> > http://molvs.readthedocs.io/en/latest/index.html
>>>
>>> If we can vote, I would vote for this one.
>>>
>>> > On 15/01/18 07:09, Greg Landrum wrote:
>>> >> Dear all,
>>> >>
>>> >> We've been invited again to participate in the OpenChemistry
>>> >> application for Google Summer of Code.
>>> >>
>>> >> In order to participate we need ideas for projects and mentors to go
>>> >> along with them.
>>> >>
>>> >> The current list of RDKit ideas is being maintained here:
>>> >> http://wiki.openchemistry.org/GSoC_Ideas_2018#RDKit_Project_Ideas
>>> >>
>>> >> (Note: at the point that I'm pressing "send", that's still a copy of
>>> >> last year's project ideas).
>>> >>
>>> >> If you're willing to be a mentor (please ask me about the ~5
>>> >> hours/week required here) or have ideas, please reply to this thread.
>>> >>
>>> >> Best,
>>> >> -greg


Re: [Rdkit-discuss] RDKit and Google Summer of Code 2018

2018-02-07 Thread Greg Landrum
A quick one on this as part of me digging out from the pile of email I
should have replied to.

Cameron's suggestion is a really good one, but unfortunately GSoC is really
about coding projects, so it doesn't work here.

But we should still talk about ways to improve the docs.

I agree that this is a really important task but it's also a bit
overwhelming and difficult to know where to start. This is too bad since
it's something you don't need to be a coder to approach; more or less any
RDKit user could contribute. Believe it or not, just having people point
out pieces of code that could be (better) documented is already useful -
I'm sure I'm not the only developer who has forgotten which bits of code
they've left un(der)documented. I often have 10-15 minute slots of time
that I could use for writing docs, but it really helps to know which pieces
should be done first.
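
As one possible way to build such a list of pointers, here is a rough sketch
that scans a module for public functions and classes with a missing or very
short docstring. The helper name and the length threshold are assumptions
made for illustration; nothing like this ships with the RDKit.

```python
import inspect


def undocumented_callables(module, min_doc_length=1):
    """Return sorted names of public functions/classes whose docstring is
    missing or shorter than min_doc_length characters."""
    missing = []
    for name, obj in vars(module).items():
        if name.startswith("_"):
            continue  # skip private names
        if not (inspect.isroutine(obj) or inspect.isclass(obj)):
            continue  # skip constants, submodules, etc.
        doc = (getattr(obj, "__doc__", None) or "").strip()
        if len(doc) < min_doc_length:
            missing.append(name)
    return sorted(missing)
```

It could be pointed at any imported module, for example the BRICS module.
Wrapped C++ functions may carry only a generated signature as their
docstring, so a larger min_doc_length is probably needed to catch those.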

I would love to hear suggestions for ways that we can make it easier for
people to submit improved documentation or pointers to pieces of code that
could use better documentation and then to let people know that these
options exist. It needs to be something other than "send email to the list"
though.

It's currently pretty easy to submit bug reports/feature requests using the
GitHub interface. These could either provide suggested docs/doc changes or
point to functions/methods/classes that could be better documented. The
GitHub guys just added the ability to specify different types of issue
templates; I could look into setting one up for documentation requests.

-greg



On Wed, Jan 24, 2018 at 7:38 PM, Cameron Pye  wrote:

> I know this isn't a particularly sexy job for a budding cheminformatician
> but...
>
> Work on the Python documentation!!!
>
> I love RDKit and occasionally think I'm pretty savvy, but I can't tell you
> how often I'm scrolling through the documentation (or source) and either:
>
> a) discover something that exists but doesn't have any documentation
> beyond the function signature
> or
> b) discover some functionality that exists (and I've wanted) but
> didn't know it was there!
>
> I think this mailing list and Greg do a superb job of keeping the
> community informed and creating and maintaining the codebase, but I think
> having some more "Pythonic" API documentation would be great.
>
> One shining example is the scikit-learn documentation, which has a quick
> start, tutorials, etc., and then, in the well-categorized and explanatory
> API ref, has links to examples in the User Guide (akin to the "Getting
> Started with the RDKit in Python" doc).
>
> Just my 2 cents!
>
> Thanks for all the hard work as always,
> Cam
>
>
> On Mon, Jan 15, 2018 at 12:52 PM  sourceforge.net> wrote:
>
>> Today's Topics:
>>
>>1. Re: RDKit and Google Summer of Code 2018 (Jason Biggs)
>>2. Re: RDKit and Google Summer of Code 2018 (Tim Dudgeon)
>>3. Re: RDKit and Google Summer of Code 2018 (Tim Dudgeon)
>>
>>
>> --
>>
>> Message: 1
>> Date: Mon, 15 Jan 2018 09:20:48 -0600
>> From: Jason Biggs 
>> To: Greg Landrum 
>> Cc: RDKit Discuss 
>> Subject: Re: [Rdkit-discuss] RDKit and Google Summer of Code 2018
>> Message-ID:
>> 

[Rdkit-discuss] Installation of rdkit using conda (proxy) failed

2018-02-07 Thread Guillaume GODIN
Dear All,

I've tried to install rdkit using conda (anaconda2) with proxy settings.

I hacked the conda Python files a little to add some extra prints showing what
is not working.
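
Before patching conda itself, it may be worth checking which proxy settings
are visible in the shell environment. The sketch below is a hypothetical
helper (not part of conda or requests) that collects the conventional
http_proxy/https_proxy variables in lower or upper case. It is written for
Python 3, whereas the installation here uses Python 2.7. An all-None result,
like the {'http': None, 'https': None} dict in the output below, would suggest
that no proxy is being picked up from the environment. conda can also read
proxies from a proxy_servers section in .condarc, which this helper does not
inspect.

```python
import os


def proxies_from_environ(environ=None):
    """Collect proxy URLs from conventional environment variables.

    Hypothetical helper for illustration only. Returns a dict of the form
    {'http': ..., 'https': ...} with None for anything that is unset.
    """
    if environ is None:
        environ = os.environ
    found = {}
    for scheme in ("http", "https"):
        name = scheme + "_proxy"
        # Prefer the lower-case variable, then fall back to upper case.
        found[scheme] = environ.get(name) or environ.get(name.upper())
    return found
```

Calling proxies_from_environ() with no arguments reports on the current
process environment; passing a plain dict makes it easy to try out.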

Unfortunately there is an issue:


MacBook-Pro:Github GVALMTGG$ conda install -c rdkit rdkit

ArgumentParser(prog='conda', usage=None, description=u'conda is a tool for 
managing and deploying applications, environments and packages.\n\nOptions:\n', 
version=None, formatter_class=, 
conflict_handler='error', add_help=False)

Namespace(alt_hint=False, channel=[u'rdkit'], 
channel_priority=, 
clobber=, cmd=u'install', 
copy=, 
debug=, dry_run=False, 
file=[], force=, 
force_pscheck=False, func=, 
insecure=, 
json=, mkdir=False, 
name=None, no_deps=False, offline=, override_channels=False, packages=[u'rdkit'], pinned=True, 
prefix=None, quiet=, 
revision=None, show_channel_urls=, unknown=False, update_deps=, use_index_cache=False, use_local=False, 
verbosity=, 
yes=)

install execution now

Fetching package metadata ...{'http': None, 'https': None}

..

Solving package specifications: .


Package plan for installation in environment /Users/GVALMTGG/anaconda2:

The following NEW packages will be INSTALLED:

    boost:      1.63.0-py27_1       rdkit
    cairo:      1.14.10-h913ea44_6
    fontconfig: 2.12.4-hffb9db1_2
    pixman:     0.34.0-hca0a616_3
    rdkit:      2017.09.3.0-py27_1  rdkit

The following packages will be UPDATED:

    conda:      4.3.30-py27h407ed3a_0 --> 4.4.8-py27_0
    pycosat:    0.6.2-py27h085d4cc_0  --> 0.6.3-py27h6c51c7e_0


Proceed ([y]/n)? y


{'http': None, 'https': None}

{'http': None, 'https': None}

{'http': None, 'https': None}

An unexpected error has occurred.

Please consider posting the following information to the

conda GitHub issue tracker at:


https://github.com/conda/conda/issues




Current conda install:

               platform : osx-64
          conda version : 4.3.30
       conda is private : False
      conda-env version : 4.3.30
    conda-build version : 3.0.27
         python version : 2.7.14.final.0
       requests version : 2.18.4
       root environment : /Users/GVALMTGG/anaconda2  (writable)
    default environment : /Users/GVALMTGG/anaconda2
       envs directories : /Users/GVALMTGG/anaconda2/envs
                          /Users/GVALMTGG/.conda/envs
          package cache : /Users/GVALMTGG/anaconda2/pkgs
                          /Users/GVALMTGG/.conda/pkgs
           channel URLs : https://conda.anaconda.org/rdkit/osx-64
                          https://conda.anaconda.org/rdkit/noarch
                          https://repo.continuum.io/pkgs/main/osx-64
                          https://repo.continuum.io/pkgs/main/noarch
                          https://repo.continuum.io/pkgs/free/osx-64
                          https://repo.continuum.io/pkgs/free/noarch
                          https://repo.continuum.io/pkgs/r/osx-64
                          https://repo.continuum.io/pkgs/r/noarch
                          https://repo.continuum.io/pkgs/pro/osx-64
                          https://repo.continuum.io/pkgs/pro/noarch
            config file : /Users/GVALMTGG/.condarc
             netrc file : None
           offline mode : False
             user-agent : conda/4.3.30 requests/2.18.4 CPython/2.7.14 Darwin/16.7.0 OSX/10.12.6
                UID:GID : 503:20

`$ /Users/GVALMTGG/anaconda2/bin/conda install -c rdkit rdkit`





Traceback (most recent call last):
  File "/Users/GVALMTGG/anaconda2/lib/python2.7/site-packages/conda/exceptions.py", line 640, in conda_exception_handler
    return_value = func(*args, **kwargs)
  File "/Users/GVALMTGG/anaconda2/lib/python2.7/site-packages/conda/cli/main.py", line 140, in _main
    exit_code = args.func(args, p)
  File "/Users/GVALMTGG/anaconda2/lib/python2.7/site-packages/conda/cli/main_install.py", line 81, in execute
    install(args, parser, 'install')
  File "/Users/GVALMTGG/anaconda2/lib/python2.7/site-packages/conda/cli/install.py", line 326, in install
    execute_actions(actions, index, verbose=not context.quiet)
  File "/Users/GVALMTGG/anaconda2/lib/python2.7/site-packages/conda/plan.py", line 828, in execute_actions
    execute_instructions(plan, index, verbose)
  File "/Users/GVALMTGG/anaconda2/lib/python2.7/site-packages/conda/instructions.py", line 247, in execute_instructions
    cmd(state, arg)
  File "/Users/GVALMTGG/anaconda2/lib/python2.7/site-packages/conda/instructions.py", line 100, in PROGRESSIVEFETCHEXTRACT_CMD
    progressive_fetch_extract.execute()
  File "/Users/GVALMTGG/anaconda2/lib/python2.7/site-packages/conda/core/package_cache.py", line 492, in execute
    self._execute_action(action)
  File "/Users/GVALMTGG/anaconda2/lib/python2.7/site-packages/conda/core/package_cache.py", line 513, in _execute_action
    raise

Re: [Rdkit-discuss] Differences in chirality with BRICS fragmentation

2018-02-07 Thread Greg Landrum
On Wed, Feb 7, 2018 at 4:36 PM, Stephen Pickett 
wrote:

>
>
> Thanks for taking a look.
>
>
If you want to keep an eye on what's going on, here's the bug:
https://github.com/rdkit/rdkit/issues/1734


> FYI, I hope to include a section about how we are using this algorithm at
> the UK QSAR meeting in Cardiff in April.
>
>
It should all work as long as you stick to the reactions...

It would be great if you could share the slides when you've got that
presentation put together!

-greg


Re: [Rdkit-discuss] Differences in chirality with BRICS fragmentation

2018-02-07 Thread Stephen Pickett
Hi Greg

Thanks for taking a look.
FYI, I hope to include a section about how we are using this algorithm at the 
UK QSAR meeting in Cardiff in April.

Stephen

From: Greg Landrum [mailto:greg.land...@gmail.com]
Sent: 07 February 2018 15:27
To: Stephen Pickett 
Cc: rdkit-discuss@lists.sourceforge.net
Subject: Re: [Rdkit-discuss] Differences in chirality with BRICS fragmentation


EXTERNAL
It's no fair reviving old items on difficult topics like stereochemistry! ;-)

This is due to a bug in BRICS.BreakBRICSBonds(): stereochemistry isn't handled 
correctly.
I have to admit that I'm surprised by this: I expected that this code would 
behave properly, but it clearly doesn't. That's a bug for me to look into.

Your other approach, using BRICS.BRICSDecompose(), uses a different mechanism, 
the ChemicalReaction machinery, to fragment the molecules. This does a better 
job of handling stereochemistry.

Thanks for pointing this out and sorry for the quite-delayed reply.

-greg
p.s. in my reply when this thread originally came up I said that 
BRICSDecompose() uses BreakBRICSBonds(); this is incorrect... I wrote that 
email too quickly.



On Wed, Jan 10, 2018 at 3:15 PM, Stephen Pickett wrote:
Hi

Coming back to this thread as I have found a similar issue with rdkit 17-03/09.

BRICS.BreakBRICSBonds is inverting stereochemistry for some inputs.

>>> smi='CNc1c1[C@H](C)NC'
>>> mol=Chem.MolFromSmiles(smi)
# we are using rdkit canonicalized smiles
>>> Chem.MolToSmiles(mol,1)
'CNc1c1[C@H](C)NC'
>>> frags=BRICS.BRICSDecompose(mol,returnMols=True)
>>> bm=list(BRICS.BRICSBuild(frags))
# input is the first molecule in the list
>>> [Chem.MolToSmiles(m,1) for m in bm]
['CNc1c1[C@H](C)NC', 'CN[C@@H](C)c1c1[C@H](C)NC', 
'CNc1c1-c1c1[C@H](C)NC', 'CNc1c1NC', 'CNc1c1-c1c1NC', 
'CNc1c1-c1c1-c1c1NC']
>>> frags=Chem.GetMolFrags(BRICS.BreakBRICSBonds(mol),asMols=True)
>>> bm=list(BRICS.BRICSBuild(frags))
# input is the second in the list with inverted stereochem
>>> [Chem.MolToSmiles(m,1) for m in bm]
['CNc1c1NC', 'CNc1c1[C@@H](C)NC', 'CNc1c1-c1c1NC', 
'CNc1c1-c1c1-c1c1NC', 'CNc1c1-c1c1[C@@H](C)NC', 
'CN[C@H](C)c1c1[C@@H](C)NC']

Interestingly, if I make a small change to the molecule
'COc1c1[C@H](C)NC'
Using the smiles as written gives the same issue.

>>> mol=Chem.MolFromSmiles('COc1c1[C@H](C)NC')
>>> frags=Chem.GetMolFrags(BRICS.BreakBRICSBonds(mol),asMols=True)
>>> bm=list(BRICS.BRICSBuild(frags))
>>> [Chem.MolToSmiles(m,1) for m in bm]
[…, 'CN[C@H](C)c1c1OC', …]

However, this is not the RDKit canonical atom ordering for this molecule.
If I use the RDKit canonical smiles to build the molecule 
('CN[C@@H](C)c1c1OC'), BreakBRICSBonds works fine and I can regenerate the 
initial molecule with BRICSBuild.

>>> mol=Chem.MolFromSmiles('CN[C@@H](C)c1c1OC')
>>> frags=Chem.GetMolFrags(BRICS.BreakBRICSBonds(mol),asMols=True)
>>> bm=list(BRICS.BRICSBuild(frags))
>>> [Chem.MolToSmiles(m,1) for m in bm]
[…., 'CN[C@@H](C)c1c1OC', ….]

Regards

Stephen

From: Stephen Pickett
Sent: 16 May 2017 09:01
To: Greg Landrum
Cc: rdkit-discuss@lists.sourceforge.net
Subject: RE: [Rdkit-discuss] Differences in chirality with BRICS fragmentation

Thanks Greg

I’m hoping we can get to 17-03

Stephen

From: Greg Landrum [mailto:greg.land...@gmail.com]
Sent: 16 May 2017 06:22
To: Stephen Pickett
Cc: rdkit-discuss@lists.sourceforge.net
Subject: Re: [Rdkit-discuss] Differences in chirality with BRICS fragmentation


EXTERNAL
Hi Stephen,

You're perfectly correct: what you're seeing there is a bug. However, you're 
using a two-year-old version of the RDKit, and a number of bugs in this area 
have been fixed in the intervening time. Still, since there's potentially a lot 
going on here, and I'm always nervous about chirality, I will walk through the 
steps I took to figure out whether or not things work properly now for this 
case.

Let's start with making sure that the fragmentation works correctly:
In [18]: mol=Chem.MolFromSmiles('C1CCOC[C@H]1NC')

In [19]: frags=BRICS.BRICSDecompose(mol,returnMols=True)

In [20]: [Chem.MolToSmiles(x,isomericSmiles=True) for x in frags]
Out[20]: ['[5*]NC', '[15*][C@H]1CCCOC1']

In [21]: mol=Chem.MolFromSmiles('C1CCOC[C@@H]1NC')

In [22]: frags=BRICS.BRICSDecompose(mol,returnMols=True)

In [23]: [Chem.MolToSmiles(x,isomericSmiles=True) for x in frags]
Out[23]: ['[5*]NC', '[15*][C@@H]1CCCOC1']

Those both look ok, but we should try another input SMILES for the same 
molecule to make sure it's still ok:

In [24]: mol=Chem.MolFromSmiles('CN[C@H]1CCCOC1')

In [25]: frags=BRICS.BRICSDecompose(mol,returnMols=True)

In [26]: [Chem.MolToSmiles(x,isomericSmiles=True) for x in frags]
Out[26]: ['[5*]NC', 

Re: [Rdkit-discuss] Differences in chirality with BRICS fragmentation

2018-02-07 Thread Greg Landrum
It's no fair reviving old items on difficult topics like stereochemistry!
;-)

This is due to a bug in BRICS.BreakBRICSBonds(): stereochemistry isn't
handled correctly.
I have to admit that I'm surprised by this: I expected that this code would
behave properly, but it clearly doesn't. That's a bug for me to look into.

Your other approach, using BRICS.BRICSDecompose(), uses a different
mechanism, the ChemicalReaction machinery, to fragment the molecules. This
does a better job of handling stereochemistry.
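
One way to compare the two code paths on a given input is a round-trip check:
fragment the molecule, rebuild with BRICSBuild(), and see whether the original
canonical isomeric SMILES comes back. The helper below is only a sketch along
those lines. The name brics_roundtrip is hypothetical, the RDKit calls are the
ones already used in this thread, and the answer will depend on the RDKit
version, so no expected output is shown. The import is guarded so that the
function simply returns None where RDKit is not installed.

```python
try:
    from rdkit import Chem
    from rdkit.Chem import BRICS
except ImportError:  # RDKit not available in this environment
    Chem = BRICS = None


def brics_roundtrip(smiles, use_break_bonds=False):
    """Return True if BRICSBuild() regenerates the input molecule (including
    stereochemistry), False if it does not, or None if RDKit is missing."""
    if Chem is None:
        return None
    mol = Chem.MolFromSmiles(smiles)
    if mol is None:
        raise ValueError("could not parse SMILES: %r" % smiles)
    target = Chem.MolToSmiles(mol, isomericSmiles=True)
    if use_break_bonds:
        # Path reported as problematic: break bonds, then split into fragments.
        frags = Chem.GetMolFrags(BRICS.BreakBRICSBonds(mol), asMols=True)
    else:
        # Reaction-based path.
        frags = BRICS.BRICSDecompose(mol, returnMols=True)
    rebuilt = {Chem.MolToSmiles(m, isomericSmiles=True)
               for m in BRICS.BRICSBuild(frags)}
    return target in rebuilt
```

For example, brics_roundtrip('C1CCOC[C@H]1NC', use_break_bonds=True) exercises
the BreakBRICSBonds() path on the molecule from the May 2017 message quoted
below, and the default call exercises BRICSDecompose() on the same input.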

Thanks for pointing this out and sorry for the quite-delayed reply.

-greg
p.s. in my reply when this thread originally came up I said that
BRICSDecompose() uses BreakBRICSBonds(); this is incorrect... I wrote that
email too quickly.



On Wed, Jan 10, 2018 at 3:15 PM, Stephen Pickett 
wrote:

> Hi
>
>
>
> Coming back to this thread as I have found a similar issue with rdkit
> 17-03/09.
>
>
>
> BRICS.BreakBRICSBonds is inverting stereochemistry for some inputs.
>
>
>
> >>> smi='CNc1c1[C@H](C)NC'
>
> >>> mol=Chem.MolFromSmiles(smi)
>
> # we are using rdkit canonicalized smiles
>
> >>> Chem.MolToSmiles(mol,1)
>
> 'CNc1c1[C@H](C)NC'
>
> >>> frags=BRICS.BRICSDecompose(mol,returnMols=True)
>
> >>> bm=list(BRICS.BRICSBuild(frags))
>
> # input is the first molecule in the list
>
> >>> [Chem.MolToSmiles(m,1) for m in bm]
>
> ['CNc1c1[C@H](C)NC', 'CN[C@@H](C)c1c1[C@H](C)NC',
> 'CNc1c1-c1c1[C@H](C)NC', 'CNc1c1NC', 'CNc1c1-c1c1NC',
> 'CNc1c1-c1c1-c1c1NC']
>
> >>> frags=Chem.GetMolFrags(BRICS.BreakBRICSBonds(mol),asMols=True)
>
> >>> bm=list(BRICS.BRICSBuild(frags))
>
> # input is the second in the list with inverted stereochem
>
> >>> [Chem.MolToSmiles(m,1) for m in bm]
>
> ['CNc1c1NC', 'CNc1c1[C@@H](C)NC', 'CNc1c1-c1c1NC',
> 'CNc1c1-c1c1-c1c1NC', 'CNc1c1-c1c1[C@@H](C)NC',
> 'CN[C@H](C)c1c1[C@@H](C)NC']
>
>
>
> Interestingly, if I make a small change to the molecule
>
> 'COc1c1[C@H](C)NC'
>
> Using the smiles as written gives the same issue.
>
>
>
> >>> mol=Chem.MolFromSmiles('COc1c1[C@H](C)NC')
>
> >>> frags=Chem.GetMolFrags(BRICS.BreakBRICSBonds(mol),asMols=True)
>
> >>> bm=list(BRICS.BRICSBuild(frags))
>
> >>> [Chem.MolToSmiles(m,1) for m in bm]
>
> […, 'CN[C@H](C)c1c1OC', …]
>
>
>
> However, this is not the RDKit canonical atom ordering for this molecule.
>
> If I use the RDKit canonical smiles to build the molecule 
> ('CN[C@@H](C)c1c1OC'),
> BreakBRICSBonds works fine and I can regenerate the initial molecule with
> BRICSBuild.
>
>
>
> >>> mol=Chem.MolFromSmiles('CN[C@@H](C)c1c1OC')
>
> >>> frags=Chem.GetMolFrags(BRICS.BreakBRICSBonds(mol),asMols=True)
>
> >>> bm=list(BRICS.BRICSBuild(frags))
>
> >>> [Chem.MolToSmiles(m,1) for m in bm]
>
> […., 'CN[C@@H](C)c1c1OC', ….]
>
>
>
> Regards
>
>
>
> Stephen
>
>
>
> *From:* Stephen Pickett
> *Sent:* 16 May 2017 09:01
> *To:* Greg Landrum 
> *Cc:* rdkit-discuss@lists.sourceforge.net
> *Subject:* RE: [Rdkit-discuss] Differences in chirality with BRICS
> fragmentation
>
>
>
> Thanks Greg
>
>
>
> I’m hoping we can get to 17-03
>
>
>
> Stephen
>
>
>
> *From:* Greg Landrum [mailto:greg.land...@gmail.com]
> *Sent:* 16 May 2017 06:22
> *To:* Stephen Pickett
> *Cc:* rdkit-discuss@lists.sourceforge.net
> *Subject:* Re: [Rdkit-discuss] Differences in chirality with BRICS
> fragmentation
>
>
>
> *EXTERNAL*
>
> Hi Stephen,
>
>
>
> You're perfectly correct: what you're seeing there is a bug. However,
> you're using a two-year-old version of the RDKit, and a number of bugs in
> this area have been fixed in the intervening time. Still, since there's
> potentially a lot going on here, and I'm always nervous about chirality, I
> will walk through the steps I took to figure out whether or not things work
> properly now for this case.
>
>
>
> Let's start with making sure that the fragmentation works correctly:
>
> In [18]: mol=Chem.MolFromSmiles('C1CCOC[C@H]1NC')
>
>
>
> In [19]: frags=BRICS.BRICSDecompose(mol,returnMols=True)
>
>
>
> In [20]: [Chem.MolToSmiles(x,isomericSmiles=True) for x in frags]
>
> Out[20]: ['[5*]NC', '[15*][C@H]1CCCOC1']
>
>
>
> In [21]: mol=Chem.MolFromSmiles('C1CCOC[C@@H]1NC')
>
>
>
> In [22]: frags=BRICS.BRICSDecompose(mol,returnMols=True)
>
>
>
> In [23]: [Chem.MolToSmiles(x,isomericSmiles=True) for x in frags]
>
> Out[23]: ['[5*]NC', '[15*][C@@H]1CCCOC1']
>
>
>
> Those both look ok, but we should try another input SMILES for the same
> molecule to make sure it's still ok:
>
>
>
> In [24]: mol=Chem.MolFromSmiles('CN[C@H]1CCCOC1')
>
>
>
> In [25]: frags=BRICS.BRICSDecompose(mol,returnMols=True)
>
>
>
> In [26]: [Chem.MolToSmiles(x,isomericSmiles=True) for x in frags]
>
> Out[26]: ['[5*]NC', '[15*][C@H]1CCCOC1']
>
>
>
> Just to be really sure, let's reorder the bonds at the chiral center
> again, making sure to keep the same stereochemistry:
>
> In [27]: