Re: [Rdkit-discuss] problems with installation on conda with python 3.5 32-bit

2017-02-20 Thread Greg Landrum
On Tue, Feb 21, 2017 at 4:18 AM, 杨弘宾  wrote:

> Hi Greg,
> I am using 32-bit python27 and anaconda (with 64-bit windows 10). So I
> cannot update to the latest version and test the "abnormal" operation of
> rdMolDraw2D as you proposed several days ago.
> Since it did not trouble me, I plan to upgrade all this environment in
> the future.
>

Is there any particular reason you are using 32bit python on a 64bit system?

> BTW, is it necessary to upgrade python to 3.6 in case RDKit stops
> supporting python2? I prefer 2.7 at least for now :)
>

I do not plan to discontinue support for python2 anytime in the foreseeable
future. There are still too many people who have not migrated to python 3
(or cannot without doing a lot of work). The RDKit will most likely
continue to support Python 2.7 for as long as the Python developers support
Python 2.7.

Having said that: anyone who does not have a large legacy base to support
or who is starting a new project *really* should use Python 3. Python 2 is
considered legacy by the Python developers and nothing new has been added
to it since 2010. All current and future development effort is focused on
Python 3 and the Python developers currently plan to stop supporting Python
2 in 2020. Whether you believe that date or not, the longer you wait to
make the move the harder it will be to do so.

-greg
--
Check out the vibrant tech community on one of the world's most
engaging tech sites, SlashDot.org! http://sdm.link/slashdot___
Rdkit-discuss mailing list
Rdkit-discuss@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/rdkit-discuss


Re: [Rdkit-discuss] aligning maximum common substructure of 2 molecules

2017-02-20 Thread Greg Landrum
On Mon, Feb 20, 2017 at 6:17 PM, Thomas Evangelidis 
wrote:

>
> Thank you for your useful hints. All the compounds that I want to align
> are supposed to belong to the same analogue series so they should share a
> common substructure of substantial size.
>

In that case, using an MCS based alignment should work reasonably well,
particularly if you do the MCS of the entire series instead of doing it
pairwise. The approach there would be to find the MCS and then use the
RDKit's RMSD-based alignment (AllChem.AlignMol) where you provide the
atomMap argument to specify the atom--atom mapping for alignment.

Here's a short example of how to do that:

from rdkit import Chem
from rdkit.Chem import AllChem, rdFMCS

# mols: your list of RDKit molecules
# generate 3d structures (in case you don't have them already):
mhs = [Chem.AddHs(x) for x in mols]
for x in mhs:
    AllChem.EmbedMolecule(x, AllChem.ETKDG())
mols = [Chem.RemoveHs(x) for x in mhs]

# Find the MCS:
mcs = rdFMCS.FindMCS(mols, threshold=0.8, completeRingsOnly=True,
                     ringMatchesRingOnly=True)

# align everything to the first molecule
patt = Chem.MolFromSmarts(mcs.smartsString)
refMol = mols[0]
refMatch = refMol.GetSubstructMatch(patt)
rmsVs = []
for probeMol in mols[1:]:
    mv = probeMol.GetSubstructMatch(patt)
    if not mv:
        # with threshold < 1.0 a molecule may not contain the MCS; skip it
        rmsVs.append(None)
        continue
    rms = AllChem.AlignMol(probeMol, refMol, atomMap=list(zip(mv, refMatch)))
    rmsVs.append(rms)
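
Editor's sketch: when each probe has several conformers (the thread mentions generating 30 per query compound), the same atom map can be reused to pick the conformer whose core overlays best on the reference. This is only an illustration under stated assumptions: the two SMILES are made-up stand-ins for a real reference pose and a real analogue, and the variable names are hypothetical.

```python
from rdkit import Chem
from rdkit.Chem import AllChem, rdFMCS

# Made-up inputs: a reference with one 3D pose and a probe analogue.
ref = Chem.AddHs(Chem.MolFromSmiles('c1ccccc1CCN'))
AllChem.EmbedMolecule(ref, randomSeed=42)
probe = Chem.AddHs(Chem.MolFromSmiles('c1ccccc1CCNC(C)=O'))
cids = list(AllChem.EmbedMultipleConfs(probe, numConfs=30, randomSeed=42))

# The MCS of the heavy-atom graphs gives the shared core as SMARTS.
mcs = rdFMCS.FindMCS([Chem.RemoveHs(ref), Chem.RemoveHs(probe)],
                     completeRingsOnly=True, ringMatchesRingOnly=True)
patt = Chem.MolFromSmarts(mcs.smartsString)
atom_map = list(zip(probe.GetSubstructMatch(patt),
                    ref.GetSubstructMatch(patt)))

# Align every probe conformer on the core atoms and keep the best one.
rmsds = {cid: AllChem.AlignMol(probe, ref, prbCid=cid, atomMap=atom_map)
         for cid in cids}
best_cid = min(rmsds, key=rmsds.get)
print(best_cid, round(rmsds[best_cid], 3))
```

Only one atom map is tried here; a symmetric core such as a phenyl ring has several equivalent maps, and trying each of them may give a lower RMSD.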


> What I want to emulate is the "core restrained docking" with glide, where
> you specify the common core of the query and the reference ligand using a
> SMARTS pattern and then glide docks the query compound to the binding
> pocket but takes care to overlay the core atoms of the query to the core
> atoms of the reference compound. Since RDKit does not do docking, I just
> generate 30 conformers of each query compound and select the best one by
> measuring the RMSD between the core of the query and the core of the
> reference after the alignment. Of course the conformations of the core
> atoms between the query and the reference are never identical hence the bad
> alignment. Is there any smarter way to emulate the "core restrained
> docking" with RDKit?
>

The docking part is not doable in any straightforward way at the moment
since it's hard to take information about the protein into account. There's
an idea for a student summer project to solve this problem floating around,
let's see if that gets funded and we find the right student.

If the goal is to generate a set of conformations where cores are aligned
with each other, this blog post may be interesting:
http://rdkit.blogspot.ch/2013/12/using-allchemconstrainedembed.html

-greg


Re: [Rdkit-discuss] problems with installation on conda with python 3.5 32-bit

2017-02-20 Thread 杨弘宾

Hi Greg,
I am using 32-bit python27 and anaconda (with 64-bit windows 10). So I cannot
update to the latest version and test as you proposed several days ago.
Since it did not trouble me, I plan to upgrade all this environment in the
future.
BTW, is it necessary to upgrade python to 3.6 in case RDKit stops supporting
python2? I prefer 2.7 at least for now :)


Hongbin Yang 

From: Greg Landrum
Date: 2017-02-20 23:02
To: Michal Krompiec
CC: rdkit-discuss@lists.sourceforge.net
Subject: Re: [Rdkit-discuss] problems with installation on conda with python 3.5 32-bit

Hi Michal,
We've only ever done python2.7 builds for win32 and we stopped doing those
with the 2016.03 release.
I will have to check, but I think I probably can start doing these again,
but I'm reluctant due to the amount of effort required.
How many users do you need to support who are stuck on 32bit machines?
-greg

On Mon, Feb 20, 2017 at 2:18 PM, Michal Krompiec  
wrote:
Hello,
I can't install rdkit on anaconda with 32-bit python3 on Windows 7.
When I try "the usual", conda tries to install python2.7 into the environment:
>conda create -c rdkit -n my-rdkit-env rdkit
Fetching package metadata .
Solving package specifications: .
Package plan for installation in environment
C:\Anaconda3_32\envs\my-rdkit-env:
The following NEW packages will be INSTALLED:
    boost:          1.56.0-py27_3 rdkit
    bzip2:          1.0.6-vc9_3 [vc9]
    mkl:            2017.0.1-0
    numpy:          1.11.3-py27_0
    pip:            9.0.1-py27_1
    python:         2.7.13-0
    rdkit:          2016.03.1-np111py27_1 rdkit
    setuptools:     27.2.0-py27_1
    vs2008_runtime: 9.00.30729.5054-0
    wheel:          0.29.0-py27_0
    zlib:           1.2.8-vc9_3 [vc9]
If I create an empty environment, load python 3.5 into it and try installing
rdkit, I get an error:
>conda create -n my-rdkit-env python=3.5
Fetching package metadata ...
Solving package specifications: .
Package plan for installation in environment
C:\Anaconda3_32\envs\my-rdkit-env:
The following NEW packages will be INSTALLED:
    pip:            9.0.1-py35_1
    python:         3.5.2-0
    setuptools:     27.2.0-py35_1
    vs2015_runtime: 14.0.25123-0
    wheel:          0.29.0-py35_0
Proceed ([y]/n)?
#
# To activate this environment, use:
# > activate my-rdkit-env
#
# To deactivate this environment, use:
# > deactivate my-rdkit-env
#
# * for power-users using bash, you must source
#
>conda install --name my-rdkit-env -f --channel https://conda.anaconda.org/rdkit rdkit
Fetching package metadata .
Solving package specifications: .
UnsatisfiableError: The following specifications were found to be in conflict:
  - python 3.5*
  - rdkit -> python 2.7*
Use "conda info <package>" to see the dependencies for each package.

I managed to install rdkit without any problems on the same machine in 64-bit 
anaconda with python3.5, but I need a separate 32-bit build to support users 
with 32-bit machines. Any help will be appreciated.
Thanks and best regards,
Michal




Re: [Rdkit-discuss] aligning maximum common substructure of 2 molecules

2017-02-20 Thread Ling Chan
Hello Thomas,

This publication could be of interest to you. I have not read the whole
paper so I don't know how relevant it is.

Ling

Graph-Based Molecular Alignment (GMA)
Marialke, Korner, Tietze, Apostolakis
J. Chem. Inf. Model. 47 (2007) 591-601


On Mon, Feb 20, 2017 at 3:32 PM, Thomas Evangelidis 
wrote:

> @Peter
> I am working exactly on the scenario you described.
>
> @Brian
> I have found this thread which is pretty similar to my case and to what
> you suggested, so now I am adapting my code accordingly.
>
> https://sourceforge.net/p/rdkit/mailman/message/35034093/
>
> What you have published sounds interesting. I also have multiple
> co-crystal active sites. Could you please send me the link?
> One question: what was more successful in your experience when you had a
> crystal ligand and a series of analogues that you wanted to place into the
> binding pocket: 1) superposing the MCS between the scaffolds of the query
> and the crystal ligand, or 2) superposing the MCS of the whole molecules?
>
>
>
>
> On 20 February 2017 at 21:03, Peter S. Shenkin  wrote:
>
>> With Glide, IIRC, this facility is designed for the use case where the
>> coordinates of a docked ligand are known (typically from an X-ray
>> structure) and the docked ligand shares a SMARTS with the ligands in an
>> input file. The SMARTS-matching atoms of each incoming ligand are
>> superposed upon the corresponding atoms of the docked ligand and the
>> resulting pose is used as an initial guess for the docking.
>>
>> Some notes:
>>
>> 0. Greg questions whether there is really a common core in your example,
>> and if there's not, it doesn't appear as if the procedure is directly
>> applicable. But if it is applicable, read on.
>>
>> 1. If the SMARTS matches in multiple ways, all are tried, and the best
>> docking score among them wins (though there may be a way of requesting the
>> N best scores, or even all of them). So if the SMARTS specifies a phenyl
>> ring, for example, 12 initial poses will be tried. (If it contains two
>> phenyl rings, 24 will be tried)
>>
>> 2. GLIDE itself does conformer generation, but I'm not sure how it works
>> in this procedure. If the SMARTS specifies a rigid core, you probably don't
>> need to pre-generate conformers, but if the core is flexible, you are
>> probably best off generating them, which of course you are permitted to do.
>>
>> 3. If you have GLIDE, then  you probably have LigPrep as well. The
>> advantage of using LigPrep for your conformation generation would be that
>> the strain energy would be written into the output file, and then, when
>> used as the input to Glide, it would be taken into account when computing
>> the docking score. And it uses the same (or a very similar) force-field
>> that Glide itself uses.
>>
>> 4. I may have some details slightly incorrect, so you might want to
>> address your question to Schrödinger tech support.
>>
>> On Mon, Feb 20, 2017 at 2:15 PM, Brian Kelley 
>> wrote:
>>
>>> I don't know the exact glide procedure, but I did write such a system
>>> for OpenEye (POSIT).  The issue you are facing is that the RMSD portion is
>>> just a constraint used for docking, it isn't used as the "score", in fact,
>>> it can't tell if the conformation interpenetrates the active site or which
>>> orientation is better.
>>>
>>> I believe RDKit can generate conformations with a template, see
>>> AllChem.ConstrainedEmbed, this would solve half of your problem in creating
>>> conformations that match your template.  You still have the problem with
>>> scoring against your active site.  POSIT scored against the shape tanimoto
>>> of the active ligands (if any) to try to fill the same space as the known
>>> ligands. See rdkit.Chem.rdShapeHelpers.ShapeTanimotoDist
>>>
>>> This might not be what you want, but we had good success with similar
>>> methods and virtual screening, especially when using multiple co-crystal
>>> active sites.   I can send you a reference link if this interests you
>>>
>>> Cheers,
>>>  Brian
>>>
>>> On Mon, Feb 20, 2017 at 12:17 PM, Thomas Evangelidis 
>>> wrote:
>>>
 ​
>>>> Greg and Brian,
>>>>
>>>> Thank you for your useful hints. All the compounds that I want to align
>>>> are supposed to belong to the same analogue series so they should share a
>>>> common substructure of substantial size.
>>>>
>>>> What I want to emulate is the "core restrained docking" with glide,
>>>> where you specify the common core of the query and the reference ligand
>>>> using a SMARTS pattern and then glide docks the query compound to the
>>>> binding pocket but takes care to overlay the core atoms of the query to the
>>>> core atoms of the reference compound. Since RDKit does not do docking,
>>>> I just generate 30 conformers of each query compound and select the best
>>>> one by measuring the RMSD between the core of the query and the core
>>>> of the reference after the alignment. Of course the

Re: [Rdkit-discuss] aligning maximum common substructure of 2 molecules

2017-02-20 Thread Thomas Evangelidis
@Peter
I am working exactly on the scenario you described.

@Brian
I have found this thread which is pretty similar to my case and to what you
suggested, so now I am adapting my code accordingly.

https://sourceforge.net/p/rdkit/mailman/message/35034093/

What you have published sounds interesting. I also have multiple co-crystal
active sites. Could you please send me the link?
One question: what was more successful in your experience when you had a
crystal ligand and a series of analogues that you wanted to place into the
binding pocket: 1) superposing the MCS between the scaffolds of the query
and the crystal ligand, or 2) superposing the MCS of the whole molecules?




On 20 February 2017 at 21:03, Peter S. Shenkin  wrote:

> With Glide, IIRC, this facility is designed for the use case where the
> coordinates of a docked ligand are known (typically from an X-ray
> structure) and the docked ligand shares a SMARTS with the ligands in an
> input file. The SMARTS-matching atoms of each incoming ligand are
> superposed upon the corresponding atoms of the docked ligand and the
> resulting pose is used as an initial guess for the docking.
>
> Some notes:
>
> 0. Greg questions whether there is really a common core in your example,
> and if there's not, it doesn't appear as if the procedure is directly
> applicable. But if it is applicable, read on.
>
> 1. If the SMARTS matches in multiple ways, all are tried, and the best
> docking score among them wins (though there may be a way of requesting the
> N best scores, or even all of them). So if the SMARTS specifies a phenyl
> ring, for example, 12 initial poses will be tried. (If it contains two
> phenyl rings, 24 will be tried)
>
> 2. GLIDE itself does conformer generation, but I'm not sure how it works
> in this procedure. If the SMARTS specifies a rigid core, you probably don't
> need to pre-generate conformers, but if the core is flexible, you are
> probably best off generating them, which of course you are permitted to do.
>
> 3. If you have GLIDE, then  you probably have LigPrep as well. The
> advantage of using LigPrep for your conformation generation would be that
> the strain energy would be written into the output file, and then, when
> used as the input to Glide, it would be taken into account when computing
> the docking score. And it uses the same (or a very similar) force-field
> that Glide itself uses.
>
> 4. I may have some details slightly incorrect, so you might want to
> address your question to Schrödinger tech support.
>
> On Mon, Feb 20, 2017 at 2:15 PM, Brian Kelley 
> wrote:
>
>> I don't know the exact glide procedure, but I did write such a system for
>> OpenEye (POSIT).  The issue you are facing is that the RMSD portion is just
>> a constraint used for docking, it isn't used as the "score", in fact, it
>> can't tell if the conformation interpenetrates the active site or which
>> orientation is better.
>>
>> I believe RDKit can generate conformations with a template, see
>> AllChem.ConstrainedEmbed, this would solve half of your problem in creating
>> conformations that match your template.  You still have the problem with
>> scoring against your active site.  POSIT scored against the shape tanimoto
>> of the active ligands (if any) to try to fill the same space as the known
>> ligands. See rdkit.Chem.rdShapeHelpers.ShapeTanimotoDist
>>
>> This might not be what you want, but we had good success with similar
>> methods and virtual screening, especially when using multiple co-crystal
>> active sites.   I can send you a reference link if this interests you
>>
>> Cheers,
>>  Brian
>>
>> On Mon, Feb 20, 2017 at 12:17 PM, Thomas Evangelidis 
>> wrote:
>>
>>> ​
>>> Greg and Brian,
>>>
>>> Thank you for your useful hints. All the compounds that I want to align
>>> are supposed to belong to the same analogue series so they should share a
>>> common substructure of substantial size.
>>>
>>> What I want to emulate is the "core restrained docking" with glide,
>>> where you specify the common core of the query and the reference ligand
>>> using a SMARTS pattern and then glide docks the query compound to the
>>> binding pocket but takes care to overlay the core atoms of the query to the
>>> core atoms of the reference compound. Since RDKit does not do docking,
>>> I just generate 30 conformers of each query compound and select the best
>>> one by measuring the RMSD between the core of the query and the core of
>>> the reference after the alignment. Of course the conformations of the core
>>> atoms between the query and the reference are never identical hence the bad
>>> alignment. Is there any smarter way to emulate the "core restrained
>>> docking" with RDKit?
>>>
>>> I will provide you with more info soon (example sdf, results, etc.).
>>>
>>>
>>> ​
>>>
>>>
>>
>> 

Re: [Rdkit-discuss] problems with installation on conda with python 3.5 32-bit

2017-02-20 Thread Michal Krompiec
Hi Greg,
Thanks for your reply. Actually, >50% of my (prospective) users are stuck
on 32-bit. It would be really nice to have a python3 build (even once a
year) but I understand that the demand is low and waning. I guess the
solution is to use python2.7 for the time being...
Thanks and kind regards,
Michal

On Monday, 20 February 2017, Greg Landrum  wrote:

> Hi Michal,
>
> We've only ever done python2.7 builds for win32 and we stopped doing those
> with the 2016.03 release.
> I will have to check, but I think I probably can start doing these again,
> but I'm reluctant due to the amount of effort required.
> How many users do you need to support who are stuck on 32bit machines?
>
> -greg
>
>
> On Mon, Feb 20, 2017 at 2:18 PM, Michal Krompiec <
> michal.kromp...@gmail.com
> > wrote:
>
>> Hello,
>> I can't install rdkit on anaconda with 32-bit python3 on Windows 7.
>>
>> When I try "the usual", conda tries to install python2.7 into the
>> environment:
>>
>> >conda create -c rdkit -n my-rdkit-env rdkit
>> Fetching package metadata .
>> Solving package specifications: .
>> Package plan for installation in environment
>> C:\Anaconda3_32\envs\my-rdkit-env:
>> The following NEW packages will be INSTALLED:
>>     boost:          1.56.0-py27_3 rdkit
>>     bzip2:          1.0.6-vc9_3 [vc9]
>>     mkl:            2017.0.1-0
>>     numpy:          1.11.3-py27_0
>>     pip:            9.0.1-py27_1
>>     python:         2.7.13-0
>>     rdkit:          2016.03.1-np111py27_1 rdkit
>>     setuptools:     27.2.0-py27_1
>>     vs2008_runtime: 9.00.30729.5054-0
>>     wheel:          0.29.0-py27_0
>>     zlib:           1.2.8-vc9_3 [vc9]
>>
>> If I create an empty environment, load python 3.5 into it and try
>> installing rdkit, I get an error:
>>
>> >conda create -n my-rdkit-env python=3.5
>> Fetching package metadata ...
>> Solving package specifications: .
>> Package plan for installation in environment
>> C:\Anaconda3_32\envs\my-rdkit-env:
>> The following NEW packages will be INSTALLED:
>>     pip:            9.0.1-py35_1
>>     python:         3.5.2-0
>>     setuptools:     27.2.0-py35_1
>>     vs2015_runtime: 14.0.25123-0
>>     wheel:          0.29.0-py35_0
>> Proceed ([y]/n)?
>> #
>> # To activate this environment, use:
>> # > activate my-rdkit-env
>> #
>> # To deactivate this environment, use:
>> # > deactivate my-rdkit-env
>> #
>> # * for power-users using bash, you must source
>> #
>>
>> >conda install --name my-rdkit-env -f --channel
>> https://conda.anaconda.org/rdkit rdkit
>> Fetching package metadata .
>> Solving package specifications: .
>>
>> UnsatisfiableError: The following specifications were found to be in
>> conflict:
>>   - python 3.5*
>>   - rdkit -> python 2.7*
>> Use "conda info <package>" to see the dependencies for each package.
>>
>>
>> I managed to install rdkit without any problems on the same machine in
>> 64-bit anaconda with python3.5, but I need a separate 32-bit build to
>> support users with 32-bit machines. Any help will be appreciated.
>>
>> Thanks and best regards,
>>
>> Michal
>>
>>
>> 


Re: [Rdkit-discuss] aligning maximum common substructure of 2 molecules

2017-02-20 Thread Peter S. Shenkin
With Glide, IIRC, this facility is designed for the use case where the
coordinates of a docked ligand are known (typically from an X-ray
structure) and the docked ligand shares a SMARTS with the ligands in an
input file. The SMARTS-matching atoms of each incoming ligand are
superposed upon the corresponding atoms of the docked ligand and the
resulting pose is used as an initial guess for the docking.

Some notes:

0. Greg questions whether there is really a common core in your example,
and if there's not, it doesn't appear as if the procedure is directly
applicable. But if it is applicable, read on.

1. If the SMARTS matches in multiple ways, all are tried, and the best
docking score among them wins (though there may be a way of requesting the
N best scores, or even all of them). So if the SMARTS specifies a phenyl
ring, for example, 12 initial poses will be tried. (If it contains two
phenyl rings, 24 will be tried)

2. GLIDE itself does conformer generation, but I'm not sure how it works in
this procedure. If the SMARTS specifies a rigid core, you probably don't
need to pre-generate conformers, but if the core is flexible, you are
probably best off generating them, which of course you are permitted to do.

3. If you have GLIDE, then  you probably have LigPrep as well. The
advantage of using LigPrep for your conformation generation would be that
the strain energy would be written into the output file, and then, when
used as the input to Glide, it would be taken into account when computing
the docking score. And it uses the same (or a very similar) force-field
that Glide itself uses.

4. I may have some details slightly incorrect, so you might want to address
your question to Schrödinger tech support.
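
Editor's sketch for note 1 above: the count of 12 for a single phenyl ring can be reproduced with RDKit's own substructure matcher by switching off uniquification, which returns every atom ordering rather than one match per atom set. This shows RDKit behaviour only and says nothing about how Glide enumerates poses; the molecule is a made-up example.

```python
from rdkit import Chem

mol = Chem.MolFromSmiles('c1ccccc1CCN')   # made-up ligand with one phenyl ring
core = Chem.MolFromSmarts('c1ccccc1')

# Default: one match per distinct set of atoms.
unique = mol.GetSubstructMatches(core)
# uniquify=False: every mapping of query atoms onto the ring
# (6 starting atoms x 2 directions).
all_maps = mol.GetSubstructMatches(core, uniquify=False)
print(len(unique), len(all_maps))
```

Each of the mappings in `all_maps` is a candidate atom map for an RMSD-based overlay, so a symmetric core gives several starting alignments to compare.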

On Mon, Feb 20, 2017 at 2:15 PM, Brian Kelley  wrote:

> I don't know the exact glide procedure, but I did write such a system for
> OpenEye (POSIT).  The issue you are facing is that the RMSD portion is just
> a constraint used for docking, it isn't used as the "score", in fact, it
> can't tell if the conformation interpenetrates the active site or which
> orientation is better.
>
> I believe RDKit can generate conformations with a template, see
> AllChem.ConstrainedEmbed, this would solve half of your problem in creating
> conformations that match your template.  You still have the problem with
> scoring against your active site.  POSIT scored against the shape tanimoto
> of the active ligands (if any) to try to fill the same space as the known
> ligands. See rdkit.Chem.rdShapeHelpers.ShapeTanimotoDist
>
> This might not be what you want, but we had good success with similar
> methods and virtual screening, especially when using multiple co-crystal
> active sites.   I can send you a reference link if this interests you
>
> Cheers,
>  Brian
>
> On Mon, Feb 20, 2017 at 12:17 PM, Thomas Evangelidis 
> wrote:
>
>> ​
>> Greg and Brian,
>>
>> Thank you for your useful hints. All the compounds that I want to align
>> are supposed to belong to the same analogue series so they should share a
>> common substructure of substantial size.
>>
>> What I want to emulate is the "core restrained docking" with glide, where
>> you specify the common core of the query and the reference ligand using a
>> SMARTS pattern and then glide docks the query compound to the binding
>> pocket but takes care to overlay the core atoms of the query to the core
>> atoms of the reference compound. Since RDKit does not do docking, I just
>> generate 30 conformers of each query compound and select the best one by
>> measuring the RMSD between the core of the query and the core of the
>> reference after the alignment. Of course the conformations of the core
>> atoms between the query and the reference are never identical hence the bad
>> alignment. Is there any smarter way to emulate the "core restrained
>> docking" with RDKit?
>>
>> I will provide you with more info soon (example sdf, results, etc.).
>>
>>
>> ​
>>
>>
>
> 


Re: [Rdkit-discuss] aligning maximum common substructure of 2 molecules

2017-02-20 Thread Brian Kelley
I don't know the exact glide procedure, but I did write such a system for
OpenEye (POSIT).  The issue you are facing is that the RMSD portion is just
a constraint used for docking, it isn't used as the "score", in fact, it
can't tell if the conformation interpenetrates the active site or which
orientation is better.

I believe RDKit can generate conformations with a template, see
AllChem.ConstrainedEmbed, this would solve half of your problem in creating
conformations that match your template.  You still have the problem with
scoring against your active site.  POSIT scored against the shape tanimoto
of the active ligands (if any) to try to fill the same space as the known
ligands. See rdkit.Chem.rdShapeHelpers.ShapeTanimotoDist
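
Editor's sketch of the two RDKit calls named above, under stated assumptions: the SMILES are made up, an embedded conformer stands in for a crystal pose, and the whole reference ligand is used as the template. It illustrates the API calls and is not the POSIT procedure.

```python
from rdkit import Chem
from rdkit.Chem import AllChem, rdShapeHelpers

# Made-up reference ligand; its embedded pose stands in for a crystal pose.
ref = Chem.AddHs(Chem.MolFromSmiles('c1ccccc1CCN'))
AllChem.EmbedMolecule(ref, randomSeed=42)
core = Chem.RemoveHs(ref)  # here the whole reference acts as the template

# Made-up analogue that contains the template as a substructure.
probe = Chem.AddHs(Chem.MolFromSmiles('c1ccccc1CCNC(C)=O'))
# Embeds the probe with its template atoms tethered to the core coordinates
# and stores the core RMSD in the 'EmbedRMS' property.
AllChem.ConstrainedEmbed(probe, core, randomseed=42)
embed_rms = float(probe.GetProp('EmbedRMS'))

# Shape Tanimoto distance between the placed probe and the reference
# (0 means identical volumes, 1 means no overlap).
shape_dist = rdShapeHelpers.ShapeTanimotoDist(probe, ref)
print(round(embed_rms, 3), round(shape_dist, 3))
```

Calling `ConstrainedEmbed` with several different `randomseed` values yields several poses that share the core placement, which can then be ranked by the shape distance.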

This might not be what you want, but we had good success with similar
methods and virtual screening, especially when using multiple co-crystal
active sites.   I can send you a reference link if this interests you

Cheers,
 Brian

On Mon, Feb 20, 2017 at 12:17 PM, Thomas Evangelidis 
wrote:

> ​
> Greg and Brian,
>
> Thank you for your useful hints. All the compounds that I want to align
> are supposed to belong to the same analogue series so they should share a
> common substructure of substantial size.
>
> What I want to emulate is the "core restrained docking" with glide, where
> you specify the common core of the query and the reference ligand using a
> SMARTS pattern and then glide docks the query compound to the binding
> pocket but takes care to overlay the core atoms of the query to the core
> atoms of the reference compound. Since RDKit does not do docking, I just
> generate 30 conformers of each query compound and select the best one by
> measuring the RMSD between the core of the query and the core of the
> reference after the alignment. Of course the conformations of the core
> atoms between the query and the reference are never identical hence the bad
> alignment. Is there any smarter way to emulate the "core restrained
> docking" with RDKit?
>
> I will provide you with more info soon (example sdf, results, etc.).
>
>
> ​
>
>


Re: [Rdkit-discuss] aligning maximum common substructure of 2 molecules

2017-02-20 Thread Thomas Evangelidis
​
Greg and Brian,

Thank you for your useful hints. All the compounds that I want to align are
supposed to belong to the same analogue series so they should share a
common substructure of substantial size.

What I want to emulate is the "core restrained docking" with glide, where
you specify the common core of the query and the reference ligand using a
SMARTS pattern and then glide docks the query compound to the binding
pocket but takes care to overlay the core atoms of the query to the core
atoms of the reference compound. Since RDKit does not do docking, I just
generate 30 conformers of each query compound and select the best one by
measuring the RMSD between the core of the query and the core of the
reference after the alignment. Of course the conformations of the core
atoms between the query and the reference are never identical hence the bad
alignment. Is there any smarter way to emulate the "core restrained
docking" with RDKit?

I will provide you with more info soon (example sdf, results, etc.).


​


Re: [Rdkit-discuss] aligning maximum common substructure of 2 molecules

2017-02-20 Thread Brian Kelley
I believe (Greg can correct me) that to align the Bemis-Murcko scaffold, you
could either

(1) extract the original atom pairs and send them to the RMSD algorithm, or
(2) take a Bemis scaffold and convert it to a substructure query for use in
the RMSD algorithm.

In either case the RMSD is the RMSD of the scaffold atoms, not the rest of
the molecule.  Below is a little snippet that I believe does this.

from rdkit import Chem
from rdkit.Chem import AllChem
from rdkit.Chem import rdqueries, rdMolAlign

# mol_block: placeholder for your own mol block with 3D coordinates
mol = Chem.MolFromMolBlock(mol_block)
print(Chem.MolToSmiles(mol))

murcko = AllChem.MurckoDecompose(mol)
print("murcko", Chem.MolToSmiles(murcko))

# bemis scaffolds match everything
q = rdqueries.AtomNumGreaterQueryAtom(0)
bemis = Chem.RWMol(murcko)
for atom in bemis.GetAtoms():
    bemis.ReplaceAtom(atom.GetIdx(), q)

# the RMSD is computed over the scaffold atoms only
rmsd = rdMolAlign.AlignMol(bemis, mol)
print(rmsd)
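
Editor's sketch for the related question in this thread (fit on the scaffold, then move the whole molecule): one way is to use the scaffold as a substructure query on both parent molecules, so the atom map refers to parent atom indices, and then apply the resulting matrix to the full probe. The SMILES, seeds, and helper names are made up for illustration, and both molecules are assumed to share the same scaffold.

```python
from rdkit import Chem
from rdkit.Chem import AllChem, rdMolAlign, rdMolTransforms
from rdkit.Chem.Scaffolds import MurckoScaffold


def embedded(smiles, seed):
    """Hypothetical helper: heavy-atom molecule with one embedded conformer."""
    mol = Chem.AddHs(Chem.MolFromSmiles(smiles))
    AllChem.EmbedMolecule(mol, randomSeed=seed)
    return Chem.RemoveHs(mol)


def mapped_rmsd(prb, ref, atom_map):
    """RMSD over the mapped atom pairs, using the current coordinates."""
    pc, rc = prb.GetConformer(), ref.GetConformer()
    sq = sum((pc.GetAtomPosition(i) - rc.GetAtomPosition(j)).LengthSq()
             for i, j in atom_map)
    return (sq / len(atom_map)) ** 0.5


ref = embedded('CCc1ccc(O)cc1', 7)      # made-up reference
probe = embedded('NCc1ccc(Cl)cc1', 11)  # made-up analogue, same scaffold

# Query both parents with the scaffold: indices are parent atom indices.
scaffold = MurckoScaffold.GetScaffoldForMol(probe)
atom_map = list(zip(probe.GetSubstructMatch(scaffold),
                    ref.GetSubstructMatch(scaffold)))

# Fit on the scaffold atoms only, then move the whole probe with that matrix.
rmsd, transform = rdMolAlign.GetAlignmentTransform(probe, ref, atomMap=atom_map)
rdMolTransforms.TransformConformer(probe.GetConformer(), transform)
print(round(rmsd, 3), round(mapped_rmsd(probe, ref, atom_map), 3))
```

If the two printed numbers agree, the matrix was applied to the same conformer it was computed for; a mismatch between the conformer used for the fit and the one that is transformed leaves the probe far from the reference.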





On Mon, Feb 20, 2017 at 11:21 AM, Greg Landrum 
wrote:

> Hi Thomas,
>
> To be sure we're talking about the same thing: rdMolAlign.GetO3A() is an
> implementation of the Open3DAlign algorithm. This is an unsupervised
> approach that uses a clever algorithm to come up with an atom-atom mapping
> between the two molecules you give it. It's not always going to pick the
> same atoms to align that you would.
>
> To answer the original question: if the two molecules you want to align do
> not share the same scaffold (or at least have a lot in common in the core
> of the molecule), it's unlikely that an MCS-based alignment is going to
> help.
>
> To answer your direct question here, the scaffold finding code in
> rdkit.Chem should be preserving coordinates. Here's a simple demonstration
> of that:
>
> In [3]: m =Chem.AddHs(Chem.MolFromSmiles('CC1CO1'))
>
> In [4]: AllChem.EmbedMolecule(m,AllChem.ETKDG())
> Out[4]: 0
>
> In [5]: nh = Chem.RemoveHs(m)
>
> In [6]: print(Chem.MolToMolBlock(nh))
>
>  RDKit  3D
>
>   4  4  0  0  0  0  0  0  0  0999 V2000
>-1.18500.2738   -0.1814 C   0  0  0  0  0  0  0  0  0  0  0  0
> 0.0992   -0.4987   -0.2391 C   0  0  0  0  0  0  0  0  0  0  0  0
> 1.32290.12410.2290 C   0  0  0  0  0  0  0  0  0  0  0  0
> 0.6874   -0.75581.1165 O   0  0  0  0  0  0  0  0  0  0  0  0
>   1  2  1  0
>   2  3  1  0
>   3  4  1  0
>   4  2  1  0
> M  END
>
>
> In [7]: from rdkit.Chem import MurckoDecompose
>
> In [8]: scaff = Chem.MurckoDecompose(nh)
>
> In [10]: print(Chem.MolToMolBlock(scaff))
>
>  RDKit  3D
>
>   3  3  0  0  0  0  0  0  0  0999 V2000
> 0.0992   -0.4987   -0.2391 C   0  0  0  0  0  0  0  0  0  0  0  0
> 1.32290.12410.2290 C   0  0  0  0  0  0  0  0  0  0  0  0
> 0.6874   -0.75581.1165 O   0  0  0  0  0  0  0  0  0  0  0  0
>   1  2  1  0
>   2  3  1  0
>   3  1  1  0
> M  END
>
>
> You could help me provide a better answer here by providing a couple of
> example SDFs that you'd like to align, ideally together with a bit of
> sample code showing what you have tried that produces alignments you are
> unhappy with.
>
>
> -greg
>
>
>
>
> On Mon, Feb 20, 2017 at 1:54 PM, Thomas Evangelidis 
> wrote:
>
>> As a follow up question on this topic, I would like to ask if
>> MurckoScaffold.GetScaffoldForMol(mol) returns the scaffold of mol with
>> different coordinates?
>> I am asking this because when I use the transformation matrix of the
>> alignment of the cores of the probe and the reference molecules, in order
>> to align the whole probe to the reference molecule, the two molecules don't
>> seem to be aligned (they are in distance). Basically I do this:
>>
>> qcore = MurckoScaffold.GetScaffoldForMol(qmol)
>> refcore = MurckoScaffold.GetScaffoldForMol(refmol)
>> pyO3A = rdMolAlign.GetO3A(qcore, refcore, prbCid=qconfID, refCid=0,
>> reflect=True)
>> AllChem.TransformMol(qmol, bestRMSDTrans[1], confId=bestconfID,
>> keepConfs=False)
>>
>> and then I write the qmol in an sdf file. But when I visualize it the
>> qmol is far from the refmol!
>>
>>
>>
>>
>>
>> On 20 February 2017 at 02:33, Thomas Evangelidis 
>> wrote:
>>
>>> Dear all,
>>>
>>> I want to align 250 compounds that binding to the same pocket to one of
>>> the 9 available crystal ligands. I chose the reference ligand based on the
>>> Morgan2 similarity to the probe molecule. Then I align the 2 compounds
>>> using:
>>>
>>> pyO3A = rdMolAlign.GetO3A(qmol, refmol, prbCid=qconfID, refCid=0,
>>> reflect=True)
>>> RMSD = pyO3A.Align()
>>>
>>> ​and keep only the conformer of the probe with the lowest RMSD to the
>>> reference compound. However, the alignment looks terrible when I
>>> visualize it, so I would like to ask if there is any way to align the
>>> maximum common substructure only. I tried to align only the core of both
>>> molecules as defined by MurckoScaffold.GetScaffoldForMol(mol)​, but
>>> still the alignment looks bad. I have seen in the documentation how to find
>>> the maximum common substructure with rdFMCS.FindMCS but before I engage
>>> into 

Re: [Rdkit-discuss] aligning maximum common substructure of 2 molecules

2017-02-20 Thread Greg Landrum
Hi Thomas,

To be sure we're talking about the same thing: rdMolAlign.GetO3A() is an
implementation of the Open3DAlign algorithm. This is an unsupervised
approach that uses a clever algorithm to come up with an atom-atom mapping
between the two molecules you give it. It's not always going to pick the
same atoms to align that you would.
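
To see which atoms Open3DAlign actually paired up, the mapping can be inspected after the alignment. This is only a minimal sketch: the two SMILES, the random seeds and the `embed` helper are illustrative assumptions, not molecules from this thread.

```python
from rdkit import Chem
from rdkit.Chem import AllChem, rdMolAlign


def embed(smiles, seed):
    """Illustrative helper: add Hs and generate one 3D conformer."""
    mol = Chem.AddHs(Chem.MolFromSmiles(smiles))
    AllChem.EmbedMolecule(mol, randomSeed=seed)
    return mol


# Hypothetical reference and probe from one analogue series (indole core).
ref = embed('Cc1ccc2[nH]ccc2c1', 42)
prb = embed('CCOc1ccc2[nH]ccc2c1', 7)

# O3A chooses its own atom-atom mapping; MMFF properties are computed
# internally when none are passed in.
o3a = rdMolAlign.GetO3A(prb, ref)
rmsd = o3a.Align()        # moves the probe conformer onto the reference
matches = o3a.Matches()   # [probe atom index, reference atom index] pairs

print("RMSD over matched atoms:", rmsd)
print("O3A score:", o3a.Score())
for prb_idx, ref_idx in matches:
    print(prb.GetAtomWithIdx(prb_idx).GetSymbol(), prb_idx,
          "->", ref.GetAtomWithIdx(ref_idx).GetSymbol(), ref_idx)
```

If the printed pairs are not the atoms you expected to be superimposed, that is a hint that a supervised mapping (for example one derived from a shared substructure) may suit the problem better.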

To answer the original question: if the two molecules you want to align do
not share the same scaffold (or at least have a lot in common in the core
of the molecule), it's unlikely that an MCS-based alignment is going to
help.
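
For molecules that do share a core, an MCS-based alignment could look roughly like the sketch below. The SMILES, seeds, the `embed` helper and the chosen FindMCS options are assumptions made for illustration; the MCS is computed on hydrogen-free copies so that only heavy atoms drive the fit.

```python
from rdkit import Chem
from rdkit.Chem import AllChem, rdFMCS, rdMolAlign


def embed(smiles, seed):
    """Illustrative helper: add Hs and generate one 3D conformer."""
    mol = Chem.AddHs(Chem.MolFromSmiles(smiles))
    AllChem.EmbedMolecule(mol, randomSeed=seed)
    return mol


# Hypothetical reference ligand and probe with a common indole core.
refmol = embed('Cc1ccc2[nH]ccc2c1', 42)
qmol = embed('CCOc1ccc2[nH]ccc2c1', 7)

# Find the maximum common substructure of the heavy-atom graphs.
mcs = rdFMCS.FindMCS([Chem.RemoveHs(qmol), Chem.RemoveHs(refmol)],
                     ringMatchesRingOnly=True,
                     completeRingsOnly=True,
                     timeout=10)
patt = Chem.MolFromSmarts(mcs.smartsString)

# Both match tuples are ordered by pattern atom, so zipping them gives
# (probe atom index, reference atom index) pairs.
atom_map = list(zip(qmol.GetSubstructMatch(patt),
                    refmol.GetSubstructMatch(patt)))

# The whole probe is moved, but only the mapped atoms are fitted.
rmsd = rdMolAlign.AlignMol(qmol, refmol, atomMap=atom_map)
print(mcs.numAtoms, "MCS atoms, RMSD over them:", rmsd)
```

Note that GetSubstructMatch returns just one match; for symmetric cores it may be worth trying every match from GetSubstructMatches and keeping the one with the lowest RMSD.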

To answer your direct question here, the scaffold finding code in
rdkit.Chem should be preserving coordinates. Here's a simple demonstration
of that:

In [3]: m =Chem.AddHs(Chem.MolFromSmiles('CC1CO1'))

In [4]: AllChem.EmbedMolecule(m,AllChem.ETKDG())
Out[4]: 0

In [5]: nh = Chem.RemoveHs(m)

In [6]: print(Chem.MolToMolBlock(nh))

 RDKit  3D

  4  4  0  0  0  0  0  0  0  0999 V2000
   -1.1850    0.2738   -0.1814 C   0  0  0  0  0  0  0  0  0  0  0  0
    0.0992   -0.4987   -0.2391 C   0  0  0  0  0  0  0  0  0  0  0  0
    1.3229    0.1241    0.2290 C   0  0  0  0  0  0  0  0  0  0  0  0
    0.6874   -0.7558    1.1165 O   0  0  0  0  0  0  0  0  0  0  0  0
  1  2  1  0
  2  3  1  0
  3  4  1  0
  4  2  1  0
M  END


In [7]: from rdkit.Chem import MurckoDecompose

In [8]: scaff = Chem.MurckoDecompose(nh)

In [10]: print(Chem.MolToMolBlock(scaff))

 RDKit  3D

  3  3  0  0  0  0  0  0  0  0999 V2000
    0.0992   -0.4987   -0.2391 C   0  0  0  0  0  0  0  0  0  0  0  0
    1.3229    0.1241    0.2290 C   0  0  0  0  0  0  0  0  0  0  0  0
    0.6874   -0.7558    1.1165 O   0  0  0  0  0  0  0  0  0  0  0  0
  1  2  1  0
  2  3  1  0
  3  1  1  0
M  END
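
The same check can be done programmatically rather than by eye. The following is a small sketch (the random seed is an arbitrary assumption) that confirms every scaffold atom sits exactly on some atom of the parent molecule:

```python
from rdkit import Chem
from rdkit.Chem import AllChem

m = Chem.AddHs(Chem.MolFromSmiles('CC1CO1'))
AllChem.EmbedMolecule(m, randomSeed=42)
nh = Chem.RemoveHs(m)
scaff = Chem.MurckoDecompose(nh)

parent_conf = nh.GetConformer()
scaff_conf = scaff.GetConformer()

# For each scaffold atom, find the distance to the nearest parent atom.
# If coordinates are preserved, every one of these distances is zero.
max_shift = 0.0
for i in range(scaff.GetNumAtoms()):
    pos = scaff_conf.GetAtomPosition(i)
    nearest = min(pos.Distance(parent_conf.GetAtomPosition(j))
                  for j in range(nh.GetNumAtoms()))
    max_shift = max(max_shift, nearest)

print("largest displacement of a scaffold atom:", max_shift)
```

The nearest-atom comparison is used instead of a substructure match because the three-membered ring is symmetric as a query, so a match could legitimately swap the two carbons.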


You could help me provide a better answer here by providing a couple of
example SDFs that you'd like to align, ideally together with a bit of
sample code showing what you have tried that produces alignments you are
unhappy with.


-greg




On Mon, Feb 20, 2017 at 1:54 PM, Thomas Evangelidis 
wrote:

> As a follow-up question on this topic, I would like to ask whether
> MurckoScaffold.GetScaffoldForMol(mol) returns the scaffold of mol with
> different coordinates.
> I am asking because when I take the transformation matrix from the
> alignment of the cores of the probe and reference molecules and use it
> to align the whole probe to the reference molecule, the two molecules don't
> seem to be aligned (they are far apart). Basically I do this:
>
> qcore = MurckoScaffold.GetScaffoldForMol(qmol)
> refcore = MurckoScaffold.GetScaffoldForMol(refmol)
> pyO3A = rdMolAlign.GetO3A(qcore, refcore, prbCid=qconfID, refCid=0,
> reflect=True)
> AllChem.TransformMol(qmol, bestRMSDTrans[1], confId=bestconfID,
> keepConfs=False)
>
> and then I write qmol to an SDF file. But when I visualize it, qmol
> is far from refmol!
>
>
>
>
>
> On 20 February 2017 at 02:33, Thomas Evangelidis 
> wrote:
>
>> Dear all,
>>
>> I want to align 250 compounds that bind to the same pocket to one of
>> the 9 available crystal ligands. I chose the reference ligand based on the
>> Morgan2 similarity to the probe molecule. Then I align the 2 compounds
>> using:
>>
>> pyO3A = rdMolAlign.GetO3A(qmol, refmol, prbCid=qconfID, refCid=0,
>> reflect=True)
>> RMSD = pyO3A.Align()
>>
>> and keep only the conformer of the probe with the lowest RMSD to the
>> reference compound. However, the alignment looks terrible when I
>> visualize it, so I would like to ask if there is any way to align the
>> maximum common substructure only. I tried to align only the core of both
>> molecules as defined by MurckoScaffold.GetScaffoldForMol(mol), but
>> still the alignment looks bad. I have seen in the documentation how to find
>> the maximum common substructure with rdFMCS.FindMCS, but before I start
>> programming it I would like to know if there is any automatic way to
>> find it on the fly while aligning the 2 molecules.
>>
>>
>>
>> --
>>
>> ==
>>
>> Thomas Evangelidis
>>
>> Research Specialist
>> CEITEC - Central European Institute of Technology
>> Masaryk University
>> Kamenice 5/A35/1S081,
>> 62500 Brno, Czech Republic
>>
>> email: tev...@pharm.uoa.gr
>>
>>   teva...@gmail.com
>>
>>
>> website: https://sites.google.com/site/thomasevangelidishomepage/
>>
>>
>
>

Re: [Rdkit-discuss] aligning maximum common substructure of 2 molecules

2017-02-20 Thread Francois BERENGER
At least for the MCS calculation, there is an R package
for chemistry:

https://bioconductor.org/packages/release/bioc/vignettes/fmcsR/inst/doc/fmcsR.html


--
Check out the vibrant tech community on one of the world's most
engaging tech sites, SlashDot.org! http://sdm.link/slashdot
___
Rdkit-discuss mailing list
Rdkit-discuss@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/rdkit-discuss


Re: [Rdkit-discuss] problems with installation on conda with python 3.5 32-bit

2017-02-20 Thread Greg Landrum
Hi Michal,

We've only ever done python2.7 builds for win32 and we stopped doing those
with the 2016.03 release.
I will have to check, but I think I could probably start doing these again.
I'm reluctant, though, because of the amount of effort required.
How many users do you need to support who are stuck on 32bit machines?
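
If it helps with the count: a quick way for each user to report whether their interpreter is a 32-bit or 64-bit build is a few lines of standard-library Python. This is just a sketch; the `interpreter_bits` helper name is made up for illustration.

```python
import platform
import struct
import sys


def interpreter_bits():
    """Pointer size of the running interpreter in bits (32 or 64)."""
    return struct.calcsize("P") * 8


print("Python", platform.python_version(), "-", interpreter_bits(), "bit")
# Second, independent check: sys.maxsize exceeds 2**32 only on 64-bit builds.
print("64-bit interpreter:", sys.maxsize > 2**32)
```

Note that this reports the bitness of the Python build, which can be 32-bit even on a 64-bit operating system.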

-greg


On Mon, Feb 20, 2017 at 2:18 PM, Michal Krompiec 
wrote:

> Hello,
> I can't install rdkit on anaconda with 32-bit python3 on Windows 7.
>
> When I try "the usual", conda tries to install python2.7 into the
> environment:
>
> >conda create -c rdkit -n my-rdkit-env rdkit
> Fetching package metadata .
> Solving package specifications: .
> Package plan for installation in environment C:\Anaconda3_32\envs\my-rdkit-env:
> The following NEW packages will be INSTALLED:
>     boost:          1.56.0-py27_3         rdkit
>     bzip2:          1.0.6-vc9_3           [vc9]
>     mkl:            2017.0.1-0
>     numpy:          1.11.3-py27_0
>     pip:            9.0.1-py27_1
>     python:         2.7.13-0
>     rdkit:          2016.03.1-np111py27_1 rdkit
>     setuptools:     27.2.0-py27_1
>     vs2008_runtime: 9.00.30729.5054-0
>     wheel:          0.29.0-py27_0
>     zlib:           1.2.8-vc9_3           [vc9]
>
> If I create an empty environment, load python 3.5 into it and try
> installing rdkit, I get an error:
>
> >conda create -n my-rdkit-env python=3.5
> Fetching package metadata ...
> Solving package specifications: .
> Package plan for installation in environment C:\Anaconda3_32\envs\my-rdkit-env:
> The following NEW packages will be INSTALLED:
>     pip:            9.0.1-py35_1
>     python:         3.5.2-0
>     setuptools:     27.2.0-py35_1
>     vs2015_runtime: 14.0.25123-0
>     wheel:          0.29.0-py35_0
> Proceed ([y]/n)?
> #
> # To activate this environment, use:
> # > activate my-rdkit-env
> #
> # To deactivate this environment, use:
> # > deactivate my-rdkit-env
> #
> # * for power-users using bash, you must source
> #
>
> >conda install --name my-rdkit-env -f --channel https://conda.anaconda.org/rdkit rdkit
> Fetching package metadata .
> Solving package specifications: .
>
> UnsatisfiableError: The following specifications were found to be in
> conflict:
>   - python 3.5*
>   - rdkit -> python 2.7*
> Use "conda info <package>" to see the dependencies for each package.
>
>
> I managed to install rdkit without any problems on the same machine in
> 64-bit anaconda with python3.5, but I need a separate 32-bit build to
> support users with 32-bit machines. Any help will be appreciated.
>
> Thanks and best regards,
>
> Michal
>
>


Re: [Rdkit-discuss] aligning maximum common substructure of 2 molecules

2017-02-20 Thread Thomas Evangelidis
As a follow-up question on this topic, I would like to ask whether
MurckoScaffold.GetScaffoldForMol(mol) returns the scaffold of mol with
different coordinates.
I am asking because when I take the transformation matrix from the
alignment of the cores of the probe and reference molecules and use it
to align the whole probe to the reference molecule, the two molecules don't
seem to be aligned (they are far apart). Basically I do this:

qcore = MurckoScaffold.GetScaffoldForMol(qmol)
refcore = MurckoScaffold.GetScaffoldForMol(refmol)
pyO3A = rdMolAlign.GetO3A(qcore, refcore, prbCid=qconfID, refCid=0,
reflect=True)
AllChem.TransformMol(qmol, bestRMSDTrans[1], confId=bestconfID,
keepConfs=False)

and then I write qmol to an SDF file. But when I visualize it, qmol
is far from refmol!
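
For comparison, here is a minimal sketch of the intended workflow (fit on the core atoms only, then move the whole probe) using an explicit atom map instead of O3A. It is not a diagnosis of the code above: the SMILES, seeds and the `embed` helper are illustrative assumptions, and GetAlignmentTransform with a scaffold-derived atom map is swapped in for the O3A step.

```python
import math

from rdkit import Chem
from rdkit.Chem import AllChem, rdMolAlign, rdMolTransforms
from rdkit.Chem.Scaffolds import MurckoScaffold


def embed(smiles, seed):
    """Illustrative helper: add Hs and generate one 3D conformer."""
    mol = Chem.AddHs(Chem.MolFromSmiles(smiles))
    AllChem.EmbedMolecule(mol, randomSeed=seed)
    return mol


refmol = embed('Cc1ccc2[nH]ccc2c1', 42)   # hypothetical reference ligand
qmol = embed('CCOc1ccc2[nH]ccc2c1', 7)    # hypothetical probe

# Use the reference scaffold as a substructure query on both full molecules,
# so the atom indices refer to qmol and refmol rather than to separate cores.
core = MurckoScaffold.GetScaffoldForMol(Chem.RemoveHs(refmol))
atom_map = list(zip(qmol.GetSubstructMatch(core),
                    refmol.GetSubstructMatch(core)))

# Compute the best-fit transform for the core atoms; nothing is moved yet.
rmsd, tform = rdMolAlign.GetAlignmentTransform(qmol, refmol, atomMap=atom_map)

# Apply that 4x4 transform to every atom of the probe conformer.
rdMolTransforms.TransformConformer(qmol.GetConformer(), tform)

# Sanity check: recompute the core RMSD from the coordinates now stored
# in qmol; it should agree with the value returned by the fit.
qconf, rconf = qmol.GetConformer(), refmol.GetConformer()
sq = [qconf.GetAtomPosition(i).Distance(rconf.GetAtomPosition(j)) ** 2
      for i, j in atom_map]
check = math.sqrt(sum(sq) / len(sq))
print(rmsd, check)
```

Recomputing the RMSD from the stored coordinates is a cheap way to confirm that the transform that was applied is the one that came from the fit, and that it was applied to the intended conformer.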







-- 

==

Thomas Evangelidis

Research Specialist
CEITEC - Central European Institute of Technology
Masaryk University
Kamenice 5/A35/1S081,
62500 Brno, Czech Republic

email: tev...@pharm.uoa.gr

  teva...@gmail.com


website: https://sites.google.com/site/thomasevangelidishomepage/