Re: [Rdkit-discuss] Bug in AllChem.EmbedMultipleConfs pruning?

2016-12-22 Thread Peter S. Shenkin
Tri-anything groups can be considered one by one after the remaining heavy
atoms have been aligned. This turns a combinatorial explosion into a linear
algorithm for these groups. (Well, it would be linear in number of
tri-anything groups, but it gets more complicated if the anythings are more
than monatomic.)
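
A toy sketch of why the per-group treatment is linear, in plain Python and independent of RDKit. It assumes the heavy atoms are already superimposed and the alignment is then held fixed, so the total squared deviation is a sum of independent per-group terms. The function names and coordinates are made up for illustration.

```python
from itertools import permutations, product


def sq_dist(a, b):
    """Squared Euclidean distance between two 3D points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def group_cost(ref_group, probe_group, perm):
    """Summed squared deviation when ref atom i is matched to probe atom perm[i]."""
    return sum(sq_dist(ref_group[i], probe_group[p]) for i, p in enumerate(perm))


def best_per_group(ref_groups, probe_groups):
    """Pick the best permutation for each group on its own (6 per tri-X group)."""
    total, evaluations = 0.0, 0
    for ref, probe in zip(ref_groups, probe_groups):
        costs = [group_cost(ref, probe, perm)
                 for perm in permutations(range(len(ref)))]
        evaluations += len(costs)
        total += min(costs)
    return total, evaluations


def best_brute_force(ref_groups, probe_groups):
    """Enumerate every combination of per-group permutations (6**n of them)."""
    perms = [list(permutations(range(len(g)))) for g in ref_groups]
    best, evaluations = float("inf"), 0
    for combo in product(*perms):
        evaluations += 1
        cost = sum(group_cost(r, p, perm)
                   for r, p, perm in zip(ref_groups, probe_groups, combo))
        best = min(best, cost)
    return best, evaluations


# Three hypothetical tri-X groups; the probe lists the same atoms in a
# different order, displaced by 0.1 along one axis.
ref_groups = [
    [(1.0, 0.0, 0.0), (0.0, 1.0, 0.0), (0.0, 0.0, 1.0)],
    [(4.0, 0.0, 0.0), (3.0, 1.0, 0.0), (3.0, 0.0, 1.0)],
    [(7.0, 0.0, 0.0), (6.0, 1.0, 0.0), (6.0, 0.0, 1.0)],
]
probe_groups = [
    [(0.1, 1.0, 0.0), (0.1, 0.0, 1.0), (1.1, 0.0, 0.0)],
    [(3.0, 0.1, 1.0), (4.0, 0.1, 0.0), (3.0, 1.1, 0.0)],
    [(7.0, 0.0, 0.1), (6.0, 1.0, 0.1), (6.0, 0.0, 1.1)],
]

linear_cost, linear_evals = best_per_group(ref_groups, probe_groups)
brute_cost, brute_evals = best_brute_force(ref_groups, probe_groups)
print(linear_evals, brute_evals)  # 18 versus 216 permutation evaluations
```

Both searches reach the same minimum here because the cost is separable once the alignment is fixed. If the alignment were re-optimized for each mapping, the groups would no longer be independent.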

This would matter from the point of view of RMSD if binding conformations
were being compared to each other or to free molecules, or when free
molecules were being compared to each other in a situation where steric
hindrance affects some tri-something group differently among different
conformations. Considering the tri-anything groups would factor in
significant deviations from the local equilibrium geometry.

It would also matter in atomic mappings if you really wanted to know which
hydrogen (or anything else) in a conformer aligns to a particular
tri-anything bond in a reference structure. For example, you might have a
methyl group in the reference structure where one H points into a pocket in
an active site and, in a series of analogs, you want to try substituting
some R group or groups for the corresponding H.

-P.

On Dec 22, 2016 11:38 AM, "Brian Cole"  wrote:

> RMSD with automorphism symmetries with hydrogens is crazy expensive to
> calculate. Symmetry should be on by default, but without hydrogens. Would
> even love to see the RMSD automorphism symmetry code ignore trifluoro-type
> groups too as they dramatically increase the cost of the computation with
> little added value.
>
> On Thu, Dec 22, 2016 at 10:27 AM, Greg Landrum 
> wrote:
>
>>
>> On Thu, Dec 22, 2016 at 4:06 PM, JW Feng  wrote:
>>
>>>
>>> Thanks for confirming the bug.  I also vote for changing the code to use
>>> only heavy atoms.  Is symmetry taken into consideration when calculating
>>> RMS during the pruning step?
>>>
>>
>> Symmetry is not taken into account. Once the code to do that is available
>> in C++ (Peter Gedeck is working on this), we'll add that option too.
>>
>> -greg
>>
>>
>>
>> 
>> --
>> Developer Access Program for Intel Xeon Phi Processors
>> Access to Intel Xeon Phi processor-based developer platforms.
>> With one year of Intel Parallel Studio XE.
>> Training and support from Colfax.
>> Order your platform today.http://sdm.link/intel
>> ___
>> Rdkit-discuss mailing list
>> Rdkit-discuss@lists.sourceforge.net
>> https://lists.sourceforge.net/lists/listinfo/rdkit-discuss
>>
>>
>


Re: [Rdkit-discuss] Bug in AllChem.EmbedMultipleConfs pruning?

2016-12-22 Thread Brian Kelley
Missed the swap == swap with same type.

There is probably some moment-based heuristic you could use to check for bad outliers.


Brian Kelley

> On Dec 22, 2016, at 11:49 AM, Greg Landrum  wrote:
> 
> 
>> On Thu, Dec 22, 2016 at 5:37 PM, Brian Cole  wrote:
>> RMSD with automorphism symmetries with hydrogens is crazy expensive to
>> calculate. Symmetry should be on by default, but without hydrogens. Would
>> even love to see the RMSD automorphism symmetry code ignore trifluoro-type
>> groups too as they dramatically increase the cost of the computation with
>> little added value.
> 
> Ignoring the Hs with "getBestRMS" is certainly a must. The CF3s are also a 
> good idea.
> Maybe it would make sense to have an option to ignore isomorphisms that only 
> differ by swapping degree 1 atoms.
> 
> -greg
>  


Re: [Rdkit-discuss] Bug in AllChem.EmbedMultipleConfs pruning?

2016-12-22 Thread Brian Kelley
So no halogens?   That seems... wrong.


Brian Kelley

> On Dec 22, 2016, at 11:49 AM, Greg Landrum  wrote:
> 
> 
>> On Thu, Dec 22, 2016 at 5:37 PM, Brian Cole  wrote:
>> RMSD with automorphism symmetries with hydrogens is crazy expensive to
>> calculate. Symmetry should be on by default, but without hydrogens. Would
>> even love to see the RMSD automorphism symmetry code ignore trifluoro-type
>> groups too as they dramatically increase the cost of the computation with
>> little added value.
> 
> Ignoring the Hs with "getBestRMS" is certainly a must. The CF3s are also a 
> good idea.
> Maybe it would make sense to have an option to ignore isomorphisms that only 
> differ by swapping degree 1 atoms.
> 
> -greg
>  


Re: [Rdkit-discuss] Bug in AllChem.EmbedMultipleConfs pruning?

2016-12-22 Thread Greg Landrum
On Thu, Dec 22, 2016 at 5:37 PM, Brian Cole  wrote:

> RMSD with automorphism symmetries with hydrogens is crazy expensive to
> calculate. Symmetry should be on by default, but without hydrogens. Would
> even love to see the RMSD automorphism symmetry code ignore trifluoro-type
> groups too as they dramatically increase the cost of the computation with
> little added value.
>

Ignoring the Hs with "getBestRMS" is certainly a must. The CF3s are also a
good idea.
Maybe it would make sense to have an option to ignore isomorphisms that
only differ by swapping degree 1 atoms.
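
A rough sketch of what such an option could look like, in plain Python rather than RDKit code. It assumes the symmetry mappings are already available as tuples of atom indices and that the heavy-atom degree of each atom is known. The helper name `drop_terminal_swaps` and the toy molecule are hypothetical.

```python
from itertools import permutations


def drop_terminal_swaps(mappings, degrees):
    """Keep one mapping per distinct assignment of the non-terminal atoms.

    Two mappings that differ only in how degree-1 atoms are assigned
    collapse to the same key, so only the first of them is kept.
    """
    core = [i for i, d in enumerate(degrees) if d > 1]
    seen = set()
    kept = []
    for mapping in mappings:
        key = tuple(mapping[i] for i in core)
        if key not in seen:
            seen.add(key)
            kept.append(mapping)
    return kept


# Toy heavy-atom graph of 2,2,2-trifluoroethanol: O0-C1-C2(F3)(F4)(F5).
degrees = [1, 2, 4, 1, 1, 1]

# The six symmetry mappings permute the three fluorines among themselves.
mappings = [(0, 1, 2) + perm for perm in permutations((3, 4, 5))]

kept = drop_terminal_swaps(mappings, degrees)
print(len(mappings), len(kept))  # 6 mappings reduced to 1
```

This only removes mappings that are redundant with respect to the non-terminal atoms. Which terminal atom lands where is then left to the one mapping that survives.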

-greg


Re: [Rdkit-discuss] Bug in AllChem.EmbedMultipleConfs pruning?

2016-12-22 Thread Brian Cole
RMSD with automorphism symmetries with hydrogens is crazy expensive to
calculate. Symmetry should be on by default, but without hydrogens. Would
even love to see the RMSD automorphism symmetry code ignore trifluoro-type
groups too as they dramatically increase the cost of the computation with
little added value.

On Thu, Dec 22, 2016 at 10:27 AM, Greg Landrum 
wrote:

>
> On Thu, Dec 22, 2016 at 4:06 PM, JW Feng  wrote:
>
>>
>> Thanks for confirming the bug.  I also vote for changing the code to use
>> only heavy atoms.  Is symmetry taken into consideration when calculating
>> RMS during the pruning step?
>>
>
> Symmetry is not taken into account. Once the code to do that is available
> in C++ (Peter Gedeck is working on this), we'll add that option too.
>
> -greg
>
>
>
> 


Re: [Rdkit-discuss] Bug in AllChem.EmbedMultipleConfs pruning?

2016-12-22 Thread JW Feng
Hi Greg and Sereina,

Thanks for confirming the bug.  I also vote for changing the code to use
only heavy atoms.  Is symmetry taken into consideration when calculating
RMS during the pruning step?

Best,

JW

___
JW Feng, Ph.D.
Denali Therapeutics Inc.
151 Oyster Point Blvd, 2nd Floor, South San Francisco, CA 94080 | (650)
270-0628

On Thu, Dec 22, 2016 at 7:02 AM, Sereina  wrote:

> Hi Greg,
>
> I would also vote for changing the code such that only heavy atoms are
> used in the RMS calculation.
>
> Best,
> Sereina
>
>
> On 22 Dec 2016, at 13:36, Greg Landrum  wrote:
>
> Hi JW,
>
> On Wed, Dec 21, 2016 at 11:57 PM, JW Feng  wrote:
>
>>
>> I am using AllChem.EmbedMultipleConfs to generate conformers.  I noticed
>> that conformers in the result set are very similar to each other.  I wrote
>> a test script to calculate RMS for the conformers and may have found a
>> bug.  Looks like AllChem.EmbedMultipleConfs is calculating RMS using all
>> atoms, including Hs, when pruning.  The documentation says pruning is based on
>> heavy-atom RMS.
>>
>
> You're absolutely correct. The code uses all atoms, but the documentation
> says it only uses heavy atoms.
> So there's either a bug in the documentation or in the code. Here's the
> github entry: https://github.com/rdkit/rdkit/issues/1227
>
> I believe the right thing to do is change the code, which will lead to
> different results from the embedding, but I will hold off on making the fix
> to see if any discussion materializes either here or on github.
>
>
>
>> Attached is my test script and an input file that illustrates the
>> problem.  In this script, 50 conformers are generated and pruneRmsThresh is
>> 0.5.  Pairwise RMS values between conformers are >0.5 when H atoms are
>> included.  Pairwise RMS values are <0.5 for many conformers when only heavy
>> atoms are included.
>>
>
> Thanks for the detailed report and script to reproduce the problem!
>
> -greg
>
> 


Re: [Rdkit-discuss] Bug in AllChem.EmbedMultipleConfs pruning?

2016-12-22 Thread Greg Landrum
On Thu, Dec 22, 2016 at 4:06 PM, JW Feng  wrote:

>
> Thanks for confirming the bug.  I also vote for changing the code to use
> only heavy atoms.  Is symmetry taken into consideration when calculating
> RMS during the pruning step?
>

Symmetry is not taken into account. Once the code to do that is available
in C++ (Peter Gedeck is working on this), we'll add that option too.
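
To illustrate what taking symmetry into account means for an RMS value, here is a minimal plain-Python sketch (not the planned C++ code). It assumes the structures are already superimposed and that the list of symmetry-equivalent atom mappings is supplied by the caller. The function names and coordinates are made up.

```python
import math


def rms(ref, probe, mapping):
    """RMS deviation when ref atom i is compared with probe atom mapping[i]."""
    total = sum(
        sum((x - y) ** 2 for x, y in zip(ref[i], probe[j]))
        for i, j in enumerate(mapping)
    )
    return math.sqrt(total / len(mapping))


def best_rms(ref, probe, mappings):
    """Symmetry-aware RMS: the minimum over all equivalent atom mappings."""
    return min(rms(ref, probe, m) for m in mappings)


# A central atom (index 0) with two equivalent substituents (indices 1, 2).
ref = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (-1.0, 0.0, 0.0)]
# Same geometry, but the two equivalent atoms are listed in swapped order.
probe = [(0.0, 0.0, 0.0), (-1.0, 0.0, 0.0), (1.0, 0.0, 0.0)]

identity = (0, 1, 2)
swapped = (0, 2, 1)

naive = rms(ref, probe, identity)
symmetric = best_rms(ref, probe, [identity, swapped])
print(naive, symmetric)
```

Without the swapped mapping the two identical geometries look about 1.6 apart, so a pruning step that ignores symmetry would keep both.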

-greg


Re: [Rdkit-discuss] Bug in AllChem.EmbedMultipleConfs pruning?

2016-12-22 Thread Sereina
Hi Greg,

I would also vote for changing the code such that only heavy atoms are used in 
the RMS calculation.

Best,
Sereina


On 22 Dec 2016, at 13:36, Greg Landrum  wrote:

> Hi JW,
> 
> On Wed, Dec 21, 2016 at 11:57 PM, JW Feng  wrote:
> 
> I am using AllChem.EmbedMultipleConfs to generate conformers.  I noticed that 
> conformers in the result set are very similar to each other.  I wrote a test 
> script to calculate RMS for the conformers and may have found a bug.  Looks 
> like AllChem.EmbedMultipleConfs is calculating RMS using all atoms, including 
> Hs, when pruning.  The documentation says pruning is based on heavy-atom RMS.
> 
> You're absolutely correct. The code uses all atoms, but the documentation 
> says it only uses heavy atoms.
> So there's either a bug in the documentation or in the code. Here's the 
> github entry: https://github.com/rdkit/rdkit/issues/1227
> 
> I believe the right thing to do is change the code, which will lead to 
> different results from the embedding, but I will hold off on making the fix 
> to see if any discussion materializes either here or on github.
> 
>  
> Attached is my test script and an input file that illustrates the problem.  
> In this script, 50 conformers are generated and pruneRmsThresh is 0.5.  
> Pairwise RMS values between conformers are >0.5 when H atoms are included.  
> Pairwise RMS values are <0.5 for many conformers when only heavy atoms are included.
> 
> Thanks for the detailed report and script to reproduce the problem!
> 
> -greg
>  


Re: [Rdkit-discuss] Bug in AllChem.EmbedMultipleConfs pruning?

2016-12-22 Thread Greg Landrum
Hi JW,

On Wed, Dec 21, 2016 at 11:57 PM, JW Feng  wrote:

>
> I am using AllChem.EmbedMultipleConfs to generate conformers.  I noticed
> that conformers in the result set are very similar to each other.  I wrote
> a test script to calculate RMS for the conformers and may have found a
> bug.  Looks like AllChem.EmbedMultipleConfs is calculating RMS using all
> atoms, including Hs, when pruning.  The documentation says pruning is based on
> heavy-atom RMS.
>

You're absolutely correct. The code uses all atoms, but the documentation
says it only uses heavy atoms.
So there's either a bug in the documentation or in the code. Here's the
github entry: https://github.com/rdkit/rdkit/issues/1227

I believe the right thing to do is change the code, which will lead to
different results from the embedding, but I will hold off on making the fix
to see if any discussion materializes either here or on github.



> Attached is my test script and an input file that illustrates the
> problem.  In this script, 50 conformers are generated and pruneRmsThresh is
> 0.5.  Pairwise RMS values between conformers are >0.5 when H atoms are
> included.  Pairwise RMS values are <0.5 for many conformers when only heavy
> atoms are included.
>

Thanks for the detailed report and script to reproduce the problem!
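
The effect described in the report can be reproduced with a toy example in plain Python. This is not the attached test script and it makes no RDKit calls. It assumes the two conformers are already aligned, and the coordinates are invented so that only a methyl-like rotation separates them.

```python
import math


def rms(coords_a, coords_b, atom_indices):
    """RMS deviation over the selected atoms, with no realignment."""
    total = sum(
        sum((x - y) ** 2 for x, y in zip(coords_a[i], coords_b[i]))
        for i in atom_indices
    )
    return math.sqrt(total / len(atom_indices))


elements = ["C", "C", "H", "H", "H"]

# Two hypothetical conformers: the carbons barely move, the hydrogens
# are rotated about the C-C axis.
conf_a = [(0.0, 0.0, 0.0), (1.5, 0.0, 0.0),
          (-0.5, 1.0, 0.0), (-0.5, -0.5, 0.9), (-0.5, -0.5, -0.9)]
conf_b = [(0.1, 0.0, 0.0), (1.5, 0.1, 0.0),
          (-0.5, -1.0, 0.0), (-0.5, 0.5, -0.9), (-0.5, 0.5, 0.9)]

all_atoms = list(range(len(elements)))
heavy_atoms = [i for i, e in enumerate(elements) if e != "H"]

all_atom_rms = rms(conf_a, conf_b, all_atoms)
heavy_atom_rms = rms(conf_a, conf_b, heavy_atoms)
threshold = 0.5  # same value as pruneRmsThresh in the report

print(f"all atoms:   {all_atom_rms:.3f}")
print(f"heavy atoms: {heavy_atom_rms:.3f}")
```

With a threshold of 0.5, an all-atom comparison keeps both conformers, while a heavy-atom comparison treats the second as a duplicate of the first.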

-greg