Re: [Rdkit-discuss] Bug in AllChem.EmbedMultipleConfs pruning?
Tri-anything groups can be considered one by one after the remaining heavy atoms have been aligned. This turns a combinatorial explosion into a linear algorithm for these groups. (Well, it would be linear in the number of tri-anything groups, but it gets more complicated if the anythings are more than monatomic.)

This would matter from the point of view of RMSD if binding conformations were being compared to each other or to free molecules, or when free molecules were being compared to each other in a situation where steric hindrance affects some tri-something group differently among different conformations. Considering the tri-anything groups would factor in significant deviations from the local equilibrium geometry.

It would also matter in atomic mappings if you really wanted to know which hydrogen (or anything else) in a conformer aligns with a particular tri-anything bond in a reference structure. For example, you might have a methyl group in the reference structure where one H points into a pocket in an active site and, in a series of analogs, you want to try substituting some R group or groups for the corresponding H.

-P.

On Dec 22, 2016 11:38 AM, "Brian Cole" wrote:
> RMSD with automorphism symmetries is crazy expensive to calculate when
> hydrogens are included. Symmetry should be on by default, but without
> hydrogens. I would even love to see the RMSD automorphism symmetry code
> ignore trifluoro-type groups too, as they dramatically increase the cost
> of the computation with little added value.

--
Rdkit-discuss mailing list
Rdkit-discuss@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/rdkit-discuss
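For readers following the thread, the per-group idea in the message above can be sketched in plain Python. This is only an illustration under stated assumptions, not RDKit code: it assumes the heavy atoms are already aligned, that each tri-anything group is available as three coordinates in the reference and in the probe, and the helper names (`best_group_assignment`, `assign_groups`) and all coordinates are made up for the example.

```python
from itertools import permutations


def sq_dist(a, b):
    """Squared Euclidean distance between two 3D points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def best_group_assignment(ref_pts, probe_pts):
    """Try every ordering of one group's atoms and keep the cheapest.

    Returns (cost, perm), where perm[i] is the index of the probe point
    matched to reference point i. For a tri-anything group this is only
    3! = 6 candidates.
    """
    best = None
    for perm in permutations(range(len(probe_pts))):
        cost = sum(sq_dist(ref_pts[i], probe_pts[j]) for i, j in enumerate(perm))
        if best is None or cost < best[0]:
            best = (cost, perm)
    return best


def assign_groups(ref_groups, probe_groups):
    """Resolve each group independently (hypothetical helper).

    The heavy-atom alignment is assumed fixed, so the squared deviations
    of different groups simply add and each group can be minimised alone.
    """
    total = 0.0
    perms = []
    for ref_pts, probe_pts in zip(ref_groups, probe_groups):
        cost, perm = best_group_assignment(ref_pts, probe_pts)
        total += cost
        perms.append(perm)
    return total, perms


# Two made-up CX3 groups; the probe lists the same positions in a rotated order.
ref_groups = [
    [(1.0, 0.0, 0.0), (0.0, 1.0, 0.0), (0.0, 0.0, 1.0)],
    [(5.0, 0.0, 0.0), (4.0, 1.0, 0.0), (4.0, 0.0, 1.0)],
]
probe_groups = [
    [(0.0, 1.0, 0.0), (0.0, 0.0, 1.0), (1.0, 0.0, 0.0)],
    [(4.0, 0.0, 1.0), (5.0, 0.0, 0.0), (4.0, 1.0, 0.0)],
]
total_cost, group_perms = assign_groups(ref_groups, probe_groups)
print(total_cost, group_perms)
```

With n such groups this evaluates 6n orderings instead of the 6^n combinations a joint search would visit. As the message notes, things get more complicated when the substituents are more than single atoms.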
Re: [Rdkit-discuss] Bug in AllChem.EmbedMultipleConfs pruning?
Missed that swap == swap with the same type. There is probably some moment-based heuristic you can use to check for bad outliers.

Brian Kelley

> On Dec 22, 2016, at 11:49 AM, Greg Landrum wrote:
>
> Ignoring the Hs with "getBestRMS" is certainly a must. The CF3s are also a
> good idea.
> Maybe it would make sense to have an option to ignore isomorphisms that only
> differ by swapping degree 1 atoms.
>
> -greg
Re: [Rdkit-discuss] Bug in AllChem.EmbedMultipleConfs pruning?
So no halogens? That seems... wrong.

Brian Kelley

> On Dec 22, 2016, at 11:49 AM, Greg Landrum wrote:
>
> Ignoring the Hs with "getBestRMS" is certainly a must. The CF3s are also a
> good idea.
> Maybe it would make sense to have an option to ignore isomorphisms that only
> differ by swapping degree 1 atoms.
>
> -greg
Re: [Rdkit-discuss] Bug in AllChem.EmbedMultipleConfs pruning?
On Thu, Dec 22, 2016 at 5:37 PM, Brian Cole wrote:
> RMSD with automorphism symmetries is crazy expensive to calculate when
> hydrogens are included. Symmetry should be on by default, but without
> hydrogens. I would even love to see the RMSD automorphism symmetry code
> ignore trifluoro-type groups too, as they dramatically increase the cost
> of the computation with little added value.

Ignoring the Hs with "getBestRMS" is certainly a must. The CF3s are also a good idea.
Maybe it would make sense to have an option to ignore isomorphisms that only differ by swapping degree 1 atoms.

-greg
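The option floated above (ignore isomorphisms that only differ by swapping degree 1 atoms) could look roughly like the following filter. This is a plain-Python sketch, not an existing RDKit option: it assumes each isomorphism is given as a tuple where position i holds the target atom index for query atom i, and the function name `collapse_degree1_swaps` and the atom indices are hypothetical.

```python
from itertools import permutations


def collapse_degree1_swaps(matches, degree1_atoms):
    """Keep one mapping per distinct assignment of the non-degree-1 atoms.

    Two mappings that agree on every atom outside `degree1_atoms` differ
    only by swapping degree 1 atoms, so only the first one seen is kept.
    """
    seen = set()
    kept = []
    for match in matches:
        key = tuple(t for i, t in enumerate(match) if i not in degree1_atoms)
        if key not in seen:
            seen.add(key)
            kept.append(match)
    return kept


# Made-up fragment: atoms 0 and 1 are backbone carbons, atoms 2-4 are the
# three fluorines of a CF3 group attached to atom 1.
matches = [(0, 1) + p for p in permutations((2, 3, 4))]
kept = collapse_degree1_swaps(matches, degree1_atoms={2, 3, 4})
print(len(matches), kept)
```

Note that a filter like this also drops swaps of terminal heavy atoms, which is the trade-off being discussed for CF3 groups.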
Re: [Rdkit-discuss] Bug in AllChem.EmbedMultipleConfs pruning?
RMSD with automorphism symmetries is crazy expensive to calculate when hydrogens are included. Symmetry should be on by default, but without hydrogens. I would even love to see the RMSD automorphism symmetry code ignore trifluoro-type groups too, as they dramatically increase the cost of the computation with little added value.

On Thu, Dec 22, 2016 at 10:27 AM, Greg Landrum wrote:
> Symmetry is not taken into account. Once the code to do that is available
> in C++ (Peter Gedeck is working on this), we'll add that option too.
>
> -greg
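To give a feel for why hydrogens and trifluoro-type groups make the symmetry search so expensive, here is a back-of-the-envelope count in plain Python. It is a sketch under a simplifying assumption: k identical degree 1 atoms on the same center can be permuted in k! ways in the molecular graph, and other sources of symmetry are ignored, so the result is only the multiplier these groups contribute. The function name `degree1_swap_count` is made up for the example.

```python
from math import factorial


def degree1_swap_count(group_sizes):
    """Number of atom mappings generated purely by permuting identical
    degree 1 atoms within each group (hypothetical helper)."""
    count = 1
    for k in group_sizes:
        count *= factorial(k)
    return count


# Four methyl groups with explicit Hs: 6 * 6 * 6 * 6 mappings.
with_hydrogens = degree1_swap_count([3, 3, 3, 3])
# Same molecule with Hs removed but one CF3 group left in: 6 mappings.
heavy_atoms_only = degree1_swap_count([3])
print(with_hydrogens, heavy_atoms_only)
```

Each mapping requires its own RMSD evaluation, so the count multiplies directly into the cost of a symmetry-aware comparison.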
Re: [Rdkit-discuss] Bug in AllChem.EmbedMultipleConfs pruning?
Hi Greg and Sereina,

Thanks for confirming the bug. I also vote for changing the code to use only heavy atoms. Is symmetry taken into consideration when calculating RMS during the pruning step?

Best,
JW

JW Feng, Ph.D.
Denali Therapeutics Inc.

On Thu, Dec 22, 2016 at 7:02 AM, Sereina wrote:
> Hi Greg,
>
> I would also vote for changing the code such that only heavy atoms are
> used in the RMS calculation.
>
> Best,
> Sereina
Re: [Rdkit-discuss] Bug in AllChem.EmbedMultipleConfs pruning?
On Thu, Dec 22, 2016 at 4:06 PM, JW Feng wrote:
> Thanks for confirming the bug. I also vote for changing the code to use
> only heavy atoms. Is symmetry taken into consideration when calculating
> RMS during the pruning step?

Symmetry is not taken into account. Once the code to do that is available in C++ (Peter Gedeck is working on this), we'll add that option too.

-greg
Re: [Rdkit-discuss] Bug in AllChem.EmbedMultipleConfs pruning?
Hi Greg,

I would also vote for changing the code such that only heavy atoms are used in the RMS calculation.

Best,
Sereina

On 22 Dec 2016, at 13:36, Greg Landrum wrote:
> You're absolutely correct. The code uses all atoms, but the documentation
> says it only uses heavy atoms.
> So there's either a bug in the documentation or in the code. Here's the
> github entry: https://github.com/rdkit/rdkit/issues/1227
>
> I believe the right thing to do is change the code, which will lead to
> different results from the embedding, but I will hold off on making the fix
> to see if any discussion materializes either here or on github.
>
> -greg
Re: [Rdkit-discuss] Bug in AllChem.EmbedMultipleConfs pruning?
Hi JW,

On Wed, Dec 21, 2016 at 11:57 PM, JW Feng wrote:
> I am using AllChem.EmbedMultipleConfs to generate conformers. I noticed
> that conformers in the result set are very similar to each other. I wrote
> a test script to calculate RMS for the conformers and may have found a
> bug. It looks like AllChem.EmbedMultipleConfs is calculating RMS using all
> atoms, including Hs, when pruning. The documentation says pruning is based
> on heavy-atom RMS.

You're absolutely correct. The code uses all atoms, but the documentation says it only uses heavy atoms.
So there's either a bug in the documentation or in the code. Here's the github entry: https://github.com/rdkit/rdkit/issues/1227

I believe the right thing to do is change the code, which will lead to different results from the embedding, but I will hold off on making the fix to see if any discussion materializes either here or on github.

> Attached is my test script and an input file that illustrates the
> problem. In this script, 50 conformers are generated and pruneRmsThresh is
> 0.5. Pairwise RMS values between conformers are >0.5 when H atoms are
> included. Pairwise RMS values are <0.5 for many conformers when only heavy
> atoms are included.

Thanks for the detailed report and script to reproduce the problem!

-greg
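The effect described in the report can be reproduced on a toy example without RDKit. The sketch below is an illustration only: the four-atom "molecule", its coordinates, and the `rms` helper are made up, and unlike a real pruning step it does not re-align the two conformers before measuring. It only shows how two conformers can sit above a 0.5 threshold when hydrogens are counted and below it when they are not.

```python
from math import sqrt


def rms(coords_a, coords_b, indices):
    """RMS deviation over the chosen atom indices, with no re-alignment."""
    total = sum(
        (xa - xb) ** 2
        for i in indices
        for xa, xb in zip(coords_a[i], coords_b[i])
    )
    return sqrt(total / len(indices))


# Made-up conformers: identical heavy-atom positions, hydrogens flipped in y.
symbols = ["C", "C", "H", "H"]
conf_a = [(0.0, 0.0, 0.0), (1.5, 0.0, 0.0), (-0.5, 0.9, 0.0), (2.0, 0.9, 0.0)]
conf_b = [(0.0, 0.0, 0.0), (1.5, 0.0, 0.0), (-0.5, -0.9, 0.0), (2.0, -0.9, 0.0)]

all_atoms = list(range(len(symbols)))
heavy_atoms = [i for i, s in enumerate(symbols) if s != "H"]

all_rms = rms(conf_a, conf_b, all_atoms)
heavy_rms = rms(conf_a, conf_b, heavy_atoms)

threshold = 0.5  # same value as pruneRmsThresh in the report
print(all_rms > threshold, heavy_rms < threshold)
```

Under an all-atom criterion both conformers would survive pruning, while a heavy-atom criterion would treat them as duplicates, which matches the pattern JW observed.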