Re: [Rdkit-discuss] Different 3D descriptors depending on mol reading method
Thanks, Diogo! Now the two files are essentially the same.

Diogo Martins wrote on Wednesday, 15/06/2022 at 20:53:

> Hi JSousa,
>
> Adding "removeHs=False" when reading from SDF should fix it.
>
> Best regards,
> Diogo
>
> On Wed, 15 Jun 2022 at 01:24, J Sousa wrote:
>
>> Hi Greg,
>>
>> Including the randomSeed argument in all instances didn't change the
>> situation:
>> AllChem.EmbedMolecule(mol,useRandomCoords=True,randomSeed=42)
>>
>> The descriptors are still different using the same randomSeed=42. And
>> they are quite different (not just the normal fluctuations from different
>> conformers).
>>
>> Best,
>> J
>>
>> Greg Landrum wrote on Wednesday, 15/06/2022 at 07:09:
>>
>>> Hi,
>>>
>>> I guess the differences you are seeing are arising because you have
>>> different conformers of the molecule.
>>> The conformer generation process in EmbedMolecule() uses a stochastic
>>> procedure and if you want to be sure that you get the same results from
>>> multiple runs you need to provide a random seed using the randomSeed
>>> argument.
>>>
>>> Please give that a try and see if it helps,
>>> -greg
>>>
>>> On Tue, Jun 14, 2022 at 9:15 PM J Sousa wrote:
>>>
>>>> I'm trying RDKit to calculate 3D descriptors, but I get significantly
>>>> different descriptors if I read molecules from a SMILES file (and
>>>> clean/optimize the 3D structure before calculating the descriptors) or
>>>> if I read the SDF file obtained from exactly the same SMILES file using
>>>> exactly the same code to optimize the structures.
>>>>
>>>> Scripts attached.
>>>>
>>>> Running smiltodesc_check.py produces descr_myfile.txt
>>>>
>>>> Running gen3D_check.py and then descr_from_sdf_check.py produces
>>>> myfile_descr.txt
>>>>
>>>> But the two files are significantly different.
>>>>
>>>> Why aren't they the same? Which is wrong?
>>>>
>>>> JSousa

___
Rdkit-discuss mailing list
Rdkit-discuss@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/rdkit-discuss
Re: [Rdkit-discuss] Different 3D descriptors depending on mol reading method
Hi JSousa,

Adding "removeHs=False" when reading from SDF should fix it.

Best regards,
Diogo
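
A minimal sketch of why this flag can matter, assuming (the attached scripts are not shown in the thread) that the SMILES workflow adds hydrogens before embedding. SDMolSupplier strips explicit hydrogens by default (removeHs=True), so a molecule read back from an SDF has fewer atoms than the one built from SMILES, and a 3D descriptor computed on it will generally differ. Ethanol, the temporary file name, and the radius of gyration are arbitrary illustrative choices, not taken from the original scripts.

```python
import os
import tempfile

from rdkit import Chem
from rdkit.Chem import AllChem, Descriptors3D

# Build a 3D structure with explicit hydrogens (assumed SMILES-side workflow).
mol = Chem.AddHs(Chem.MolFromSmiles("CCO"))  # ethanol, illustrative only
if AllChem.EmbedMolecule(mol, randomSeed=42) != 0:
    raise RuntimeError("embedding failed")

with tempfile.TemporaryDirectory() as tmpdir:
    sdf_path = os.path.join(tmpdir, "example.sdf")  # hypothetical file name

    # Write the hydrogen-containing structure to an SDF file.
    writer = Chem.SDWriter(sdf_path)
    writer.write(mol)
    writer.close()

    # Default read: explicit hydrogens are removed.
    stripped = Chem.SDMolSupplier(sdf_path)[0]
    # Read with removeHs=False: hydrogens and their coordinates are kept.
    kept = Chem.SDMolSupplier(sdf_path, removeHs=False)[0]

# Atom counts: original, default read, read with removeHs=False.
print(mol.GetNumAtoms(), stripped.GetNumAtoms(), kept.GetNumAtoms())

# One 3D descriptor computed on each of the three molecules.
rg_full = Descriptors3D.RadiusOfGyration(mol)
rg_stripped = Descriptors3D.RadiusOfGyration(stripped)
rg_kept = Descriptors3D.RadiusOfGyration(kept)
print(rg_full, rg_stripped, rg_kept)
```

The value for the molecule read with removeHs=False should agree with the original up to the coordinate precision of the SDF format, while the default read describes a heavy-atom-only structure.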
Re: [Rdkit-discuss] Different 3D descriptors depending on mol reading method
Hi Greg,

Including the randomSeed argument in all instances didn't change the situation:

AllChem.EmbedMolecule(mol,useRandomCoords=True,randomSeed=42)

The descriptors are still different using the same randomSeed=42. And they are quite different (not just the normal fluctuations from different conformers).

Best,
J
Re: [Rdkit-discuss] Different 3D descriptors depending on mol reading method
Hi,

I guess the differences you are seeing are arising because you have different conformers of the molecule.
The conformer generation process in EmbedMolecule() uses a stochastic procedure and if you want to be sure that you get the same results from multiple runs you need to provide a random seed using the randomSeed argument.

Please give that a try and see if it helps,
-greg
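
The randomSeed suggestion above can be checked with a short sketch like the following. It embeds the same molecule twice with the same seed and compares the resulting coordinates. The helper name embed_coords and the example SMILES are hypothetical, chosen only for illustration.

```python
from rdkit import Chem
from rdkit.Chem import AllChem


def embed_coords(smiles, seed):
    """Embed a molecule with a fixed seed and return its atom coordinates."""
    mol = Chem.AddHs(Chem.MolFromSmiles(smiles))
    # EmbedMolecule returns the conformer id (0 here) on success, -1 on failure.
    if AllChem.EmbedMolecule(mol, useRandomCoords=True, randomSeed=seed) != 0:
        raise RuntimeError("embedding failed")
    return mol.GetConformer().GetPositions()  # numpy array, shape (n_atoms, 3)


# Two independent runs with the same seed (1-butanol, illustrative only).
a = embed_coords("CCCCO", 42)
b = embed_coords("CCCCO", 42)

# Largest absolute coordinate difference between the two runs.
max_diff = abs(a - b).max()
print(max_diff)
```

If the largest difference is zero (or negligible), the seed is doing its job and any remaining descriptor mismatch between two workflows has to come from somewhere else, such as how the molecule is read in.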