Re: [Rdkit-discuss] Different 3D descriptors depending on mol reading method

2022-06-15 Thread Diogo Martins
Hi JSousa,

Adding "removeHs=False" when reading from SDF should fix it.
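
For context, RDKit's mol block and SDF readers strip explicit hydrogens by
default, so a molecule read back from an SDF usually has fewer atoms than the
hydrogen-added molecule that was embedded from SMILES, and 3D descriptors
computed on the two will differ. Below is a minimal sketch of that effect. It
assumes a round trip through a mol block rather than the attached scripts
(which are not shown here); ethanol, the radius of gyration, and the variable
names are illustrative choices only.

```python
from rdkit import Chem
from rdkit.Chem import AllChem, rdMolDescriptors

# Build a 3D structure with explicit hydrogens, as one would from SMILES.
mol = Chem.AddHs(Chem.MolFromSmiles("CCO"))
AllChem.EmbedMolecule(mol, randomSeed=42)

# Round-trip through the mol block format (the record format used in an SDF).
block = Chem.MolToMolBlock(mol)
mol_default = Chem.MolFromMolBlock(block)                  # removeHs=True by default
mol_keep_hs = Chem.MolFromMolBlock(block, removeHs=False)  # hydrogens preserved

# Atom counts: original, default read, read with removeHs=False.
print(mol.GetNumAtoms(), mol_default.GetNumAtoms(), mol_keep_hs.GetNumAtoms())

# One 3D descriptor computed on each version of the molecule.
rg_original = rdMolDescriptors.CalcRadiusOfGyration(mol)
rg_default = rdMolDescriptors.CalcRadiusOfGyration(mol_default)
rg_keep_hs = rdMolDescriptors.CalcRadiusOfGyration(mol_keep_hs)
print(rg_original, rg_default, rg_keep_hs)
```

When reading from a file, the same flag can be passed to the supplier, e.g.
Chem.SDMolSupplier("molecules.sdf", removeHs=False), where the file name is a
placeholder.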

Best regards,
Diogo

___
Rdkit-discuss mailing list
Rdkit-discuss@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/rdkit-discuss


Re: [Rdkit-discuss] Different 3D descriptors depending on mol reading method

2022-06-15 Thread J Sousa
Hi Greg,

Including the randomSeed argument in every call didn't change the
situation:
AllChem.EmbedMolecule(mol, useRandomCoords=True, randomSeed=42)

The descriptors are still different when using the same randomSeed=42, and
the differences are large (not just the normal fluctuations between
different conformers).

Best,
J






Re: [Rdkit-discuss] Different 3D descriptors depending on mol reading method

2022-06-15 Thread Greg Landrum
Hi,

I guess the differences you are seeing arise because you have
different conformers of the molecule.
The conformer generation process in EmbedMolecule() is stochastic, so
if you want to be sure that you get the same results from multiple
runs, you need to provide a random seed using the randomSeed
argument.
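
A minimal sketch of what a fixed seed buys you: two independent embeddings of
the same molecule with the same randomSeed should give the same coordinates.
The helper name embed_positions and the ethanol SMILES are made up for this
illustration and are not taken from the attached scripts.

```python
import numpy as np
from rdkit import Chem
from rdkit.Chem import AllChem


def embed_positions(smiles, seed):
    """Embed one conformer and return its atomic coordinates as an array."""
    mol = Chem.AddHs(Chem.MolFromSmiles(smiles))
    # EmbedMolecule returns the new conformer id, or -1 if embedding failed.
    conf_id = AllChem.EmbedMolecule(mol, randomSeed=seed)
    if conf_id < 0:
        raise RuntimeError("embedding failed")
    return mol.GetConformer(conf_id).GetPositions()


first = embed_positions("CCO", seed=42)
second = embed_positions("CCO", seed=42)

# Same input molecule and same seed: the coordinates are expected to match.
print(np.allclose(first, second))
```

Note that the seed only makes the embedding repeatable for identical input;
it does not make two molecules with different atom sets (for example with and
without explicit hydrogens) yield the same geometry.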

Please give that a try and see if it helps,
-greg




On Tue, Jun 14, 2022 at 9:15 PM J Sousa wrote:

> I'm trying RDKit to calculate 3D descriptors, but I get significantly
> different descriptors depending on whether I read the molecules from a
> SMILES file (and clean/optimize the 3D structure before calculating the
> descriptors) or read the SDF file obtained from exactly the same SMILES
> file, using exactly the same code to optimize the structures.
>
> Scripts attached.
>
> Running smiltodesc_check.py produces descr_myfile.txt
>
> Running gen3D_check.py and then descr_from_sdf_check.py produces
> myfile_descr.txt
>
> But the two files are significantly different.
>
> Why aren't they the same? Which is wrong?
>
> JSousa