Just to close the loop on this topic in case someone runs into the same issue:

the problem was caused by RDBASE being set to
%CONDA_PREFIX%\Lib\site-packages\rdkit.

As a consequence, the RDConfig.py script would not use the correctly
configured paths set in RDPaths.py, but rather set the RDDataDir path to
%RDBASE%\Data, which is incorrect for the directory layout of a Windows
conda distribution.
Therefore, FragmentDescriptors.csv was not found and all fragment
descriptors were missing.
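
For anyone who wants to see the effect in isolation, here is a minimal
sketch of the path selection described above. It is a simplified model of
the behaviour, not the actual RDConfig.py source: the function name
resolve_data_dir, the C:\envs\rdkit prefix and the "configured" data
directory are all made up for illustration.

```python
import ntpath  # Windows path rules, usable on any platform


def resolve_data_dir(environ, configured_data_dir):
    """Simplified model: if RDBASE is set, it overrides the configured path."""
    if "RDBASE" in environ:
        # Data directory is derived from RDBASE, whatever was configured.
        return ntpath.join(environ["RDBASE"], "Data")
    return configured_data_dir


# Hypothetical locations, for illustration only.
conda_prefix = r"C:\envs\rdkit"
configured = ntpath.join(conda_prefix, "Library", "share", "RDKit", "Data")
rdbase = ntpath.join(conda_prefix, "Lib", "site-packages", "rdkit")

with_rdbase = resolve_data_dir({"RDBASE": rdbase}, configured)
without_rdbase = resolve_data_dir({}, configured)

print(with_rdbase)     # derived from RDBASE, ignores the configured path
print(without_rdbase)  # falls back to the configured path
```

With RDBASE set, the data directory ends up under site-packages\rdkit\Data
instead of the configured location, which matches what I observed.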

Unsetting the RDBASE environment variable fixed the problem.
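
Since a missing CSV file produces no warning, a quick check along these
lines may save some time. This is only a sketch: check_fragment_csv is a
hypothetical helper of mine, not part of RDKit; it only uses the
RDConfig.RDDataDir attribute and the FragmentDescriptors.csv file name
mentioned in this thread.

```python
import os


def check_fragment_csv(data_dir, environ=None):
    """Return a list of problems found; an empty list means nothing suspicious."""
    if environ is None:
        environ = os.environ
    problems = []
    if "RDBASE" in environ:
        problems.append(
            "RDBASE is set to %r and may override the configured paths"
            % environ["RDBASE"]
        )
    csv_path = os.path.join(data_dir, "FragmentDescriptors.csv")
    if not os.path.isfile(csv_path):
        problems.append("%s does not exist" % csv_path)
    return problems


if __name__ == "__main__":
    try:
        from rdkit import RDConfig
    except ImportError:
        print("rdkit is not installed in this environment")
    else:
        for problem in check_fragment_csv(RDConfig.RDDataDir):
            print("WARNING:", problem)
```

If this prints nothing on a machine where the fr_ descriptors are missing,
the cause is probably something other than what I describe here.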

Cheers,
Paolo

On Wed, Sep 8, 2021 at 5:40 PM Paolo Tosco <paolo.tosco.m...@gmail.com>
wrote:

> Hi Alexis,
>
> I did some more investigation. The fragment descriptors are parsed from a
> CSV file located in RDConfig.RDDataDir:
> On my machine I see this:
>
> >>> import os
> >>> from rdkit.Chem import Fragments
> >>> len([f for f in dir(Fragments) if f.startswith("fr_")])
> 85
> >>> from rdkit import RDConfig
> >>> fragment_descriptors_csv = os.path.join(RDConfig.RDDataDir, 'FragmentDescriptors.csv')
> >>> os.path.exists(fragment_descriptors_csv)
> True
> >>> with open(fragment_descriptors_csv) as hnd:
> ...   print(len([line for line in hnd if (not line.startswith("#")) and len(line.split('\t')) >= 3]))
> ...
> 86
>
> The 86/85 discrepancy is caused by the fact that the fr_sulfone entry in
> the CSV file is duplicated.
>
> If the CSV file existed but were empty, no fragment descriptors would be
> available.
> Similarly, if the file did not exist (maybe because RDConfig.RDDataDir is
> misconfigured), no fragment descriptors would be available and no warning
> would be printed, because _LoadPatterns() (rdkit/Chem/Fragments.py)
> silently ignores the resulting IOError exception.
>
> I hope the above helps with troubleshooting.
>
> Cheers,
> p.
>
> On Wed, Sep 8, 2021 at 2:50 PM Alexis Parenty <
> alexis.parenty.h...@gmail.com> wrote:
>
>> Hi Paolo,
>> Thanks a lot for your response. I am going to try rdkit 2021.03.5 right
>> now...
>>
>> I have checked where I installed the previous build from: I did use
>> conda-forge on both platforms:
>>
>> [image: image.png]
>>
>> Weird...
>>
>> best,
>>
>> Alexis
>>
>>
>>
>> On Wed, 8 Sept 2021 at 14:04, Paolo Tosco <paolo.tosco.m...@gmail.com>
>> wrote:
>>
>>> Hi Alexis,
>>>
>>> I have just installed rdkit 2021.03.5 from the conda-forge channel on a
>>> Windows machine and 208 descriptors are indeed available.
>>>
>>> >>> import sys
>>> >>> sys.platform
>>> 'win32'
>>> >>> import rdkit
>>> >>> rdkit.__version__
>>> '2021.03.5'
>>> >>> from rdkit.Chem import Descriptors
>>> >>> len(Descriptors._descList)
>>> 208
>>>
>>> The number of available descriptors depends on how RDKit is built.
>>> My guess is that the RDKit installation that you have on Windows does
>>> not come from the conda-forge channel and was built differently from
>>> the one you have installed on Linux.
>>>
>>> Cheers,
>>> p.
>>>
>>> On Wed, Sep 8, 2021 at 11:30 AM Alexis Parenty <
>>> alexis.parenty.h...@gmail.com> wrote:
>>>
>>>> Hi everyone,
>>>>
>>>> I have noticed some inconsistencies in the list of rdkit chemical
>>>> descriptors available on my Windows machine and on my Linux machine.
>>>> I am running the same rdkit version (2021.03.1) on both platforms, with
>>>> the same Python version (3.9).
>>>>
>>>> running the following from windows:
>>>>
>>>>
>>>> from rdkit.Chem import Descriptors
>>>>
>>>> print(len(Descriptors._descList))
>>>> for each_descriptor in Descriptors._descList:
>>>>     print(each_descriptor[0])
>>>>
>>>>
>>>> ==> 123
>>>> and no fragment descriptors ("fr_xxx")
>>>>
>>>> running the same from linux:
>>>> ==> 208
>>>> (same descriptors as on Windows + all the descriptors from the
>>>> Chem.Fragments module)
>>>>
>>>>
>>>> Could the fragment descriptors available on Linux be added to the list
>>>> of available descriptors on Windows so that the ML models stay
>>>> cross-platform compatible?
>>>>
>>>> Many thanks and regards,
>>>>
>>>> Alexis
>>>>
>>>>
>>>>
>>>> _______________________________________________
>>>> Rdkit-discuss mailing list
>>>> Rdkit-discuss@lists.sourceforge.net
>>>> https://lists.sourceforge.net/lists/listinfo/rdkit-discuss
>>>>
>>>