Hi Alexis,

I did some more investigation. The fragment descriptors are parsed from a
CSV file, FragmentDescriptors.csv, located in RDConfig.RDDataDir.
This is what I see on my machine:

>>> import os
>>> from rdkit.Chem import Fragments
>>> len([f for f in dir(Fragments) if f.startswith("fr_")])
85
>>> from rdkit import RDConfig
>>> fragment_descriptors_csv = os.path.join(RDConfig.RDDataDir, 'FragmentDescriptors.csv')
>>> os.path.exists(fragment_descriptors_csv)
True
>>> with open(fragment_descriptors_csv) as hnd:
...   print(len([line for line in hnd
...              if not line.startswith("#") and len(line.split('\t')) >= 3]))
...
86

The 86/85 discrepancy arises because the fr_sulfone entry is duplicated in
the CSV file, so two rows map to the same function name.
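
To see which entry is repeated, one can count the names in the first
column. The following is a minimal sketch using only the standard library;
the helper name find_duplicate_names is made up for illustration, and it
applies the same row filter as the session above (non-comment lines with at
least three tab-separated fields).

```python
from collections import Counter


def find_duplicate_names(path):
    """Return the sorted names (first column) that occur more than once.

    Hypothetical helper, not part of RDKit. Rows are filtered as in the
    session above: lines starting with '#' are skipped and at least three
    tab-separated fields are required.
    """
    with open(path) as hnd:
        names = [
            line.split("\t")[0]
            for line in hnd
            if not line.startswith("#") and len(line.split("\t")) >= 3
        ]
    counts = Counter(names)
    return sorted(name for name, n in counts.items() if n > 1)


# Usage with an RDKit installation (assumes the data file is where
# RDConfig.RDDataDir points):
#   import os
#   from rdkit import RDConfig
#   path = os.path.join(RDConfig.RDDataDir, "FragmentDescriptors.csv")
#   print(find_duplicate_names(path))
```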

If the CSV file existed but were empty, no fragment descriptors would be
available.
Similarly, if the file did not exist (maybe because RDConfig.RDDataDir is
misconfigured), no fragment descriptors would be available and no warning
would be printed, because _LoadPatterns() (in rdkit/Chem/Fragments.py)
silently ignores the IOError exception.
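
Since the failure is silent, an explicit check can tell the cases apart.
A minimal sketch, standard library only; describe_fragment_file is a
hypothetical helper name, and the row filter is the one used in the session
above rather than a copy of RDKit's own parsing code.

```python
import os


def describe_fragment_file(path):
    """Classify a fragment-descriptor file as 'missing', 'empty' or 'ok'.

    Hypothetical diagnostic helper, not part of RDKit. Returns a
    (status, n_rows) tuple, where n_rows counts non-comment lines with
    at least three tab-separated fields.
    """
    if not os.path.isfile(path):
        return ("missing", 0)
    with open(path) as hnd:
        n_rows = sum(
            1
            for line in hnd
            if not line.startswith("#") and len(line.split("\t")) >= 3
        )
    return ("ok", n_rows) if n_rows else ("empty", 0)


# Usage with an RDKit installation:
#   from rdkit import RDConfig
#   path = os.path.join(RDConfig.RDDataDir, "FragmentDescriptors.csv")
#   print(describe_fragment_file(path))
```

A "missing" or "empty" result on the affected machine would point at the
data directory rather than at the RDKit build itself.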

I hope the above helps with troubleshooting.

Cheers,
p.

On Wed, Sep 8, 2021 at 2:50 PM Alexis Parenty <alexis.parenty.h...@gmail.com>
wrote:

> Hi Paolo,
> Thanks a lot for your response. I am going to try rdkit 2021.03.5 right
> now...
>
> I have checked where I installed the previous build from: I did use
> conda-forge on both platforms:
>
> [image: image.png]
>
> Weird...
>
> best,
>
> Alexis
>
>
>
> On Wed, 8 Sept 2021 at 14:04, Paolo Tosco <paolo.tosco.m...@gmail.com>
> wrote:
>
>> Hi Alexis,
>>
>> I have just installed rdkit 2021.03.5 from the conda-forge channel on a
>> Windows machine and 208 descriptors are indeed available.
>>
>> >>> import sys
>> >>> sys.platform
>> 'win32'
>> >>> import rdkit
>> >>> rdkit.__version__
>> '2021.03.5'
>> >>> from rdkit.Chem import Descriptors
>> >>> len(Descriptors._descList)
>> 208
>>
>> The number of available descriptors depends on how RDKit is built.
>> My guess is that the RDKit installation that you have on Windows does not
>> come from the conda-forge channel and was built differently from the one
>> you have installed on Linux.
>>
>> Cheers,
>> p.
>>
>> On Wed, Sep 8, 2021 at 11:30 AM Alexis Parenty <
>> alexis.parenty.h...@gmail.com> wrote:
>>
>>> Hi everyone,
>>>
>>> I have noticed some inconsistencies in the list of RDKit chemical
>>> descriptors available on my Windows machine and my Linux machine.
>>> I am running the same RDKit version (2021.03.1) on both platforms, with
>>> the same Python version (3.9).
>>>
>>> running the following from windows:
>>>
>>>
>>> from rdkit.Chem import Descriptors
>>> print(len(Descriptors._descList))
>>> for each_descriptor in Descriptors._descList:
>>>     print(each_descriptor[0])
>>>
>>>
>>> ==> 123
>>> and no fragment descriptors ("fr_xxx")
>>>
>>> running the same from linux:
>>> ==> 208
>>> (same descriptors as on Windows + all the descriptors from the
>>> Chem.Fragments module)
>>>
>>>
>>> Could the fragment descriptors available on Linux be added to the list
>>> of available fragments on Windows so that the ML models stay cross platform
>>> compatible?
>>>
>>> Many thanks and regards,
>>>
>>> Alexis
>>>
>>>
>>>
>>> _______________________________________________
>>> Rdkit-discuss mailing list
>>> Rdkit-discuss@lists.sourceforge.net
>>> https://lists.sourceforge.net/lists/listinfo/rdkit-discuss
>>>
>>