>
> Just as an FYI: the best easy way, by far, to keep track of whether or not
> you've seen a particular molecule is to use the SMILES.
>

One caveat with SMILES: watch out for partially specified chirality and
E/Z double-bond stereochemistry. "CC=CC(C)(O)CF" is not the same SMILES
as "C/C=C/[C@](C)(O)CF", even though they might refer to the "same"
molecule for your purposes. RDKit canonical SMILES will faithfully render
the stereochemistry information if it is available, but depending on how
you're reading and/or processing things, that info may or may not be
properly annotated on the Mol for the SMILES writer to use. (Something as
simple as generating 3D coordinates can potentially add that info.
Conversely, just because your SDF file has 3D coordinates doesn't guarantee
that RDKit will completely annotate stereochemical info on the read-in Mol.)
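
To make the caveat concrete, here is a minimal sketch of using canonical
SMILES as the "have I seen this?" key, with and without stereochemistry.
The helper name `smiles_key` and its `keep_stereo` flag are my own
illustration, not an RDKit API; the only RDKit calls are
`Chem.MolFromSmiles()` and `Chem.MolToSmiles()` with its `isomericSmiles`
argument.

```python
from rdkit import Chem


def smiles_key(mol, keep_stereo=True):
    """Canonical SMILES used as a lookup key (hypothetical helper)."""
    # isomericSmiles=False drops chirality and E/Z marks from the output
    return Chem.MolToSmiles(mol, isomericSmiles=keep_stereo)


# The two SMILES from the paragraph above: same graph, different stereo info
plain = Chem.MolFromSmiles("CC=CC(C)(O)CF")
annotated = Chem.MolFromSmiles("C/C=C/[C@](C)(O)CF")

# Stereo-aware keys treat the two as different molecules
seen_strict = {smiles_key(plain), smiles_key(annotated)}

# Stereo-free keys collapse them into a single entry
seen_flat = {
    smiles_key(plain, keep_stereo=False),
    smiles_key(annotated, keep_stereo=False),
}

print(len(seen_strict), len(seen_flat))
```

Which of the two keys is "right" depends on whether stereoisomers count as
distinct molecules in your bookkeeping.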

Take a look at `Chem.AssignStereochemistryFrom3D()`,
`Chem.RemoveStereochemistry()` and
`Chem.EnumerateStereoisomers.EnumerateStereoisomers()` if this is
potentially going to be an issue for you.
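
A rough sketch of how those three functions might be used together,
assuming the example molecule from above. The variable names, the
`AllChem.EmbedMolecule()` call and the `randomSeed=42` value are just my
choices to get some 3D coordinates for the demonstration; with your own
data the coordinates would come from your input files.

```python
from rdkit import Chem
from rdkit.Chem import AllChem
from rdkit.Chem.EnumerateStereoisomers import EnumerateStereoisomers

# 1. Enumerate the stereoisomers hiding behind a stereo-free SMILES
unspecified = Chem.MolFromSmiles("CC=CC(C)(O)CF")
isomer_smiles = sorted(
    Chem.MolToSmiles(iso) for iso in EnumerateStereoisomers(unspecified)
)

# 2. Strip the stereo annotation, then recover it from 3D coordinates
reference = Chem.MolFromSmiles("C/C=C/[C@](C)(O)CF")
reference_smiles = Chem.MolToSmiles(reference)

mol3d = Chem.AddHs(reference)
# Assumption: embedding is used here only to obtain a 3D conformer;
# it returns 0 on success and -1 on failure
status = AllChem.EmbedMolecule(mol3d, randomSeed=42)

# Remove chirality tags and double-bond stereo; the conformer is kept
Chem.RemoveStereochemistry(mol3d)
stripped_smiles = Chem.MolToSmiles(Chem.RemoveHs(mol3d))

# Re-perceive stereochemistry from the 3D coordinates
Chem.AssignStereochemistryFrom3D(mol3d)
recovered_smiles = Chem.MolToSmiles(Chem.RemoveHs(mol3d))

print(isomer_smiles)
print(stripped_smiles)
print(recovered_smiles == reference_smiles)
```

If the stripped and recovered SMILES differ for your own molecules, that
is a sign the stereo annotation depends on how the Mol was read in.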

On Fri, Jan 13, 2023 at 1:41 AM Greg Landrum <greg.land...@gmail.com> wrote:

> Hi Eric,
>
> That would be due to the fix for this bug:
> https://github.com/rdkit/rdkit/issues/5036
> If you were generating the fingerprints on "normal" (i.e.
> hydrogen-suppressed) graphs, you wouldn't notice this one, but the fact
> that you add the Hs before generating the fingerprint causes you to notice
> it.
>
> Just as an FYI: the best easy way, by far, to keep track of whether or not
> you've seen a particular molecule is to use the SMILES.
>
> -greg
>
>
> On Fri, Jan 13, 2023 at 6:27 AM Eric Jonas <jo...@ericjonas.com> wrote:
>
>> Hello! I use the CRC of Morgan fingerprints as a quick-and-dirty way to
>> keep track of different molecules, but now I realize it might have been too
>> quick and dirty! In particular, there appears to have been a change in the
>> Morgan code sometime between 2021.09.2 and 2022.03.5. The following code
>> produces different output under these versions:
>>
>> import zlib
>>
>> from rdkit import Chem
>> from rdkit.Chem import rdMolDescriptors
>>
>> def get_morgan4_crc32(m):
>>     # CRC32 of the serialized hashed Morgan fingerprint (radius 4)
>>     mf = rdMolDescriptors.GetHashedMorganFingerprint(m, 4)
>>     return zlib.crc32(mf.ToBinary())
>>
>> # Explicit hydrogens are added before the fingerprint is generated
>> mol = Chem.AddHs(Chem.MolFromSmiles('Oc1cc(O)c(O)c(O)c1'))
>> print(get_morgan4_crc32(mol))
>>
>> 2021.09.2 : 1567135676
>> 2022.03.5 : 204854560
>>
>> I looked through the release notes but didn't see any breaking changes
>> (I might have missed them!), and I looked at "blame" for the relevant
>> source but didn't see any seemingly substantive changes within the
>> relevant timeframe.
>>
>> So am I doing something crazy here, or did something change deliberately,
>> or is it possible this is a bug?
>>
>> ...E
>>
>> _______________________________________________
>> Rdkit-discuss mailing list
>> Rdkit-discuss@lists.sourceforge.net
>> https://lists.sourceforge.net/lists/listinfo/rdkit-discuss
>>
