Hi all,

I'm getting ready to make a new chemfp release, which means checking that it 
works with the latest versions of RDKit.

I noticed there was a change in RDKit 2024.09.4 which affects 
rdMHFPFingerprint. I use that module in chemfp to generate SEC fingerprints 
(SECFP = SMILES extended connectivity fingerprint), a fingerprint type 
described in https://link.springer.com/article/10.1186/s13321-018-0321-8 .

In my test set of 19 structures, 2 of them have different SECFPs.

One of them is PubChem id 9425030 
(https://pubchem.ncbi.nlm.nih.gov/compound/9425030), with SMILES 
"CC(C)NC(=O)N1CCN([C@H](C1)C(=O)N[C@H]2CCCNC2=O)C(=O)[C@@H]3CSC[NH2+]3".

Using 2024.09.3 or earlier I get a set of 74 on-bits.

Using 2024.09.4 or later I get a slightly different set of 73 on-bits: 3 bits 
are missing and 2 new ones are set.

Here is a short program which generates those bits.

from rdkit import Chem
from rdkit.Chem import rdMHFPFingerprint

smiles = "CC(C)NC(=O)N1CCN([C@H](C1)C(=O)N[C@H]2CCCNC2=O)C(=O)[C@@H]3CSC[NH2+]3"
mol = Chem.MolFromSmiles(smiles)
encoder = rdMHFPFingerprint.MHFPEncoder(0, 0)
fp = encoder.EncodeSECFPMol(mol)
print(list(fp.GetOnBits()))

I've put a more complete reproducer at 
https://paste.sr.ht/~dalke/d9593867e2cb68b95a6a166dd5c5b0fa30362b74 .
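
For anyone who wants to see which bits differ on their own installation, here 
is a minimal sketch of the comparison. It assumes you have run the short 
program above under each RDKit version and saved the two printed lists. The 
helper name diff_on_bits and the bit values are placeholders for illustration, 
not the actual SECFP bits.

```python
def diff_on_bits(old_bits, new_bits):
    """Return (missing, added) for two collections of on-bit indices.

    missing: bits set in old_bits but not in new_bits
    added:   bits set in new_bits but not in old_bits
    """
    old_set, new_set = set(old_bits), set(new_bits)
    missing = sorted(old_set - new_set)
    added = sorted(new_set - old_set)
    return missing, added


# Toy values for illustration only; replace them with the lists printed
# by the short program under the older and the newer RDKit version.
old_bits = [3, 17, 42, 99, 512]
new_bits = [3, 17, 100, 512]

missing, added = diff_on_bits(old_bits, new_bits)
print("missing:", missing)
print("added:", added)
```

With the real output, the "missing" list should have 3 entries and the "added" 
list 2 entries if you see the same change I do.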

I don't see anything in the release notes for 2024.09.4 which would cause this 
change. 

Can anyone here provide insight?

Best regards,

                                Andrew
                                [email protected]





