Hello,

I ran into a very strange bug. With useChirality=True, GetMorganFingerprint
gave me different fingerprints depending on whether I ran under
multiprocessing or serially, on rdkit 2017.09.1 and 2018.03.2. It seems to
have been fixed in the latest version. Woo! I was just wondering if anyone
has any insight into what was causing this, because I was stumped for the
longest time. Example (using GetMorganFingerprintAsBitVect):

from multiprocessing import Pool
from rdkit import Chem
from rdkit.Chem import AllChem

def compute_ecfp_bitvect(mol, ecfp_power=11):
    # Print the isomeric SMILES first, then the on-bits, so that the
    # pooled and serial runs can be compared line by line.
    print(Chem.MolToSmiles(mol, isomericSmiles=True))
    # Compute the fingerprint once and reuse it for printing and returning.
    fp = AllChem.GetMorganFingerprintAsBitVect(
        mol, radius=2, nBits=2 ** ecfp_power, useChirality=True)
    print(list(fp.GetOnBits()))
    return fp

smiles = ["N[C@@H](C)C(=O)O", "N[C@H](C)C(=O)O"]

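# The guard keeps worker processes from re-running this block on platforms
# that start workers by importing the script (e.g. Windows).
if __name__ == "__main__":
    mol1 = Chem.MolFromSmiles(smiles[0])
    mol2 = Chem.MolFromSmiles(smiles[1])
    print("with pool")
    # The Mol objects are pickled when they are sent to the worker.
    with Pool(1) as pool:
        jobs = pool.imap(compute_ecfp_bitvect, [mol1, mol2])
        list(jobs)
    print("without pool")
    for m in [mol1, mol2]:
        compute_ecfp_bitvect(m)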

===== Output =====
with pool
C[C@H](N)C(=O)O
[1, 283, 389, 537, 650, 786, 807, 1057, 1119, 1171, 1844, 1917]
C[C@@H](N)C(=O)O
[1, 283, 389, 537, 650, 786, 807, 1057, 1119, 1171, 1844, 1917]
without pool
C[C@H](N)C(=O)O
[1, 283, 389, 650, 786, 807, 1057, 1112, 1171, 1187, 1844, 1917]
C[C@@H](N)C(=O)O
[1, 46, 283, 389, 650, 786, 807, 1057, 1113, 1171, 1844, 1917]
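
To make the difference easier to see, here is a small sketch that compares the on-bit lists printed above using plain Python sets. The numbers are copied from the output; the variable and function names (pool_bits, serial_bits, diff_bits) are made up for illustration and are not part of RDKit.

```python
# On-bit lists copied verbatim from the output above.
# All names here are illustrative only.
L_SMI = "C[C@H](N)C(=O)O"
D_SMI = "C[C@@H](N)C(=O)O"

pool_bits = {
    L_SMI: [1, 283, 389, 537, 650, 786, 807, 1057, 1119, 1171, 1844, 1917],
    D_SMI: [1, 283, 389, 537, 650, 786, 807, 1057, 1119, 1171, 1844, 1917],
}
serial_bits = {
    L_SMI: [1, 283, 389, 650, 786, 807, 1057, 1112, 1171, 1187, 1844, 1917],
    D_SMI: [1, 46, 283, 389, 650, 786, 807, 1057, 1113, 1171, 1844, 1917],
}


def diff_bits(a, b):
    """Return (bits only in a, bits only in b), each as a sorted list."""
    set_a, set_b = set(a), set(b)
    return sorted(set_a - set_b), sorted(set_b - set_a)


# Pool run: the two enantiomers have exactly the same on-bits.
print(diff_bits(pool_bits[L_SMI], pool_bits[D_SMI]))      # ([], [])

# Serial run: each enantiomer has two bits the other lacks.
print(diff_bits(serial_bits[L_SMI], serial_bits[D_SMI]))  # ([1112, 1187], [46, 1113])

# Pool vs serial for each molecule.
for smi in (L_SMI, D_SMI):
    print(smi, diff_bits(pool_bits[smi], serial_bits[smi]))
```

In the pool run the two enantiomers are indistinguishable, while in the serial run they differ in two bits each. That would be consistent with the chirality being ignored inside the worker process, though that is only a guess from these numbers.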

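For anyone stuck on one of the affected versions, a possible workaround is sketched below: send the SMILES strings to the workers and build the Mol inside each worker, so that no Mol object has to be pickled on the way in. This sidesteps whatever pickling was doing to the molecule without explaining it. The function name ecfp_on_bits_from_smiles is hypothetical, and it returns a plain list of on-bits rather than the bit vector itself.

```python
from multiprocessing import Pool

from rdkit import Chem
from rdkit.Chem import AllChem


def ecfp_on_bits_from_smiles(smi, ecfp_power=11):
    """Parse the SMILES in the current process and return the on-bits.

    Hypothetical helper: the Mol is created where the fingerprint is
    computed, so it never crosses a process boundary.
    """
    mol = Chem.MolFromSmiles(smi)
    fp = AllChem.GetMorganFingerprintAsBitVect(
        mol, radius=2, nBits=2 ** ecfp_power, useChirality=True)
    return list(fp.GetOnBits())


if __name__ == "__main__":
    smiles = ["N[C@@H](C)C(=O)O", "N[C@H](C)C(=O)O"]

    # Only strings are sent to the worker; plain lists come back.
    with Pool(1) as pool:
        pooled = pool.map(ecfp_on_bits_from_smiles, smiles)

    serial = [ecfp_on_bits_from_smiles(s) for s in smiles]

    print("pool and serial agree:", pooled == serial)
    print("enantiomers differ:", pooled[0] != pooled[1])
```

I have not run this on 2017.09.1 or 2018.03.2, so treat it as a sketch rather than a confirmed fix.
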
Thanks and hope everyone is staying healthy!
Hao
_______________________________________________
Rdkit-discuss mailing list
Rdkit-discuss@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/rdkit-discuss