Hi all,

I am currently trying to parallelize part of a script using RDKit and concurrent.futures. The function that is executed in parallel returns processed molecules as RDKit Mol objects.
Without parallelization everything is fine, and the Mol objects keep all the
properties that they had before the processing. When using
concurrent.futures, the returned molecules lose all their properties and seem
to be created from scratch, possibly with unknown side effects.
I am wondering whether anyone has experienced the same issue and knows how to
work around it. I have attached an IPython notebook with a small script
demonstrating the issue.
Best,
Michael
Example Code:
from concurrent import futures
from rdkit import Chem
from rdkit.Chem import AllChem
from rdkit.Chem.Draw import IPythonConsole


def process(mol):
    # Runs in the worker process: report a missing "Name", then tag the molecule.
    if "Name" not in mol.GetPropNames():
        print("Processing: Name missing")
    mol.SetProp("Processed", "True")
    return mol


mol = Chem.MolFromSmiles("N[C@@H](C)C(=O)O")
mol.SetProp("Name", "Alanine")

with futures.ProcessPoolExecutor(max_workers=1) as pool:
    future = pool.submit(process, mol)
    molOut = future.result()

# Check which properties survived the round trip through the worker.
if "Name" not in molOut.GetPropNames():
    print("Result: Name missing")
if "Processed" not in molOut.GetPropNames():
    print("Result: Processed missing")
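
A possible workaround, sketched under assumptions: ProcessPoolExecutor moves arguments and return values between processes with pickle, so one explanation for the behaviour above (not confirmed in this post) is that the properties are not part of what gets pickled for a Mol. The sketch below carries the properties across explicitly as a plain dict. FakeMol, to_payload, from_payload and process_payload are hypothetical names used only for illustration; FakeMol is a stand-in, not an RDKit class, and the pickle round trips only imitate the transfer to and from a worker.

```python
import pickle


class FakeMol(object):
    """Hypothetical stand-in for a molecule with properties; NOT an RDKit class."""

    def __init__(self, smiles):
        self.smiles = smiles
        self._props = {}

    def SetProp(self, key, value):
        self._props[key] = value

    def GetProp(self, key):
        return self._props[key]

    def GetPropNames(self):
        return list(self._props)


def to_payload(mol):
    """Flatten a molecule into built-in types: (structure string, property dict)."""
    props = {name: mol.GetProp(name) for name in mol.GetPropNames()}
    return mol.smiles, props


def from_payload(payload):
    """Rebuild the molecule and re-attach the properties carried alongside it."""
    smiles, props = payload
    mol = FakeMol(smiles)
    for name, value in props.items():
        mol.SetProp(name, value)
    return mol


def process_payload(payload):
    """Worker-side function: unpack, process, and repack before returning."""
    mol = from_payload(payload)
    mol.SetProp("Processed", "True")
    return to_payload(mol)


mol = FakeMol("N[C@@H](C)C(=O)O")
mol.SetProp("Name", "Alanine")

# Imitate the trip to the worker and back with explicit pickle round trips.
sent = pickle.loads(pickle.dumps(to_payload(mol)))
returned = pickle.loads(pickle.dumps(process_payload(sent)))
mol_out = from_payload(returned)

print(sorted(mol_out.GetPropNames()))
```

With real molecules, the same pattern would mean passing a serialized structure plus a property dict to pool.submit instead of the Mol itself. RDKit also provides rdkit.Chem.PropertyMol.PropertyMol, a Mol wrapper meant to keep properties when pickled, which may be worth trying here; whether it resolves this particular case is untested in this sketch.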
Attachment: RDKIT_ParallelProblem (1).ipynb
_______________________________________________ Rdkit-discuss mailing list [email protected] https://lists.sourceforge.net/lists/listinfo/rdkit-discuss

