Hi Paul,
On Tue, Oct 11, 2011 at 7:55 AM, <[email protected]> wrote:
>
> Dear RDkitters,
>
> I'm trying to use Python's multiprocessing module in conjunction with
> RDKit.
>
> It should be applied in 2 cases:
> (1) fingerprint calculation
> &
> (2) Picking Diverse Molecules
>
>
>
> (1)
> "
> from multiprocessing import Pool
> p4 = Pool(processes=4)
> def fps_calc(m):
>     fps = [GetMorganFingerprint(x,3) for x in m]
>     return fps
> fps = p4.map(fps_calc,ms)
> "
> ==>
> "TypeError: 'Mol' object is not iterable"
I think what you want to do here is:
#-------------------
def fps_calc(m):
    fp = GetMorganFingerprint(m,3)
    return fp
fps = p4.map(fps_calc,ms)
#-------------------
The map method takes a function and a sequence of objects and applies
that function to each object in the sequence, so fps_calc receives a
single molecule at a time rather than the whole list.
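
To make the difference concrete, here is a small stand-in sketch that
needs no RDKit: plain integers take the place of molecules, and
ThreadPool (which has the same map() interface as Pool) is used only so
the snippet runs anywhere. The names per_item, per_list and run are
made up for this illustration.

```python
from multiprocessing.pool import ThreadPool  # same map() interface as multiprocessing.Pool


def per_item(x):
    # Receives ONE element of the sequence, like a single Mol.
    return x * x


def per_list(xs):
    # Written as if it received the whole sequence; map() never passes that.
    return [x * x for x in xs]


def run(func, items, workers=4):
    # Apply func to each element of items and collect the results in order.
    with ThreadPool(workers) as pool:
        return pool.map(func, items)


items = [1, 2, 3, 4]
squares = run(per_item, items)

try:
    run(per_list, items)
    error_message = None
except TypeError as exc:
    # Same kind of failure as "'Mol' object is not iterable".
    error_message = str(exc)
```

Here per_item plays the role of the corrected fps_calc, and per_list
fails in the same way as your original version.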
> (2)
> "
> from multiprocessing import Pool
> p4 = Pool(processes=4)
> def distij(i,j,fps=fps):
>     return 1-DataStructs.DiceSimilarity(fps[i],fps[j])
>
> def DivSelection(distij,nfps,quantity_train):
>     picker = MaxMinPicker()
>     picked_indices = picker.LazyPick(distij,nfps,quantity_train)
>     return picked_indices
> "
> pickIndices = p4.map(DivSelection, ???)
The MaxMinPicker does not provide any way to parallelize the picking
with multiprocessing.
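
One way to see why the picking does not split across workers easily:
in a max-min style selection each new pick depends on everything picked
before it. The following is a plain-Python sketch of that general idea,
not the RDKit implementation; maxmin_pick, its "first" argument and the
toy points are assumptions made for this illustration. The argument
order (distance function, number of items, number to pick) just mirrors
your LazyPick call.

```python
def maxmin_pick(dist, n_items, n_pick, first=0):
    """Greedy max-min selection over indices 0..n_items-1 (illustrative only)."""
    if not 0 < n_pick <= n_items:
        raise ValueError("n_pick must be between 1 and n_items")
    picked = [first]
    # min_dist[i] = distance from item i to its nearest already-picked item
    min_dist = [dist(i, first) for i in range(n_items)]
    while len(picked) < n_pick:
        # Next pick: the unpicked item farthest from everything picked so far.
        candidates = [i for i in range(n_items) if i not in picked]
        nxt = max(candidates, key=lambda i: min_dist[i])
        picked.append(nxt)
        # This update needs the pick just made, so the loop is sequential.
        for i in range(n_items):
            min_dist[i] = min(min_dist[i], dist(i, nxt))
    return picked


# Toy data: points on a line stand in for fingerprints.
points = [0.0, 1.0, 2.0, 10.0, 11.0]


def toy_dist(i, j):
    return abs(points[i] - points[j])


picks = maxmin_pick(toy_dist, len(points), 3)
```

The fingerprint calculation in (1) has no such dependency between
molecules, which is why it maps cleanly onto a Pool.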
-greg
_______________________________________________
Rdkit-discuss mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/rdkit-discuss