You beat me by seconds ... I hate it when you do that ;-)

-----Original Message-----
From: Greg Landrum [mailto:[email protected]] 
Sent: Tuesday, October 11, 2011 4:23 PM
To: [email protected]
Cc: RDKit Discuss
Subject: Re: [Rdkit-discuss] multiprocessing & rdkit

Hi Paul,

On Tue, Oct 11, 2011 at 7:55 AM,  <[email protected]> wrote:
>
> Dear RDkitters,
>
> I'm trying to use Python's multiprocessing module in conjunction with
> RDKit.
>
> It should be applied in 2 cases:
> (1) fingerprint calculation
> &
> (2) Picking Diverse Molecules
>
>
>
> (1)
> "
> from multiprocessing import Pool
> p4 = Pool(processes=4)
> def fps_calc(m):
>        fps = [GetMorganFingerprint(x,3) for x in m]
>        return fps
> fps =  p4.map(fps_calc,ms)
> "
> ==>
> "TypeError: 'Mol' object is not iterable"

I think what you want to do here is:

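#-------------------
from multiprocessing import Pool
from rdkit.Chem.rdMolDescriptors import GetMorganFingerprint

def fps_calc(m):
    # m is a single molecule: map passes one element of ms per call
    return GetMorganFingerprint(m, 3)

# create the pool after fps_calc is defined so the workers can find it
p4 = Pool(processes=4)
fps = p4.map(fps_calc, ms)
#-------------------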

The map method takes a function and a sequence of objects and applies
that function to each object in the sequence, so fps_calc receives one
molecule at a time, not the whole list.
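
To make the difference concrete without needing RDKit, here is a small
sketch that uses plain integers in place of molecules. The names per_list
and per_item are made up for illustration: per_list mirrors the original
fps_calc (it loops over its argument), per_item mirrors the corrected one.

```python
from multiprocessing import Pool


def per_list(items):
    # Expects a whole list, like the original fps_calc.
    # Called with a single element it fails, because an int is not iterable.
    return [x * 2 for x in items]


def per_item(item):
    # Expects exactly one element, which is what map hands to it.
    return item * 2


if __name__ == "__main__":
    data = [1, 2, 3, 4]
    # The pool is created after the worker function is defined.
    with Pool(processes=2) as pool:
        # map calls per_item once per element and keeps the input order
        print(pool.map(per_item, data))
```

Passing per_list to map instead would raise a TypeError for the same
reason as the "'Mol' object is not iterable" message above: each call
gets one element, and the function then tries to loop over it.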

> (2)
> "
> from multiprocessing import Pool
> p4 = Pool(processes=4)
> def distij(i,j,fps=fps):
>        return 1-DataStructs.DiceSimilarity(fps[i],fps[j])
>
> def DivSelection(distij,nfps,quantity_train):
>        picker = MaxMinPicker()
>        picked_indices = picker.LazyPick(distij,nfps,quantity_train)
>        return picked_indices
> "
> pickIndices = p4.map(DivSelection, ???)

The MaxMinPicker does not provide any way to parallelize the picking
with multiprocessing.
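
One way to see why this kind of picking does not split up easily: a
MaxMin-style selection is greedy, so each pick depends on everything
picked before it. The sketch below is a plain-Python illustration of that
idea and not RDKit's implementation; the function maxmin_pick, its
arguments, and the choice of the first item are all assumptions made for
the example.

```python
def maxmin_pick(dist, n_items, n_pick, first=0):
    """Greedy MaxMin selection over indices 0..n_items-1 (illustrative only)."""
    if not 0 < n_pick <= n_items:
        raise ValueError("n_pick must be between 1 and n_items")
    picked = [first]
    # min_dist[i] = distance from item i to its nearest already-picked item
    min_dist = [dist(i, first) for i in range(n_items)]
    while len(picked) < n_pick:
        # The next pick is the unpicked item farthest from all picks so far,
        # so this step cannot start before the previous one has finished.
        candidates = (i for i in range(n_items) if i not in picked)
        nxt = max(candidates, key=lambda i: min_dist[i])
        picked.append(nxt)
        for i in range(n_items):
            min_dist[i] = min(min_dist[i], dist(i, nxt))
    return picked


if __name__ == "__main__":
    points = [0.0, 1.0, 2.0, 10.0]
    print(maxmin_pick(lambda i, j: abs(points[i] - points[j]), len(points), 3))
```

In this sketch the loop over picks is strictly sequential; only the
distance evaluations inside a single step are independent of each other.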


-greg

------------------------------------------------------------------------------
All the data continuously generated in your IT infrastructure contains a
definitive record of customers, application performance, security
threats, fraudulent activity and more. Splunk takes this data and makes
sense of it. Business sense. IT sense. Common sense.
http://p.sf.net/sfu/splunk-d2d-oct
_______________________________________________
Rdkit-discuss mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/rdkit-discuss
