hi andreas,
a few notes:
- a sprint planned for pycon will be looking at parallel computing with
scikit-learn and ipython (http://wiki.ipython.org/PyCon12Sprint)
- ipython currently uses a "grab an engine and don't release it" strategy in
the context of distributed systems like SGE/PBS/LSF. this means that load
distribution happens when the engines are started, not when tasks are executed.
depending on your cluster this may be a good or a bad thing.
- in nipype we support distributed computing in two ways: using ipython as
the point of distribution, or interfacing directly with the cluster engine.
here is the ipython plugin:
https://github.com/nipy/nipype/blob/master/nipype/pipeline/plugins/ipythonxi.py
- there is also a python library called soma workflow that offers a
python interface to distributed computing using drmaa.
which route to take will mainly depend on how the data gets to the compute
node (by files, by pickling, or by shared memory), and on whether the file
system is shared or the data has to be moved between processes.
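to make that last point concrete, here is a minimal sketch (standard library only, not nipype or ipython code) of two ways of handing data to a worker: pickling the data into the task message, or writing it once to a shared file system and sending only the path. all function names are made up for illustration, and a temporary directory stands in for a shared file system.

```python
import os
import pickle
import tempfile


def column_sums(rows):
    """The actual work done on the compute node: sum each column."""
    return [sum(col) for col in zip(*rows)]


def make_inline_task(rows):
    """Route 1: the data travels inside the pickled task message."""
    return pickle.dumps({"kind": "inline", "payload": rows})


def make_file_task(rows, shared_dir):
    """Route 2: write the data to a directory the worker can also see
    (assumed shared) and send only the path."""
    path = os.path.join(shared_dir, "rows.pkl")
    with open(path, "wb") as f:
        pickle.dump(rows, f)
    return pickle.dumps({"kind": "path", "payload": path})


def run_on_worker(message):
    """Stand-in for a remote engine: decode the message, fetch the data
    by whichever route was used, then do the work."""
    task = pickle.loads(message)
    if task["kind"] == "inline":
        rows = task["payload"]
    else:
        with open(task["payload"], "rb") as f:
            rows = pickle.load(f)
    return column_sums(rows)


if __name__ == "__main__":
    rows = [[i, 2 * i] for i in range(1000)]
    with tempfile.TemporaryDirectory() as shared_dir:
        inline = make_inline_task(rows)
        by_file = make_file_task(rows, shared_dir)
        # Both routes must give the same answer.
        assert run_on_worker(inline) == run_on_worker(by_file)
        # Compare how many bytes each route sends as the task message.
        print("inline message bytes:", len(inline))
        print("path message bytes:  ", len(by_file))
```

the inline message grows with the data, while the path message stays small but only works if the worker sees the same file system. that trade-off is what decides between the routes above.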
cheers,
satra
On Fri, Jan 27, 2012 at 9:44 AM, Andreas <[email protected]> wrote:
> Hi everybody.
> This question basically goes out to Gael, but might also be interesting
> for others.
> I am using sklearn on an SGE cluster at the moment and it is not as nice
> as it could be.
> So I was wondering whether there would be a non-intrusive way to make sklearn
> parallelize over the cluster.
> At the moment all parallelism is handled by joblib. On the other hand it seems
> IPython can talk to the SGE scheduling.
> So I would love to have a way for joblib to talk to IPython.
>
> Is there an easy way to make this possible?
> I was thinking about monkey-patching the Parallel class to use
> "LoadBalancedView" from IPython.
> Do you think this is feasible?
>
> Another question is whether there are additional assumptions made
> by sklearn about the way the parallelism works.
> IPython basically provides a "map" interface similar to "Parallel",
> so I would hope that there are no problems. Do you think there will be?
>
> Any help would be welcome.
>
> If I actually get this to work, I feel this might be quite a success
> story for sklearn ;)
>
> Cheers,
> Andy
>
>
> ------------------------------------------------------------------------------
> Try before you buy = See our experts in action!
> The most comprehensive online learning library for Microsoft developers
> is just $99.99! Visual Studio, SharePoint, SQL - plus HTML5, CSS3, MVC3,
> Metro Style Apps, more. Free future releases when you subscribe now!
> http://p.sf.net/sfu/learndevnow-dev2
> _______________________________________________
> Scikit-learn-general mailing list
> [email protected]
> https://lists.sourceforge.net/lists/listinfo/scikit-learn-general
>