2012/1/27 Andreas <[email protected]>:

I would advise you to start by experimenting with your own version of
GridSearchCV (derived from the sklearn class), passing a
LoadBalancedView instance as an argument to the constructor and using it
in the fit method instead of calling joblib.
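
As a rough illustration of that idea, here is a minimal sketch rather than
the actual sklearn implementation. It assumes the view exposes
map_sync(func, *sequences), as an IPython.parallel LoadBalancedView does.
The names ViewGridSearch, SerialView, fit_and_score and iter_param_grid are
hypothetical, and to keep the example self-contained it does not derive
from sklearn's GridSearchCV and uses a single train/test split instead of
cross-validation.

```python
import copy
from itertools import product


def iter_param_grid(param_grid):
    """Yield one dict per parameter combination of a {name: [values]} grid."""
    names = sorted(param_grid)
    for values in product(*(param_grid[name] for name in names)):
        yield dict(zip(names, values))


def fit_and_score(estimator, params, X_train, y_train, X_test, y_test):
    """Fit a copy of the estimator with the given params and return its score.

    Kept at module level so that it can be shipped to a remote worker.
    """
    est = copy.deepcopy(estimator)
    est.set_params(**params)
    est.fit(X_train, y_train)
    return est.score(X_test, y_test)


class SerialView(object):
    """Hypothetical local stand-in that mimics a view's map_sync method."""

    def map_sync(self, func, *sequences):
        return list(map(func, *sequences))


class ViewGridSearch(object):
    """Hypothetical grid search that dispatches each fit through a view."""

    def __init__(self, estimator, param_grid, view):
        self.estimator = estimator
        self.param_grid = param_grid
        self.view = view  # assumption: any object with map_sync(func, *seqs)

    def fit(self, X_train, y_train, X_test, y_test):
        candidates = list(iter_param_grid(self.param_grid))
        n = len(candidates)
        # One task per parameter combination; the view decides where it runs.
        scores = self.view.map_sync(
            fit_and_score,
            [self.estimator] * n, candidates,
            [X_train] * n, [y_train] * n, [X_test] * n, [y_test] * n)
        best = max(range(n), key=lambda i: scores[i])
        self.best_params_ = candidates[best]
        self.best_score_ = scores[best]
        self.scores_ = list(zip(candidates, scores))
        return self
```

On a real cluster the SerialView instance would be replaced by a view
obtained from an IPython.parallel client. Note that this naive version
sends the data along with every task, which is likely too costly for
large datasets.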

The same approach could be followed for the ensemble meta-estimators.

If you can get something working efficiently on your cluster, put it in
a gist and send an email to the mailing list, and we will discuss how
best to factor this out.

This might be done by extending joblib to deal with distributed
infrastructure, at the sklearn level by refactoring the existing
classes to make them more pluggable with IPython.parallel, or by
starting a new github repo (scikit-learn-cluster or something similar)
to contribute utilities for training and evaluating sklearn models on
an HPC or cloud cluster.


-- 
Olivier
http://twitter.com/ogrisel - http://github.com/ogrisel

_______________________________________________
Scikit-learn-general mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/scikit-learn-general
