Hi Olivier.
Thanks for your suggestions.
It certainly seems easier to directly use IPython but I agree with
Gael about not wanting to add additional dependencies.

I'll try doing it in joblib, and if that is too hard, I'll try doing it
directly in sklearn.

Let's see how this goes!

Cheers,
Andy

On 01/27/2012 04:58 PM, Olivier Grisel wrote:
> 2012/1/27 Andreas<[email protected]>:
>
> I would advise you to start by experimenting with your own version of
> GridSearchCV (derived from the sklearn version), passing a
> LoadBalancedView instance as an argument to the constructor and using
> it in the fit method instead of calling joblib.
>
> The same approach could be followed for the ensemble meta-estimators.
>
> If you can get something working in an efficient way on your cluster,
> put this on a gist and send an email on the mailing list and we will
> discuss how to best factorize this.
>
> This might be done by extending joblib to be able to deal with
> distributed infrastructure, or at the sklearn level by refactoring the
> existing classes to make them more pluggable with IPython.parallel, or
> by starting a new github repo (scikit-learn-cluster or something) to
> contribute utilities to train and evaluate sklearn models on an HPC or
> cloud cluster.
>
>
>    
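
A minimal sketch of the pattern Olivier describes: a grid search that
receives a view object in its constructor and dispatches one task per
parameter combination through it in fit, instead of calling joblib. It
only assumes the view offers apply_async(func, *args) returning an object
with get(), which is how a LoadBalancedView from IPython.parallel is
used. Everything else here (LocalView, LocalResult, ViewGridSearch,
iter_grid, line_fit_score) is a hypothetical stand-in so the example runs
without a cluster or sklearn; a real GridSearchCV subclass would also
have to clone the estimator and run cross-validation for each task.

```python
import itertools
from concurrent.futures import ThreadPoolExecutor


class LocalResult:
    """Hypothetical stand-in for an AsyncResult: only get() is mimicked."""

    def __init__(self, future):
        self._future = future

    def get(self):
        return self._future.result()


class LocalView:
    """Hypothetical local stand-in for a LoadBalancedView (thread pool)."""

    def __init__(self, n_workers=2):
        self._pool = ThreadPoolExecutor(max_workers=n_workers)

    def apply_async(self, func, *args, **kwargs):
        return LocalResult(self._pool.submit(func, *args, **kwargs))

    def shutdown(self):
        self._pool.shutdown()


def iter_grid(grid):
    """Yield one dict per combination of the values in grid."""
    names = sorted(grid)
    for values in itertools.product(*(grid[name] for name in names)):
        yield dict(zip(names, values))


def line_fit_score(params, data):
    """Toy score: negative squared error of y = a * x + b on data."""
    return -sum((params["a"] * x + params["b"] - y) ** 2 for x, y in data)


class ViewGridSearch:
    """Grid search that submits its tasks through the view it was given."""

    def __init__(self, score_func, grid, view):
        self.score_func = score_func
        self.grid = grid
        self.view = view

    def fit(self, data):
        # Submit every combination first so the view can balance the load,
        # then collect the results.
        pending = [
            (params, self.view.apply_async(self.score_func, params, data))
            for params in iter_grid(self.grid)
        ]
        self.scores_ = [(result.get(), params) for params, result in pending]
        self.best_score_, self.best_params_ = max(
            self.scores_, key=lambda item: item[0]
        )
        return self


view = LocalView()
data = [(0, 1), (1, 3), (2, 5)]  # points on y = 2 * x + 1
grid = {"a": [1, 2, 3], "b": [0, 1, 2]}
search = ViewGridSearch(line_fit_score, grid, view).fit(data)
view.shutdown()
print(search.best_params_, search.best_score_)
```

With a real cluster, the LocalView instance would be replaced by a view
obtained from the IPython.parallel client, and the scoring function and
its arguments would need to be picklable and importable on the engines.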


_______________________________________________
Scikit-learn-general mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/scikit-learn-general
