Hello,
I am interested in scaling grid searches on an HPC LSF cluster with about
60 nodes, each with 20 cores. I thought I could just set n_jobs=1000 and then
submit a job with bsub -n 1000, but then I dug deeper and understood that
the underlying joblib used by scikit-learn will create all of those jobs as
processes on a single node rather than across the cluster.
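For concreteness, what I had in mind was roughly the following (the
estimator and parameter grid here are just placeholders, not my real
workload):

    # Submitted as e.g. `bsub -n 1000 python search.py`, hoping that n_jobs
    # would fan the work out over the whole reservation.
    from sklearn.datasets import load_digits
    from sklearn.model_selection import GridSearchCV
    from sklearn.svm import SVC

    digits = load_digits()
    param_grid = {"C": [0.1, 1, 10, 100], "gamma": [1e-4, 1e-3, 1e-2]}

    search = GridSearchCV(SVC(), param_grid, n_jobs=1000)
    # joblib only spawns worker processes on the node running this script
    search.fit(digits.data, digits.target)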
This might be interesting to you:
http://blaze.pydata.org/blog/2015/10/19/dask-learn/
On Sun, 7 Aug 2016 at 10:42 Vlad Ionescu wrote:
> Hello,
>
> I am interested in scaling grid searches on an HPC LSF cluster with about
> 60 nodes, each with 20 cores. I thought I could just set n_jobs=1000 th
Thanks, that looks interesting. I've looked into dask-learn's grid search (
https://github.com/mrocklin/dask-learn/blob/master/grid_search.py), but it
does not seem to make use of the n_jobs parameter. Will this work in a
distributed fashion? The link you gave seemed to focus more on optimizing
the grid search computation itself.
Could someone disable the Travis cache once and for all please?
I have seen several frustrating incidents where the Travis fails the PR
because of this caching of old files.
I also don't understand why it is enabled in the first place. It would
really be super helpful if it were disabled for good.
hi,
I just flushed all the caches.
HTH
Alex
On Sun, Aug 7, 2016 at 2:39 PM, Raghav R V wrote:
> Could someone disable the Travis cache once and for all please?
>
> I have seen several frustrating incidents where the Travis fails the PR
> because of this caching of old files.
>
> I also don't un
Parallel computing in scikit-learn is built upon joblib. In the
development version of scikit-learn, the included joblib can be extended
with a distributed backend:
http://distributed.readthedocs.io/en/latest/joblib.html
that can distribute code on a cluster.
This is still bleeding edge, but this should let a grid search run across
the nodes of a cluster rather than on a single machine.
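A minimal sketch of how the hook-up looks (the scheduler address is a
placeholder, and the exact backend name has changed between versions of
distributed, so check the page above for your version):

    # Assumes `pip install distributed` and a dask scheduler plus workers
    # already running somewhere on the cluster.
    import joblib
    from distributed import Client
    from math import sqrt

    client = Client("scheduler-host:8786")  # placeholder address

    # Inside this block, joblib-based parallelism (including scikit-learn's
    # n_jobs machinery) ships its tasks to the dask workers instead of
    # spawning local processes.
    with joblib.parallel_backend("dask"):
        results = joblib.Parallel(n_jobs=-1)(
            joblib.delayed(sqrt)(i) for i in range(100))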
I copy-pasted the example in the link you gave, only made the search take
longer. I used dask-ssh to set up worker nodes and a scheduler, then
connected to the scheduler in my code.
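For reference, the setup was along these lines (hostnames and the
parameter grid are placeholders):

    # Cluster started from a login node with something like:
    #   dask-ssh node001 node002 ... node060
    # (scheduler on the first host, one worker per host)
    import joblib
    from distributed import Client
    from sklearn.datasets import load_digits
    from sklearn.model_selection import RandomizedSearchCV
    from sklearn.svm import SVC

    client = Client("node001:8786")  # scheduler launched by dask-ssh

    digits = load_digits()
    param_distributions = {"C": [0.1, 1, 10, 100, 1000],
                           "gamma": [1e-5, 1e-4, 1e-3, 1e-2, 1e-1]}

    search = RandomizedSearchCV(SVC(), param_distributions, n_iter=50,
                                n_jobs=20)  # this is the knob I tweaked
    with joblib.parallel_backend("dask"):
        search.fit(digits.data, digits.target)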
Tweaking the n_jobs parameter for the randomized search does not bring any
performance benefit. The connection
Why do you think it should be disabled instead of fixed?
On 08/07/2016 08:39 AM, Raghav R V wrote:
Could someone disable the Travis cache once and for all please?
I have seen several frustrating incidents where the Travis fails the
PR because of this caching of old files.
I also don't under
My guess is that your model evaluations are too fast, and that you are
not getting the benefits of distributed computing as the overhead is
hiding them.
Anyhow, I don't think that this is ready for prime-time usage. It
probably requires tweaking and understanding the tradeoffs.
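A quick way to check is to time a single fit on one node; if it only takes
a fraction of a second, the cost of scheduling tasks and shipping data to
remote workers can easily dominate (sketch with a placeholder model and
dataset):

    import time
    from sklearn.datasets import load_digits
    from sklearn.svm import SVC

    digits = load_digits()
    clf = SVC(C=10, gamma=1e-3)

    start = time.time()
    clf.fit(digits.data, digits.target)
    # Compare this with the per-task overhead of the distributed scheduler
    # plus the time needed to move the data to the workers.
    print("single fit took %.3f s" % (time.time() - start))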
G
On Sun, Aug 07,
I don't think they're too fast. I tried with slower models and bigger data
sets as well. I get the best results with n_jobs=20, which is the number of
cores on a single node. Anything below is considerably slower; anything
above is mostly the same, sometimes a little slower.
Is there a way to see