[scikit-learn] Scaling model selection on a cluster

2016-08-07 Thread Vlad Ionescu
Hello, I am interested in scaling grid searches on an HPC LSF cluster with about 60 nodes, each with 20 cores. I thought I could just set n_jobs=1000 and then submit a job with bsub -n 1000, but then I dug deeper and understood that the underlying joblib used by scikit-learn will create all of those j
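
To put rough numbers on the concern above, here is a minimal back-of-the-envelope sketch. It assumes that joblib's default local backends start every worker on the one machine that runs the script, so a request for 1000 workers would land on a single 20-core node. The helper names `oversubscription` and `nodes_needed` are hypothetical and exist only for this illustration; they are not part of joblib, scikit-learn, or LSF.

```python
import math

# Figures quoted in the message above.
NODES = 60             # "about 60 nodes"
CORES_PER_NODE = 20    # "each with 20 cores"
REQUESTED_JOBS = 1000  # n_jobs=1000 / bsub -n 1000


def oversubscription(n_jobs: int, cores: int) -> float:
    """Worker processes per core when n_jobs workers share `cores` cores."""
    return n_jobs / cores


def nodes_needed(n_jobs: int, cores_per_node: int) -> int:
    """Smallest number of whole nodes that gives every worker its own core."""
    return math.ceil(n_jobs / cores_per_node)


total_cores = NODES * CORES_PER_NODE
# Assumption: all workers end up on one node (local joblib backend).
single_node_load = oversubscription(REQUESTED_JOBS, CORES_PER_NODE)
spread_nodes = nodes_needed(REQUESTED_JOBS, CORES_PER_NODE)

print(f"cluster cores: {total_cores}")
print(f"workers per core if all {REQUESTED_JOBS} run on one node: {single_node_load:.0f}")
print(f"nodes needed for one worker per core: {spread_nodes}")
```

Under these assumptions the cluster has enough cores for the request, but only if the workers are actually spread across nodes, which is why a distributed scheduler comes up later in the thread.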

Re: [scikit-learn] Scaling model selection on a cluster

2016-08-07 Thread Vlad Ionescu
node. Will look into it some more.
On Sun, Aug 7, 2016 at 12:06 PM federico vaggi wrote:
> This might be interesting to you:
>
> http://blaze.pydata.org/blog/2015/10/19/dask-learn/
>
> On Sun, 7 Aug 2016 at 10:42 Vlad Ionescu wrote:
>> Hello,
>>

Re: [scikit-learn] Scaling model selection on a cluster

2016-08-07 Thread Vlad Ionescu
I copied and pasted the example from the link you gave and only made the search take longer. I used dask-ssh to set up the worker nodes and a scheduler, then connected to the scheduler in my code. Tweaking the n_jobs parameter of the randomized search does not yield any performance benefit. The connection

Re: [scikit-learn] Scaling model selection on a cluster

2016-08-07 Thread Vlad Ionescu
> It probably requires tweaking and understanding the tradeoffs.
>
> G
>
> On Sun, Aug 07, 2016 at 09:25:47PM +, Vlad Ionescu wrote:
> > I copy pasted the example in the link you gave, only made the search take a
> > longer time. I used dask-ssh to set up worker node

Re: [scikit-learn] Scaling model selection on a cluster

2016-08-08 Thread Vlad Ionescu
of them mention how you can ensure or check that each worker is doing work either. If there's anything I can do to help debug this (I realize it could be a problem on my end though), please let me know.
On Mon, Aug 8, 2016 at 9:48 AM Vlad Ionescu wrote:
> I don't think they'r