Re: [Scikit-learn-general] Anyone experience hanging when parallelizing fits?

2014-05-27 Thread Ronnie Ghose
Hmm, this is going to be annoying, but have you debugged it at all? E.g., have you tried seeing if you can find where it is when it hits that 0 CPU, or tried printing something out midway and seeing where it just stops? On Wed, May 28, 2014 at 12:56 AM, Gael Varoquaux < [email protected]

Re: [Scikit-learn-general] Anyone experience hanging when parallelizing fits?

2014-05-27 Thread Gael Varoquaux
On Tue, May 27, 2014 at 03:16:59PM -0700, Chris Holdgraf wrote: > So, the strange thing about this is that I've definitely run > regressions with larger matrices in the past, and haven't had issues > before. This is on a cluster with ~94 gigs of ram, and in the past I've > exceeded this limit and i

Re: [Scikit-learn-general] Scikit-learn-general Digest, Vol 52, Issue 37

2014-05-27 Thread Chris Holdgraf
Ha, no, buying more RAM would probably not be viable (I want to avoid starting a war with my sysadmin). I do think that memory is the issue here. I ran this code via the command line and noticed that it was throwing this error: Exception in thread Thread-3: > Traceback (most recent call last): >

Re: [Scikit-learn-general] Python notebooks for statlearning course exercises

2014-05-27 Thread Fernando Perez
On Sun, May 25, 2014 at 9:04 AM, Gael Varoquaux < [email protected]> wrote: > That's great! > Indeed, thrilled to see this! > I've given you a bit more added publicity: > https://twitter.com/GaelVaroquaux/status/470596202665623552 > Duly retweeted :) Sujit, could I suggest that y

Re: [Scikit-learn-general] Anyone experience hanging when parallelizing fits?

2014-05-27 Thread Ronnie Ghose
stupid question but, have you used munos/nagios/similar to see if there is anything that suddenly is hitting 0 or ~99.99% w.r.t availability / utilization? On Tue, May 27, 2014 at 6:31 PM, Kyle Kastner wrote: > One thought - who ELSE is using your cluster now? Maybe there is another > task whic

Re: [Scikit-learn-general] Anyone experience hanging when parallelizing fits?

2014-05-27 Thread Kyle Kastner
One thought - who ELSE is using your cluster now? Maybe there is another task which is playing mean with system level memory access? I have seen some weird things when multiple heavy-duty processes *should* be playing nice, that also silenced the canaries in my scripts. One useful thing (which you

Re: [Scikit-learn-general] Anyone experience hanging when parallelizing fits?

2014-05-27 Thread Chris Holdgraf
So, the strange thing about this is that I've definitely run regressions with larger matrices in the past, and haven't had issues before. This is on a cluster with ~94 gigs of ram, and in the past I've exceeded this limit and it has usually thrown an error (one of our sysadmin's scripts), not silen

Re: [Scikit-learn-general] Anyone experience hanging when parallelizing fits?

2014-05-27 Thread Kyle Kastner
where very bad things (TM) is big matrices constantly swapping places with each other, so no computing can actually happen. Another thing to try might be numpy.memmap - I have seen some very great savings on parallel, large matrix work using it (since the tasks I was doing only needed one sample a
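A minimal sketch of the numpy.memmap suggestion above, assuming the data fits a plain float32 array on disk. The file location, array shape, and row index are hypothetical and chosen only for illustration; this is not code from the thread.

```python
import os
import tempfile

import numpy as np

# Hypothetical file location and shape, for illustration only.
path = os.path.join(tempfile.mkdtemp(), "X.dat")
shape = (1000, 50)

# Create a disk-backed array and fill it once.
X = np.memmap(path, dtype=np.float32, mode="w+", shape=shape)
X[:] = 1.0
X.flush()  # make sure the data is written to the file
del X      # drop the writable mapping

# Reopen read-only; each worker could do this instead of receiving a copy.
X_ro = np.memmap(path, dtype=np.float32, mode="r", shape=shape)

# Reading one row touches only the pages holding that sample.
row = np.array(X_ro[10])
print(row.shape, row.dtype)
```

Because the array lives in a file, a task that only needs one sample at a time does not have to hold the full matrix in RAM.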

Re: [Scikit-learn-general] Anyone experience hanging when parallelizing fits?

2014-05-27 Thread Kyle Kastner
What is your overall memory usage like when this happens? Sounds like classic memory swapping/thrashing to me - what are your system specs? One quick thing to try might be to change the dtype of the matrices to save some space. float32 vs float64 can make a large memory difference if you don't nee
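A small sketch of the dtype suggestion above, assuming the reduced precision of float32 is acceptable for the problem. The matrix shape is hypothetical and chosen only for illustration.

```python
import numpy as np

# Hypothetical shape, for illustration only.
n_samples, n_features = 1000, 200

X64 = np.zeros((n_samples, n_features), dtype=np.float64)
X32 = X64.astype(np.float32)  # makes a converted copy

# nbytes reports the size of the array data in bytes.
print(X64.nbytes)  # 8 bytes per element
print(X32.nbytes)  # 4 bytes per element
```

Note that `astype` creates a copy, so the saving only materializes once the float64 original has been released.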

[Scikit-learn-general] Anyone experience hanging when parallelizing fits?

2014-05-27 Thread Chris Holdgraf
In particular, it seems that when I've got matrices which are too big, the forked processes will hang and never finish (i.e., they take up 0 computing time and remain that way indefinitely). I've noticed this problem specifically when using cross_val_score with Ridge regression. This isn't a prob

Re: [Scikit-learn-general] Digit recognition

2014-05-27 Thread klo uo
Hi, Andy I didn't do grid-search. I was just assuming that grid-search returns optimal parameters for fitting. I may be easily wrong. So I just used your parameters. But I stopped the fitting process. I'll try with smaller subset from MNIST this weekend, as I got some assignments to do, while this

Re: [Scikit-learn-general] Digit recognition

2014-05-27 Thread Andy
Hi Klo. Actually, back then I was using libSVM directly, not scikit-learn, which I hadn't discovered yet ;) Also, if you actually want to do that grid-search, it might take forever. Why do you want to do the grid-search on MNIST again? Andy On 05/25/2014 07:23 AM, klo uo wrote: Hi Caleb,