Stupid question, but have you used Munin/Nagios/similar to see if
anything is suddenly hitting 0 or ~99.99% w.r.t. availability /
utilization?
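
If a full monitoring stack isn't set up, a quick check along the same lines is possible by hand. Below is a minimal sketch, assuming a Linux node that exposes /proc/meminfo; the function names are made up for illustration and are not part of any monitoring tool.

```python
def parse_meminfo(text):
    """Parse /proc/meminfo-style text into a dict of integer values (mostly kB)."""
    info = {}
    for line in text.splitlines():
        key, sep, rest = line.partition(":")
        if not sep:
            continue  # skip anything that is not a "Key: value" line
        fields = rest.split()
        if fields and fields[0].isdigit():
            info[key.strip()] = int(fields[0])
    return info


def memory_report(info):
    """Return (fraction of RAM in use, fraction of swap in use)."""
    total = info["MemTotal"]
    # MemAvailable is only reported by newer kernels; fall back to MemFree.
    available = info.get("MemAvailable", info["MemFree"])
    swap_total = info.get("SwapTotal", 0)
    swap_free = info.get("SwapFree", 0)
    ram_used = 1.0 - available / total
    swap_used = 1.0 - swap_free / swap_total if swap_total else 0.0
    return ram_used, swap_used


def current_memory_report(path="/proc/meminfo"):
    """Read the live values on a Linux machine (assumed path)."""
    with open(path) as f:
        return memory_report(parse_meminfo(f.read()))
```

Calling `current_memory_report()` in a loop while the fit runs, and printing the two fractions, would show whether RAM is pinned near 100% and swap use is climbing at the moment things hang.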


On Tue, May 27, 2014 at 6:31 PM, Kyle Kastner <[email protected]> wrote:

> One thought - who ELSE is using your cluster now? Maybe there is another
> task which is playing mean with system-level memory access? I have seen
> some weird things when multiple heavy-duty processes that *should* be
> playing nice were not, and that also silenced the canaries in my scripts.
>
> One useful thing (which your sysadmin may already be doing!) is to use
> rrdtool to log/watch system level behavior and the behavior of your task as
> it runs. Sometimes interesting anomalies can pinpoint a problem you didn't
> see before.
>
> Does this happen with other regressors (SGD specifically)? Or only Ridge?
>
> Another idea, that totally avoids this diagnosis procedure, would be to do
> PCA or another dimensionality reduction (if you haven't already done so) to
> reduce the 800 dimensions down to ~100, which seemed to work well for you.
> You do lose some information, but you could probably look at the explained
> variance to make sure that the 100 reduced components describe a large
> amount (ideally 90+%) of your variance.
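
The explained-variance check described above can be sketched without fitting a full model. This is a rough illustration using plain NumPy's SVD, not anything taken from the original poster's code; `explained_variance_ratio` and `n_components_for` are hypothetical helper names. scikit-learn's `PCA` reports the same quantity as `explained_variance_ratio_` after fitting.

```python
import numpy as np


def explained_variance_ratio(X):
    """Fraction of total variance captured by each principal component."""
    Xc = X - X.mean(axis=0)  # PCA operates on centered data
    # Squared singular values of the centered data are proportional
    # to the variance along each principal component.
    s = np.linalg.svd(Xc, full_matrices=False, compute_uv=False)
    var = s ** 2
    return var / var.sum()


def n_components_for(X, target=0.90):
    """Smallest number of components whose cumulative ratio reaches target."""
    cumulative = np.cumsum(explained_variance_ratio(X))
    k = int(np.searchsorted(cumulative, target)) + 1
    return min(k, len(cumulative))  # guard against rounding at the top end
```

If `n_components_for(X, 0.90)` comes back well above 100 for the 800-feature matrix, reducing to ~100 components would discard more than 10% of the variance, which is worth knowing before relying on the reduced data.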
>
> That said, at 94GB already, buying more RAM is probably not the choice :)
>
>
> On Tue, May 27, 2014 at 5:16 PM, Chris Holdgraf <[email protected]> wrote:
>
>> So, the strange thing about this is that I've definitely run regressions
>> with larger matrices in the past, and haven't had issues before. This is on
>> a cluster with ~94 gigs of ram, and in the past I've exceeded this limit
>> and it has usually thrown an error (one of our sysadmin's scripts), not
>> silently hung.
>>
>> Chris
>>
>> From: Kyle Kastner <[email protected]>
>>> To: [email protected]
>>> Cc:
>>> Date: Tue, 27 May 2014 15:48:20 -0500
>>> Subject: Re: [Scikit-learn-general] Anyone experience hanging when parallelizing fits?
>>>
>>> What is your overall memory usage like when this happens? Sounds like
>>> classic memory swapping/thrashing to me - what are your system specs?
>>> One quick thing to try might be to change the dtype of the matrices to
>>> save some space. float32 vs float64 can make a large memory difference if
>>> you don't need double precision. Also as far as I know, sklearn/joblib
>>> doesn't do any kind of scheduling or optimization based on available
>>> resources, though someone may correct me here. This means that if the memory
>>> required to run n jobs is much greater than your system memory, very bad
>>> things (TM) will happen.
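
To put rough numbers on the dtype and n_jobs points above, here is a back-of-the-envelope sketch. The matrix shape and job count are made-up placeholders (the thread only mentions 800 features and ~94 GB of RAM), and the one-full-copy-per-job assumption is deliberately pessimistic, since joblib may share or memory-map data in some setups.

```python
import numpy as np


def matrix_gib(n_rows, n_cols, dtype):
    """Size in GiB of a dense n_rows x n_cols array of the given dtype."""
    return n_rows * n_cols * np.dtype(dtype).itemsize / 2**30


def worst_case_gib(n_rows, n_cols, dtype, n_jobs):
    """Pessimistic assumption: every job holds its own full copy of the data."""
    return n_jobs * matrix_gib(n_rows, n_cols, dtype)


# Converting an existing array halves its footprint.
X = np.zeros((1000, 800))      # float64 by default
X32 = X.astype(np.float32)     # same shape, half the bytes, less precision

# Hypothetical problem size, for illustration only.
n_rows, n_cols, n_jobs = 2_000_000, 800, 8
for dtype in ("float64", "float32"):
    print(dtype,
          round(matrix_gib(n_rows, n_cols, dtype), 1), "GiB per copy,",
          round(worst_case_gib(n_rows, n_cols, dtype, n_jobs), 1), "GiB worst case")
```

Comparing the worst-case figure against the machine's RAM gives a quick sense of whether a given `n_jobs` is likely to push the node into swap.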
>>>
>>>
>>> --
>>> _____________________________________
>>> PhD Candidate in Neuroscience | UC Berkeley <http://hwni.org/>
>>> Editor and Web Master | Berkeley Science Review <http://sciencereview.berkeley.edu/>
>>> _____________________________________
>>>
>>
>>
>> ------------------------------------------------------------------------------
>> The best possible search technologies are now affordable for all
>> companies.
>> Download your FREE open source Enterprise Search Engine today!
>> Our experts will assist you in its installation for $59/mo, no commitment.
>> Test it for FREE on our Cloud platform anytime!
>>
>> http://pubads.g.doubleclick.net/gampad/clk?id=145328191&iu=/4140/ostg.clktrk
>> _______________________________________________
>> Scikit-learn-general mailing list
>> [email protected]
>> https://lists.sourceforge.net/lists/listinfo/scikit-learn-general
>>
>>
>
>
