@Sebastian: I ran cross_validation with n_jobs=1 and it did not use any
swap memory; RAM usage stayed quite low (12% at most). However, it will
take much longer to finish this way. Any idea what to try next?
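
As a rough sanity check on these numbers, here is a back-of-the-envelope sketch in Python. It assumes the worst case raised earlier in the thread, namely that every worker process repeats the full n_jobs=1 footprint (about 12% of the 32GB machine). Whether the data really is duplicated per worker is exactly what is still unclear, so treat this as an upper bound. The function names and the 10% headroom are hypothetical, chosen only for illustration.

```python
def estimated_peak_gb(total_ram_gb, single_job_fraction, n_jobs):
    """Worst-case RAM estimate, assuming each worker repeats the
    footprint observed with n_jobs=1 (full data duplication)."""
    per_worker_gb = total_ram_gb * single_job_fraction
    return per_worker_gb * n_jobs


def max_jobs_without_swap(total_ram_gb, single_job_fraction,
                          headroom_fraction=0.1):
    """Largest n_jobs whose worst-case estimate still fits in RAM,
    keeping some headroom (hypothetical 10%) for the OS."""
    usable_gb = total_ram_gb * (1 - headroom_fraction)
    per_worker_gb = total_ram_gb * single_job_fraction
    return max(1, int(usable_gb // per_worker_gb))


TOTAL_RAM_GB = 32           # machine RAM mentioned in the thread
SINGLE_JOB_FRACTION = 0.12  # peak RAM share observed with n_jobs=1

for n_jobs in (1, 4, 10):
    peak = estimated_peak_gb(TOTAL_RAM_GB, SINGLE_JOB_FRACTION, n_jobs)
    status = "fits in RAM" if peak <= TOTAL_RAM_GB else "would need swap"
    print(f"n_jobs={n_jobs}: ~{peak:.1f} GB ({status})")

print("suggested upper bound for n_jobs:",
      max_jobs_without_swap(TOTAL_RAM_GB, SINGLE_JOB_FRACTION))
```

Under that assumption, n_jobs=10 would exceed the available RAM, which would be consistent with the swapping I saw, while an intermediate value might be a reasonable compromise between speed and memory.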

Thanks
Kindest Regards
Waseem

On Fri, Feb 12, 2016 at 9:58 PM, Jacob Schreiber <jmschreibe...@gmail.com>
wrote:

> I don't think the data is copied for tree-based classifiers. They use
> the threading backend, so the threads should all share memory.
>
> On Fri, Feb 12, 2016 at 12:32 PM, Sebastian Raschka <se.rasc...@gmail.com>
> wrote:
>
>> I'd suggest trying n_jobs=1 and checking whether swap memory is used (you
>> don't have to run it to completion). If this runs fine without swap, we can
>> work further from there.
>>
>> On Feb 12, 2016, at 2:57 PM, muhammad waseem <m.waseem.ah...@gmail.com>
>> wrote:
>>
>> @Sebastian: I tried n_jobs=10 (the total is 12) and it still caused
>> the same problem. I could try running it with n_jobs=1, but it would be
>> so slow that it would take ages to complete. The machine has 32GB of
>> RAM, and it started using swap memory once the RAM was full.
>>
>> Is there a way to tackle this, or do you really think that all of this
>> k-fold cross-validation and training should be done using Spark's MLlib?
>>
>> Thanks
>> Regards
>> Waseem
>>
>>
>> On Fri, Feb 12, 2016 at 6:40 PM, Sebastian Raschka <se.rasc...@gmail.com>
>> wrote:
>>
>>> Thanks for the note, Manoj, didn't know that!
>>>
>>> @muhammad So if there's no duplication of data across the processes, I
>>> guess you would also run into trouble with n_jobs=1. But just to make
>>> sure that data duplication is not the issue, could you try running it
>>> with n_jobs=1? If it still runs out of memory then, probably only a
>>> smaller data set or a machine with more memory would help. Here, I'd
>>> probably think about using Spark's MLlib to deal with this particular
>>> dataset.
>>>
>>> On Feb 12, 2016, at 12:30 PM, muhammad waseem <m.waseem.ah...@gmail.com>
>>> wrote:
>>>
>>> Hi Sebastian and Manoj,
>>> @Manoj: What should the value of the max_nbytes parameter be, and will it
>>> affect the results or the time it takes to run cross_validation,
>>> grid_search, etc.?
>>> @Sebastian: Will the Spark implementation also improve memory use,
>>> or just CPU use?
>>>
>>>
>>> Thanks
>>> Kindest Regards
>>>
>>> On Fri, Feb 12, 2016 at 5:29 PM, muhammad waseem <
>>> m.waseem.ah...@gmail.com> wrote:
>>>
>>>> Hi Sebastian and Manoj,
>>>> @Manoj: What should the value of the max_nbytes parameter be, and will it
>>>> affect the results or the time it takes to run cross_validation,
>>>> grid_search, etc.?
>>>>
>>>> Thanks
>>>> Kindest Regards
>>>> Waseem
>>>>
>>>> On Fri, Feb 12, 2016 at 4:42 PM, Sebastian Raschka <
>>>> se.rasc...@gmail.com> wrote:
>>>>
>>>>> Hi, Waseem,
>>>>> I think lowering the value of n_jobs would help; as far as I know,
>>>>> each process gets a copy of the data. I just stumbled upon
>>>>> spark-sklearn a few days ago; maybe that could help as well:
>>>>>
>>>>>
>>>>> https://databricks.com/blog/2016/02/08/auto-scaling-scikit-learn-with-spark.html
>>>>>
>>>>> If I understand correctly, the data is still copied, but here each
>>>>> node gets a copy instead of one machine holding many copies.
>>>>>
>>>>>
>>>>>
>>>>>
>>>>> > On Feb 12, 2016, at 11:35 AM, muhammad waseem <
>>>>> m.waseem.ah...@gmail.com> wrote:
>>>>> >
>>>>> > Hi,
>>>>> >
>>>>> > I am trying to fit my model using regression trees, but the problem
>>>>> > is that it consumes a lot of RAM, which makes my code unresponsive.
>>>>> > From looking at different forums and platforms, I think this is a
>>>>> > common problem. How do you free up memory, or what are the best ways
>>>>> > to run the fitting process/cross-validation without running out of
>>>>> > memory? This problem occurs with almost all regression trees (and, I
>>>>> > think, with other ML algorithms as well). Shall I try running without
>>>>> > n_jobs=-1 and use some other value (e.g. n_jobs=10) in
>>>>> > cross_validation?
>>>>> >
>>>>> > Thanks
>>>>> > Kindest Regards
>>>>> > Waseem
>>>>> >
>>>>> > ------------------------------------------------------------------------------
>>>>> > Site24x7 APM Insight: Get Deep Visibility into Application Performance
>>>>> > APM + Mobile APM + RUM: Monitor 3 App instances at just $35/Month
>>>>> > Monitor end-to-end web transactions and take corrective actions now
>>>>> > Troubleshoot faster and improve end-user experience. Signup Now!
>>>>> > http://pubads.g.doubleclick.net/gampad/clk?id=272487151&iu=/4140
>>>>> > _______________________________________________
>>>>> > Scikit-learn-general mailing list
>>>>> > Scikit-learn-general@lists.sourceforge.net
>>>>> > https://lists.sourceforge.net/lists/listinfo/scikit-learn-general
>>>>>
>>>>>
>>>>>
>>>>>
>>>>
>>>>
>>>>
>>>>
>>>
>>>
>>>
>>>
>>>
>>>
>>>
>>>
>>
>>
>>
>>
>>
>>
>>
>
>
>
