Re: [Scikit-learn-general] threading error when training a RFC on a big dataset

2012-09-25 Thread Joseph Turian
Chris Lin, iirc, has advocated partitioning the examples and then concatenating the individual classifiers. You could do that and then do a second pass of learning: find the 1% of examples that are the hardest for the ensemble and learn over them. Regardless, it will be ad hoc unless you use an out

Re: [Scikit-learn-general] threading error when training a RFC on a big dataset

2012-09-25 Thread Olivier Grisel
2012/9/24 Joseph Turian jos...@metaoptimize.com: Chris Lin, iirc, has advocated partitioning the examples and then concatenating the individual classifiers. You could do that and then do a second pass of learning: find the 1% of examples that are the hardest for the ensemble and learn over them.
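
A rough sketch of the partition-then-combine idea quoted above, for anyone who wants to try it. This is only one possible reading, not code from the thread: the helper names (fit_partitioned_forests, ensemble_proba, hardest_examples, add_second_pass_forest) are made up for illustration, "combining" is done here by averaging predict_proba rather than merging tree lists, and "hardest" is assumed to mean the lowest averaged probability for the true class. X and y are assumed to be NumPy arrays.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier


def fit_partitioned_forests(X, y, n_parts=4, n_estimators=10, seed=0):
    """Train one forest per disjoint chunk of the data (hypothetical helper)."""
    rng = np.random.RandomState(seed)
    order = rng.permutation(len(y))
    forests = []
    for i, idx in enumerate(np.array_split(order, n_parts)):
        forest = RandomForestClassifier(
            n_estimators=n_estimators, random_state=seed + i
        )
        forests.append(forest.fit(X[idx], y[idx]))
    return forests


def ensemble_proba(forests, X, classes):
    """Average class probabilities over all forests.

    `classes` must be the sorted array of all labels (e.g. np.unique(y)).
    Columns are aligned on it, so a forest that saw only some of the
    classes still contributes to the right columns.
    """
    total = np.zeros((X.shape[0], len(classes)))
    for forest in forests:
        cols = np.searchsorted(classes, forest.classes_)
        total[:, cols] += forest.predict_proba(X)
    return total / len(forests)


def hardest_examples(forests, X, y, classes, fraction=0.01):
    """Indices of the examples whose true class gets the lowest probability."""
    proba = ensemble_proba(forests, X, classes)
    true_cols = np.searchsorted(classes, y)
    p_true = proba[np.arange(len(y)), true_cols]
    n_hard = max(1, int(fraction * len(y)))
    return np.argsort(p_true)[:n_hard]


def add_second_pass_forest(forests, X, y, classes, fraction=0.01, seed=0):
    """Second pass: fit one extra forest on the hardest examples only."""
    hard = hardest_examples(forests, X, y, classes, fraction)
    extra = RandomForestClassifier(n_estimators=10, random_state=seed)
    return forests + [extra.fit(X[hard], y[hard])]
```

Each chunk only needs to fit in memory on its own, which is the point of the partitioning. Note that the extra forest gets the same weight as the others in the average here; that weighting is an arbitrary choice in this sketch.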

Re: [Scikit-learn-general] threading error when training a RFC on a big dataset

2012-09-24 Thread Christian Jauvin
Thank you Olivier for these suggestions. I'd try/test them with pleasure, but meanwhile I discovered that there was just no way the dataset I was trying to use would ever fit in the 72GB of memory of the machine I'm using. So I just scaled it down, and obviously this error is not happening

Re: [Scikit-learn-general] threading error when training a RFC on a big dataset

2012-09-24 Thread Olivier Grisel
I think @glouppe is likely to contribute some evolution for the ensembles of trees models once he gets back from ECML 2012 where he has a paper on those issues.

[Scikit-learn-general] threading error when training a RFC on a big dataset

2012-09-22 Thread Christian Jauvin
Hi, I have recently been running multiple experiments using a RandomForestClassifier (trained with the parallel code option) without encountering any particular problem. However, as soon as I began using a much bigger dataset (with the exact same code), I got this threading error: Exception in