Re: [Scikit-learn-general] RandomForestClassifier w/ IPython.parallel

2014-02-07 Thread Peter Prettenhofer
Hi Alessandro, you might want to look into this presentation by Olivier: https://speakerdeck.com/ogrisel/growing-randomized-trees-in-the-cloud-1 It should be pretty much what you need. The code is here: https://github.com/pydata/pyrallel Best, Peter 2014-02-07 23:28 GMT+01:00 Alessandro Gagliar

[Scikit-learn-general] RandomForestClassifier w/ IPython.parallel

2014-02-07 Thread Alessandro Gagliardi
Hi All, I want to run a large sklearn.ensemble.RandomForestClassifier (with maybe dozens or even hundreds of trees and 100,000 samples). My desktop won’t handle this, so I want to try using StarCluster. RandomForestClassifier seems to parallelize easily, but I don’t know how I would split it
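
One possible way to split a forest like the one described above, as a minimal sketch and not necessarily what any existing tool does: fit several small forests independently, each with its own seed, then merge them by concatenating their fitted trees. The helper names train_sub_forest and combine_forests are hypothetical, the data is synthetic, and the sub-forests are fitted in a plain loop here; on a cluster each call would be dispatched to a separate worker. The sketch assumes every worker sees training data containing all classes, so that the sub-forests agree on their class labels.

```python
from copy import deepcopy

from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier


def train_sub_forest(X, y, n_trees, seed):
    """Fit one independent sub-forest (hypothetical helper).

    On a cluster, each call would run on its own worker.
    """
    return RandomForestClassifier(n_estimators=n_trees, random_state=seed).fit(X, y)


def combine_forests(forests):
    """Merge fitted sub-forests by concatenating their trees (hypothetical helper)."""
    combined = deepcopy(forests[0])
    for forest in forests[1:]:
        combined.estimators_ += forest.estimators_
    # Keep the tree count consistent with the merged list of estimators.
    combined.n_estimators = len(combined.estimators_)
    return combined


# Synthetic stand-in data; the real data set would be much larger.
X, y = make_classification(n_samples=1000, n_features=20, random_state=0)

# Four sub-forests of 10 trees each, fitted sequentially for illustration.
sub_forests = [train_sub_forest(X, y, n_trees=10, seed=s) for s in range(4)]
forest = combine_forests(sub_forests)

print(len(forest.estimators_), forest.score(X, y))
```

Because the trees of a random forest are grown independently of one another, the merged model predicts by averaging over all 40 trees, just as a single forest with n_estimators=40 would. Different seeds per sub-forest matter: reusing one seed would produce duplicate trees.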

Re: [Scikit-learn-general] Sparse matrix support for Decision tree implementation

2014-02-07 Thread Felipe Eltermann
Arnaud, I added an issparse attribute to the Splitter base class. Doing so, I think I managed to introduce sparse support without needing to replicate the Splitters' business logic code. I'm working on this branch [1]. I have two questions: 1- I removed a "with nogil" statement [2]. Is there a way to ke