Thank you Andreas!
On Sat, Apr 27, 2013 at 2:03 PM, Andreas Mueller
wrote:
> Hi Youssef.
> I would strongly advise you to use an image-specific random forest
> implementation.
> There is a very good implementation by some other MSRC people:
>
> http://research.microsoft.com/en-us/downloads/03e0c
Hi Youssef.
I would strongly advise you to use an image-specific random forest
implementation.
There is a very good implementation by some other MSRC people:
http://research.microsoft.com/en-us/downloads/03e0ca05-8aa9-49f6-801f-bb23846dc147/
It implements a much more complicated model, decision t
Thank you Peter, I found that the feature extraction was taking a lot of
extra memory and that it was not related to wiseRF, so you were right.
Actually, from "top" it seems the training part was taking only about 20%
more memory than the size of the dataset itself, which is pretty
impressive. So at t
I've tried larger data sets; it wasn't pretty, though those had much fewer features.
On Apr 25, 2013 4:03 AM, "Peter Prettenhofer"
wrote:
> Hi Youssef,
>
> please make sure that you use the latest version of sklearn (>= 0.13) - we
> did some enhancements to the sub-sampling procedure lately.
>
> Looking at
2013/4/25 Youssef Barhomi
>
> thank you very much Peter,
>
> you are right about the n_jobs, something was going wrong with that. When
> n_jobs = -1, for a larger dataset (1e6 samples in this case), no CPU was
> being used and the process hung for a while. Setting n_jobs = 1 made
> everything work.
Hi Brian,
Thanks for your feedback. Were you able to reproduce their results? How big
was the dataset you have processed so far with an RF?
The MS people used a distributed RF, so yes, I am guessing the features
were being computed in parallel on all those cores. Though, I am
still new t
ohh makes total sense now!! thank you Gilles!!
Y
On Thu, Apr 25, 2013 at 2:38 AM, Gilles Louppe wrote:
> Hi Youssef,
>
> Regarding memory usage, you should know that it'll basically blow up if
> you increase the number of jobs. With the current implementation, you'll
> need O(n_jobs * |X| * 2)
thank you very much Peter,
you are right about the n_jobs, something was going wrong with that. When
n_jobs = -1, for a larger dataset (1e6 samples in this case), no CPU was
being used and the process hung for a while. Setting n_jobs = 1 made
everything work.
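For anyone hitting the same hang, a minimal reproduction of the serial workaround might look like the sketch below (toy random data stands in for the real ~1e6-sample dataset):

```python
# Sketch of the workaround: force serial training with n_jobs=1 instead of
# n_jobs=-1, which was hanging on the large dataset. Toy data for illustration.
import numpy as np
from sklearn.ensemble import RandomForestClassifier

rng = np.random.RandomState(0)
X = rng.rand(1000, 10)            # stand-in for the real feature matrix
y = rng.randint(0, 2, 1000)       # stand-in labels

clf = RandomForestClassifier(n_estimators=10, n_jobs=1, random_state=0)
clf.fit(X, y)
print(clf.score(X, y))
```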
Yes, I will look into the IPython parall
Hi Youssef,
please make sure that you use the latest version of sklearn (>= 0.13) - we
did some enhancements to the sub-sampling procedure lately.
Looking at the RandomForest code - it seems that n_jobs=-1 should not be
the issue for the parallel training of the trees, since ``n_jobs =
min(cpu_c
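The capping Peter refers to can be illustrated with a simplified sketch; this is not the actual sklearn helper, just the idea of clamping the job count and splitting the trees across workers:

```python
# Illustrative sketch (not sklearn's real code) of clamping n_jobs and
# partitioning n_estimators across workers.
def partition_estimators(n_estimators, n_jobs, n_cpus):
    if n_jobs < 0:
        # joblib convention: -1 means "use all CPUs"
        n_jobs = max(n_cpus + 1 + n_jobs, 1)
    # never spawn more workers than there are trees to build
    n_jobs = min(n_jobs, n_estimators)
    counts = [n_estimators // n_jobs] * n_jobs
    for i in range(n_estimators % n_jobs):
        counts[i] += 1
    return n_jobs, counts

print(partition_estimators(10, -1, 4))  # -> (4, [3, 3, 2, 2])
```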
Hi Youssef,
Regarding memory usage, you should know that it'll basically blow up if you
increase the number of jobs. With the current implementation, you'll need
O(n_jobs * |X| * 2) in memory space (where |X| is the size of X, in bytes).
That issue stems from the use of joblib which basically forc
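A back-of-the-envelope calculation based on that bound (the function name is illustrative, not part of any library) shows how quickly this blows up:

```python
# Rough memory estimate from the O(n_jobs * |X| * 2) bound above,
# where |X| is the size of X in bytes.
def forest_memory_bytes(n_samples, n_features, n_jobs, itemsize=8):
    X_bytes = n_samples * n_features * itemsize  # float64 feature matrix
    return n_jobs * X_bytes * 2

# e.g. 1e6 samples x 500 float64 features on 8 workers:
est = forest_memory_bytes(1_000_000, 500, 8)
print(est / 1e9, "GB")  # -> 64.0 GB
```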
Hi Youssef,
You're trying to do exactly what I did. The first thing to note is that the
Microsoft guys don't precompute the features; rather, they compute them on
the fly. That means they only need enough memory to store the depth
images, and since they have a 1000 core cluster, computing the feat
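For illustration, an on-the-fly depth-difference feature in the spirit of the Kinect body-part paper might look like the sketch below; the function names, offsets, and background value are assumptions, not the Microsoft code:

```python
# Illustrative sketch: compute a depth-difference feature on the fly from a
# stored depth image, instead of precomputing a huge feature matrix.
import numpy as np

def depth_feature(depth, px, py, u, v):
    """Depth-normalized offset difference at pixel (px, py).

    u and v are 2-D offsets scaled by 1/depth so the feature is roughly
    depth-invariant; offsets falling outside the image read as a large
    background value (an assumption made here for the sketch).
    """
    d = depth[py, px]
    def probe(offset):
        ox = px + int(round(offset[0] / d))
        oy = py + int(round(offset[1] / d))
        h, w = depth.shape
        if 0 <= ox < w and 0 <= oy < h:
            return depth[oy, ox]
        return 1e6  # out-of-image pixels count as background
    return probe(u) - probe(v)

depth = np.full((100, 100), 2.0)  # toy flat depth map: 2 m everywhere
print(depth_feature(depth, 50, 50, (10.0, 0.0), (-10.0, 0.0)))  # -> 0.0
```

Only the depth images need to sit in memory; each tree node evaluates its feature as it is reached, which is what makes the 1000-core setup feasible.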