Re: [Scikit-learn-general] random forests - number of samples

2015-03-11 Thread Andreas Mueller
By default bootstrap=True, so each tree is fit on a bootstrap sample. That means the number of samples drawn is the same as the original number of samples, but only about 2/3 of the distinct samples in the dataset are used; the rest are duplicates. For efficiency, the samples are actually represented using sample weights.
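
To make the bootstrap point concrete, here is a minimal sketch using only the Python standard library. It is not scikit-learn's implementation; the helper name `bootstrap_weights` and the sample size are assumptions for illustration. It draws n indices with replacement and stores the draw as per-sample counts, which is the "sample weights" representation mentioned above. The expected fraction of distinct samples is 1 - 1/e, roughly 0.63, which is where "about 2/3" comes from.

```python
import random
from collections import Counter


def bootstrap_weights(n_samples, seed=0):
    """Draw n_samples indices with replacement and return, for each
    original sample, how many times it was drawn (hypothetical helper)."""
    rng = random.Random(seed)
    drawn = rng.choices(range(n_samples), k=n_samples)
    counts = Counter(drawn)
    # A weight of 0 means the sample was left out of this bootstrap draw.
    return [counts.get(i, 0) for i in range(n_samples)]


weights = bootstrap_weights(10_000)
unique_fraction = sum(1 for w in weights if w > 0) / len(weights)

# The weights sum to the original number of samples, while the fraction
# of distinct samples actually used is close to 1 - 1/e.
print(sum(weights), round(unique_fraction, 3))
```

Storing counts instead of copying rows means the tree can be fit on the original array, with duplicated samples simply counting more.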

[Scikit-learn-general] random forests - number of samples

2015-03-11 Thread Luca Puggini
this can help

[Scikit-learn-general] random forests - number of samples

2015-03-11 Thread Pagliari, Roberto
How many samples does a single tree of a random forest use? Or does it use all samples?

Re: [Scikit-learn-general] random forests with njobs>1

2015-02-27 Thread Gael Varoquaux
You don't need a separate install of joblib, as it is shipped with scikit-learn. RF uses a threading strategy for parallel computing, rather than a multi-process strategy. What matters is not the number of processes that you are seeing, but the CPU usage. Are you seeing that your CPUs are being u
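
The following sketch illustrates why a thread-based strategy shows up as a single process in tools like `top`. It uses only the standard library and is not scikit-learn or joblib code; `fit_one_tree` is a hypothetical stand-in for the work of fitting one tree. Every task reports the same process ID, even though the work is spread over several threads.

```python
import os
import threading
from concurrent.futures import ThreadPoolExecutor


def fit_one_tree(tree_index):
    """Hypothetical stand-in for fitting one tree; it only records
    which process and thread the task ran in."""
    return tree_index, os.getpid(), threading.get_ident()


# Four worker threads share the work of eight "trees".
with ThreadPoolExecutor(max_workers=4) as pool:
    results = list(pool.map(fit_one_tree, range(8)))

pids = {pid for _, pid, _ in results}
thread_ids = {tid for _, _, tid in results}

# One process ID regardless of how many threads did the work.
print(len(pids), len(thread_ids))
```

With threads, the thing to watch is the CPU usage of that one process rather than the number of processes.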

Re: [Scikit-learn-general] random forests with njobs>1

2015-02-27 Thread Pagliari, Roberto
Hi, Yes, it works with grid search. I'm checking using the `top` command.

Re: [Scikit-learn-general] random forests with njobs>1

2015-02-27 Thread Artem
Do you have joblib installed? Does n_jobs > 1 work with other algorithms?

Re: [Scikit-learn-general] random forests with njobs>1

2015-02-27 Thread Sebastian Raschka
Yes, it should. When I used it about 2 months ago, I was running it on 16 processors, as far as I can remember. When I did a `checkjob`, it was utilizing most of them most of the time.

[Scikit-learn-general] random forests with njobs>1

2015-02-27 Thread Pagliari, Roberto
When using random forests with n_jobs > 1, I see only one Python process. Does RF support using the multiprocessing module?

Re: [Scikit-learn-general] Random forests: Measuring information gain in multi-output

2013-02-04 Thread Ribonous
Thank you Peter and Andreas. That makes sense. Thanks for responding so quickly. Lukas

Re: [Scikit-learn-general] Random forests: Measuring information gain in multi-output

2013-02-04 Thread Peter Prettenhofer
Hi Lukas, the impurity (in your case, entropy) is simply averaged over all outputs; see [1]. The code is written in Cython (a Python dialect that translates to C). Best, Peter [1] https://github.com/scikit-learn/scikit-learn/blob/master/sklearn/tree/_tree.pyx#L1482
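
As a rough illustration of averaging the impurity over outputs, here is a small pure-Python sketch. It is not the Cython code referenced in [1]; the function names and the toy data are assumptions, and it does not handle empty nodes. Entropy is computed per output column, averaged, and the information gain of a split is the parent's averaged entropy minus the size-weighted averaged entropy of the children.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of one output column."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())


def mean_entropy(rows):
    """Average the per-output entropy over all outputs.
    rows is a list of tuples, one label per output."""
    columns = list(zip(*rows))
    return sum(entropy(col) for col in columns) / len(columns)


def information_gain(parent, left, right):
    """Parent impurity minus the size-weighted impurity of the children."""
    n = len(parent)
    children = (len(left) / n) * mean_entropy(left) \
        + (len(right) / n) * mean_entropy(right)
    return mean_entropy(parent) - children


# Toy data with two outputs per sample.
parent = [(0, 0), (0, 1), (1, 0), (1, 1)]
left, right = parent[:2], parent[2:]

# This split separates the first output perfectly and leaves the second
# output as mixed as before, so the averaged gain is half a bit.
gain = information_gain(parent, left, right)
print(gain)
```

Averaging treats every output equally, so a split that is informative for only one of two outputs earns half the gain it would in the single-output case.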

Re: [Scikit-learn-general] Random forests: Measuring information gain in multi-output

2013-02-04 Thread Andreas Mueller
On 02/04/2013 04:02 PM, Ribonous wrote: I think I understand how a random forest classifier works in the univariate case. Unfortunately I haven't found much information about how to implement random forest classifier in the multi-output case. How does the random forest classifier in sklearn

[Scikit-learn-general] Random forests: Measuring information gain in multi-output

2013-02-04 Thread Ribonous
I think I understand how a random forest classifier works in the univariate case. Unfortunately I haven't found much information about how to implement random forest classifier in the multi-output case. How does the random forest classifier in sklearn measure the information gain for a given split

Re: [Scikit-learn-general] random forests

2011-10-31 Thread Olivier Grisel
You can also follow the discussion around those developments in this pull request: https://github.com/scikit-learn/scikit-learn/pull/385 -- Olivier

Re: [Scikit-learn-general] random forests

2011-10-31 Thread Satrajit Ghosh
hi pat, yes. take a look at: https://github.com/bdholt1/scikit-learn/tree/enh/ensemble cheers, satra

[Scikit-learn-general] random forests

2011-10-31 Thread Patrick Brooks
Hi, It looks like things are coming along with tree-based classifiers and regressors. I was wondering if anyone has been thinking about using these to implement some basic random forest functionality? I would be willing to dedicate some time, but I would like to collaborate with someone if possibl