Wrong! Apologies, I had a double loop in there.
For i = 1 to n_estimators:
    Get a random sample (with replacement) of the training data.
    Build a tree on that sample. At each node, a random subset of the
    features is considered, and a split threshold is evaluated for each
    of those features on the training data sample.
    Use the rest of the training data, not in the sample, to
    calculate the out-of-bag score.
I also edited a bit for clarity. Refer to Gilles Louppe's dissertation for
details.
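
In case it helps, here is a rough sketch of that loop written out by hand. It is not how scikit-learn implements RandomForestClassifier internally; it only mirrors the steps above. The synthetic dataset, the tree count of 25, and names such as `votes` and `bag` are my own assumptions for illustration.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.tree import DecisionTreeClassifier

# Synthetic two-class data, purely for illustration (an assumption).
X, y = make_classification(n_samples=500, n_features=10, random_state=0)
n_samples = X.shape[0]
n_estimators = 25
rng = np.random.default_rng(0)

trees = []
# votes[i, c] counts out-of-bag votes for class c on row i.
votes = np.zeros((n_samples, 2), dtype=int)

for i in range(n_estimators):
    # 1. Bootstrap sample: draw n_samples row indices with replacement.
    bag = rng.integers(0, n_samples, size=n_samples)
    oob = np.setdiff1d(np.arange(n_samples), bag)

    # 2. Build a tree on the bag; max_features="sqrt" makes each node
    #    consider only a random subset of the features.
    tree = DecisionTreeClassifier(max_features="sqrt", random_state=i)
    tree.fit(X[bag], y[bag])
    trees.append(tree)

    # 3. Let this tree vote only on the rows it never saw.
    pred = tree.predict(X[oob])
    votes[oob, pred] += 1

# Out-of-bag score: majority vote over the trees that did not see each row.
seen = votes.sum(axis=1) > 0
oob_score = np.mean(votes[seen].argmax(axis=1) == y[seen])
print(f"OOB accuracy: {oob_score:.3f}")
```

Note that the bootstrap sample is drawn inside the loop, once per tree, so every tree has its own out-of-bag rows.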
__________________________________________________________________________________________
Dale Smith | Macy's Systems and Technology | IFS eCommerce | Data Science
770-658-5176 | 5985 State Bridge Road, Johns Creek, GA 30097 |
[email protected]
From: scikit-learn
[mailto:[email protected]] On Behalf Of
Dale T Smith
Sent: Tuesday, September 13, 2016 8:24 AM
To: Scikit-learn user and developer mailing list
Subject: Re: [scikit-learn] is RandomForest random samples or random features?
Each tree is built using a random sample with replacement from the provided
training data. The data not in the sample is used to calculate the out-of-bag
score. The “bag” is the sampled data.
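
A minimal sketch of what "bag" and "out-of-bag" mean for a single tree, using only the standard library. The helper name `bootstrap_split` and the sample size of 1000 are hypothetical, chosen just for this illustration.

```python
import random

def bootstrap_split(n_samples, seed=None):
    """Return (bag, oob) lists of row indices for one tree."""
    rng = random.Random(seed)
    # Draw n_samples indices with replacement: this is the "bag".
    bag = [rng.randrange(n_samples) for _ in range(n_samples)]
    # Indices that were never drawn are "out of bag" for this tree.
    in_bag = set(bag)
    oob = [i for i in range(n_samples) if i not in in_bag]
    return bag, oob

bag, oob = bootstrap_split(1000, seed=0)
print(len(bag), len(set(bag)), len(oob))
```

Because the draw is with replacement, some rows appear more than once in the bag and, on average, roughly a third of the rows are left out of it. Those left-out rows are what the out-of-bag score is computed on.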
The “random” refers to several aspects of the algorithm, including random
sampling of features.
So for each tree
    Get a random sample of the training data
    For I to n_estimators:
        Build a tree – this involves a random sample of features and
        thresholds for each feature in the sample at each node.
        Use the rest of the training data, not in the sample, to
        calculate the out-of-bag score
Random Forest already incorporates “random features”.
https://github.com/glouppe/phd-thesis
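
For what it's worth, both kinds of randomness show up as constructor parameters on RandomForestClassifier (`bootstrap` and `max_features`). The sketch below uses a synthetic dataset and parameter values I picked for illustration; treat it as one possible setup, not a recommendation.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier

# Synthetic data, purely for illustration (an assumption).
X, y = make_classification(n_samples=500, n_features=10, random_state=0)

# Random samples AND random features: each tree is fit on a bootstrap
# sample, and each split considers a random subset of the features.
rf = RandomForestClassifier(
    n_estimators=100,
    bootstrap=True,
    max_features="sqrt",
    oob_score=True,
    random_state=0,
)
rf.fit(X, y)
print("OOB score:", rf.oob_score_)

# Random features only: every tree sees the full training set, but each
# split still considers a random subset of the features. There is no
# out-of-bag data in this case, so oob_score is left off.
rf_features_only = RandomForestClassifier(
    n_estimators=100, bootstrap=False, max_features="sqrt", random_state=0
)
rf_features_only.fit(X, y)
```

The second estimator is probably the closest thing to a "features random" version: turning off `bootstrap` removes the row sampling while keeping the per-split feature sampling.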
From: scikit-learn
[mailto:[email protected]] On Behalf Of ??
Sent: Tuesday, September 13, 2016 4:16 AM
To: [email protected]
Subject: [scikit-learn] is RandomForest random samples or random features?
I have read the scikit-learn user guide section on RandomForest:
"""
In random forests (see
RandomForestClassifier<http://scikit-learn.org/stable/modules/generated/sklearn.ensemble.RandomForestClassifier.html#sklearn.ensemble.RandomForestClassifier>
and
RandomForestRegressor<http://scikit-learn.org/stable/modules/generated/sklearn.ensemble.RandomForestRegressor.html#sklearn.ensemble.RandomForestRegressor>
classes), each tree in the ensemble is built from a sample drawn with
replacement (i.e., a bootstrap sample) from the training set.
"""
But I prefer to think of RandomForest as:
"""
features ("attributes", "predictors", "independent variables") are randomly
sampled
"""
Is RandomForest based on random samples or on random features? Where can I
find a random-features version of RandomForest?
Thanks.
_______________________________________________
scikit-learn mailing list
[email protected]
https://mail.python.org/mailman/listinfo/scikit-learn