Re: [scikit-learn] Understanding max_features parameter in RandomForestClassifier

aditya aggarwal Tue, 10 Mar 2020 22:45:23 -0700

With all parameters left at their defaults (in particular bootstrap and
max_samples), the number of samples passed to each estimator is X.shape[0].
Doesn't that cover every instance in the dataset, with the computed number of
features? If so, how is it that only a subset is given to each estimator?


On Wed, Mar 11, 2020 at 10:58 AM Brown J.B. via scikit-learn <
[email protected]> wrote:

> Regardless of the number of features, each DT estimator is given only a
> subset of the data.
> Each DT estimator then uses the features to derive decision rules for the
> samples it was given.
> With more trees and few examples, you might get similar or identical
> trees, but that is not the norm.
>
> Pardon brevity.
> J.B.
>
> On Wed, Mar 11, 2020 at 14:11, aditya aggarwal <[email protected]> wrote:
>
>> For RandomForestClassifier in sklearn
>>
>> The max_features parameter sets the maximum number of features considered
>> for a split in the random forest; the default is sqrt(n_features). If m is
>> sqrt(n), then the number of feature combinations available for building a
>> DT is nCm. What if nCm is less than n_estimators (the number of decision
>> trees in the random forest)?
>>
>> *example:* For n = 7, max_features is 3, so nCm is 35, meaning 35 unique
>> combinations of features for decision trees. Now for n_estimators = 100,
>> will the remaining 65 trees have repeated combinations of features? If so,
>> won't the trees be correlated, introducing bias in the answer?
>>
>>
>> Thanks
>>
>> Aditya Aggarwal
>> _______________________________________________
>> scikit-learn mailing list
>> [email protected]
>> https://mail.python.org/mailman/listinfo/scikit-learn
>>
>
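
On the nCm count in the original question: as far as the scikit-learn
documentation describes it, max_features is the number of features considered
when looking for the best split, so a fresh candidate subset is drawn at each
split rather than once per tree. The sketch below only illustrates that
counting argument in plain Python; n_splits is a made-up number of split nodes
and none of this is scikit-learn's own code.

```python
import random
from math import comb

n_features = 7
max_features = 3  # taken from the example in the question
n_splits = 20     # hypothetical number of split nodes in one tree

rng = random.Random(0)

# One candidate feature subset is drawn per split, not once per tree.
subsets_per_split = [
    tuple(sorted(rng.sample(range(n_features), max_features)))
    for _ in range(n_splits)
]

# Number of possible subsets at any single split.
n_subsets = comb(n_features, max_features)
print(n_subsets)  # 35

# Across its splits, one tree can draw many different subsets and touch
# more than max_features distinct features.
features_seen = {f for subset in subsets_per_split for f in subset}
print(len(set(subsets_per_split)), sorted(features_seen))
```

Under that reading, 35 is the number of choices at one split, not the number
of distinct trees, and each tree is also fit on its own bootstrap sample, so
n_estimators = 100 does not by itself force repeated trees.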

