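Hi Aditya,

With the default settings (`bootstrap=True`), the sampling is done with replacement. Hence, each estimator gets a different dataset even though the same number of datapoints (`X.shape[0]`) is sampled every time: some rows are drawn more than once and others are not drawn at all.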
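To make this concrete, here is a minimal sketch using plain NumPy. It is not scikit-learn's internal code; the helper `bootstrap_indices`, the seed, and `n_samples = 1000` are assumptions made up for illustration. It draws two bootstrap samples of the same size and checks how many distinct rows each one actually contains.

```python
import numpy as np

# Assumed values for illustration only: an arbitrary seed and a dataset
# size standing in for X.shape[0].
rng = np.random.default_rng(0)
n_samples = 1000


def bootstrap_indices(rng, n_samples):
    """Draw n_samples row indices uniformly at random, with replacement."""
    return rng.integers(0, n_samples, size=n_samples)


# Two hypothetical trees each draw their own bootstrap sample.
idx_a = bootstrap_indices(rng, n_samples)
idx_b = bootstrap_indices(rng, n_samples)

# Each sample has X.shape[0] entries, but only a fraction are distinct rows.
unique_fraction = np.unique(idx_a).size / n_samples
same_sample = np.array_equal(np.sort(idx_a), np.sort(idx_b))

print("sample size:", idx_a.size)
print("fraction of distinct rows:", unique_fraction)
print("both trees saw the same rows:", same_sample)
```

For large n the expected fraction of distinct rows is 1 - (1 - 1/n)^n, which is about 0.63, so each tree effectively trains on a different subset of the instances even though the sample size equals `X.shape[0]`.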
Regards,
Venkatachalam N.

On Wed, Mar 11, 2020 at 11:14 AM aditya aggarwal <adityaselfeffici...@gmail.com> wrote:

> With all the parameters set to default (especially bootstrap and
> max_samples), no. of samples passed to each estimator is X.shape[0]. Doesn't
> it account for all the instances in the dataset with calculated no. of
> features? Then how come only a subset is given to the estimator?
>
> On Wed, Mar 11, 2020 at 10:58 AM Brown J.B. via scikit-learn <scikit-learn@python.org> wrote:
>
>> Regardless of the number of features, each DT estimator is given only a
>> subset of the data.
>> Each DT estimator then uses the features to derive decision rules for the
>> samples it was given.
>> With more trees and few examples, you might get similar or identical
>> trees, but that is not the norm.
>>
>> Pardon brevity.
>> J.B.
>>
>> On Wed, Mar 11, 2020 at 14:11, aditya aggarwal <adityaselfeffici...@gmail.com> wrote:
>>
>>> For RandomForestClassifier in sklearn
>>>
>>> max_features parameter gives the max no. of features for split in random
>>> forest which is sqrt(n_features) as default. If m is sqrt of n, then no. of
>>> combinations for DT formation is nCm. What if nCm is less than n_estimators
>>> (no. of decision trees in random forest)?
>>>
>>> *example:* For n = 7, max_features is 3, so nCm is 35, meaning 35
>>> unique combinations of features for decision trees. Now for n_estimators =
>>> 100, will the remaining 65 trees have repeated combination of features? If
>>> so, won't trees be correlated introducing bias in the answer?
>>>
>>> Thanks
>>>
>>> Aditya Aggarwal
_______________________________________________ scikit-learn mailing list scikit-learn@python.org https://mail.python.org/mailman/listinfo/scikit-learn