Re: [scikit-learn] Opinion on reference mentioning that RF uses weak learners
> As previously mentioned, a "weak learner" is just a learner that barely performs better than random.

Following up on what the definition of a random learner refers to: does it mean the following? (1) Classification: a learner which uniformly samples one of the N classes observed in the training data (i.e., the set of unique values in the response vector "y"). (2) Regression: a learner which uniformly samples from the range of values in the response vector (e.g., uniform sampling from [min(y), max(y)]). Should even more context be declared explicitly (e.g., not only uniform sampling, but sampling from any distribution)?

J.B.

___
scikit-learn mailing list
scikit-learn@python.org
https://mail.python.org/mailman/listinfo/scikit-learn
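To make that baseline concrete: scikit-learn ships exactly this kind of "random learner" as a dummy estimator. A minimal sketch of both readings above, on made-up data (since `DummyRegressor` has no uniform-sampling strategy, the regression baseline over [min(y), max(y)] is sampled by hand):

```python
import numpy as np
from sklearn.dummy import DummyClassifier

rng = np.random.RandomState(0)
X = rng.rand(200, 3)
y_clf = rng.randint(0, 3, size=200)        # three classes
y_reg = rng.uniform(10.0, 20.0, size=200)  # regression targets

# (1) Classification: uniformly sample one of the classes seen in y
rand_clf = DummyClassifier(strategy="uniform", random_state=0).fit(X, y_clf)
clf_preds = rand_clf.predict(X)

# (2) Regression: uniformly sample from [min(y), max(y)]
reg_preds = rng.uniform(y_reg.min(), y_reg.max(), size=len(y_reg))
```

Either baseline scores roughly at chance level, which is the bar a "weak learner" must barely clear.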
Re: [scikit-learn] Opinion on reference mentioning that RF uses weak learners
In my opinion the reference is distorting a concept that has a consolidated definition in the community. I am also familiar with the definition of a WL as "an estimator slightly better than guessing", mostly decision stumps (https://en.m.wikipedia.org/wiki/Decision_stump), which are not a component of RFs.

On Sun, Aug 16, 2020, 16:22 Nicolas Hug wrote:

> As previously mentioned, a "weak learner" is just a learner that barely performs better than random. It's more common in the context of boosting, but I think weak learning predates boosting, and the original RF paper by Breiman does make reference to "weak learners":
>
>> It's interesting that Forest-RI could produce error rates not far above the Bayes error rate. The individual classifiers are weak. For F=1, the average tree error rate is 80%; for F=10, it is 65%; and for F=25, it is 60%. Forests seem to have the ability to work with very weak classifiers as long as their correlation is low.
>
> Nicolas
>
> On 8/16/20 2:29 PM, Guillaume Lemaître wrote:
>
>> One first needs to define what a weak learner is.
>>
>> In boosting, if I recall the literature well, a weak learner refers to a learner which underfits, performing only slightly better than a random learner. In this regard, a tree with shallow depth will be a weak learner, and such trees are used in AdaBoost or gradient boosting.
>>
>> However, in a random forest the trees used are trees that overfit (deep trees), so they are not weak for the same reason. That said, one will never be able to do with a single tree what a forest can do; in this regard, a single tree is weaker than the forest. Still, I have never read the term "weak learner" in the context of random forests.
>>
>> Sent from my phone - sorry to be brief and for potential misspellings.
> *From:* fernando.wittm...@gmail.com
> *Sent:* 16 August 2020 20:06
> *To:* scikit-learn@python.org
> *Reply to:* scikit-learn@python.org
> *Subject:* [scikit-learn] Opinion on reference mentioning that RF uses weak learners
>
> Hello guys,
>
> The following reference states that Random Forests use weak learners:
> - https://blog.citizennet.com/blog/2012/11/10/random-forests-ensembles-and-performance-metrics#:~:text=The%20random%20forest%20starts%20with,corresponds%20to%20our%20weak%20learner.=Thus%2C%20in%20ensemble%20terms%2C%20the,forest%20is%20a%20strong%20learner
>
>> The random forest starts with a standard machine learning technique called a “decision tree” which, in ensemble terms, corresponds to our weak learner.
>>
>> ...
>>
>> Thus, in ensemble terms, the trees are weak learners and the random forest is a strong learner.
>
> I completely disagree with that statement. But I would like the opinion of the community to double-check whether I am missing something.
Re: [scikit-learn] Opinion on reference mentioning that RF uses weak learners
As previously mentioned, a "weak learner" is just a learner that barely performs better than random. It's more common in the context of boosting, but I think weak learning predates boosting, and the original RF paper by Breiman does make reference to "weak learners":

> It's interesting that Forest-RI could produce error rates not far above the Bayes error rate. The individual classifiers are weak. For F=1, the average tree error rate is 80%; for F=10, it is 65%; and for F=25, it is 60%. Forests seem to have the ability to work with very weak classifiers as long as their correlation is low.

Nicolas

On 8/16/20 2:29 PM, Guillaume Lemaître wrote:

> One first needs to define what a weak learner is.
>
> In boosting, if I recall the literature well, a weak learner refers to a learner which underfits, performing only slightly better than a random learner. In this regard, a tree with shallow depth will be a weak learner, and such trees are used in AdaBoost or gradient boosting.
>
> However, in a random forest the trees used are trees that overfit (deep trees), so they are not weak for the same reason. That said, one will never be able to do with a single tree what a forest can do; in this regard, a single tree is weaker than the forest. Still, I have never read the term "weak learner" in the context of random forests.
>
> Sent from my phone - sorry to be brief and for potential misspellings.
> *From:* fernando.wittm...@gmail.com
> *Sent:* 16 August 2020 20:06
> *To:* scikit-learn@python.org
> *Reply to:* scikit-learn@python.org
> *Subject:* [scikit-learn] Opinion on reference mentioning that RF uses weak learners
>
> Hello guys,
>
> The following reference states that Random Forests use weak learners:
> - https://blog.citizennet.com/blog/2012/11/10/random-forests-ensembles-and-performance-metrics#:~:text=The%20random%20forest%20starts%20with,corresponds%20to%20our%20weak%20learner.=Thus%2C%20in%20ensemble%20terms%2C%20the,forest%20is%20a%20strong%20learner
>
>> The random forest starts with a standard machine learning technique called a “decision tree” which, in ensemble terms, corresponds to our weak learner.
>>
>> ...
>>
>> Thus, in ensemble terms, the trees are weak learners and the random forest is a strong learner.
>
> I completely disagree with that statement. But I would like the opinion of the community to double-check whether I am missing something.
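Breiman's observation quoted above (individually weak trees, a much stronger forest, provided the trees are decorrelated) can be reproduced in miniature. This is an illustrative sketch, not Breiman's experiment: here `max_features=1` plays the role of his F=1 setting, letting each split consider a single randomly drawn feature, which both weakens and decorrelates the trees; the dataset and numbers are made up.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

X, y = make_classification(n_samples=1000, n_features=20, n_informative=5,
                           random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

# F=1 analogue: every split sees only one randomly drawn feature
forest = RandomForestClassifier(n_estimators=100, max_features=1,
                                random_state=0).fit(X_tr, y_tr)

# Error rate of each individual tree versus the ensemble's error rate
tree_errors = np.array([1 - t.score(X_te, y_te) for t in forest.estimators_])
forest_error = 1 - forest.score(X_te, y_te)
```

On data like this the average single-tree error comes out well above the forest's error, which is the pattern Breiman reports.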
Re: [scikit-learn] Opinion on reference mentioning that RF uses weak learners
One first needs to define what a weak learner is.

In boosting, if I recall the literature well, a weak learner refers to a learner which underfits, performing only slightly better than a random learner. In this regard, a tree with shallow depth will be a weak learner, and such trees are used in AdaBoost or gradient boosting.

However, in a random forest the trees used are trees that overfit (deep trees), so they are not weak for the same reason. That said, one will never be able to do with a single tree what a forest can do; in this regard, a single tree is weaker than the forest. Still, I have never read the term "weak learner" in the context of random forests.

Sent from my phone - sorry to be brief and for potential misspellings.

*From:* fernando.wittm...@gmail.com
*Sent:* 16 August 2020 20:06
*To:* scikit-learn@python.org
*Reply to:* scikit-learn@python.org
*Subject:* [scikit-learn] Opinion on reference mentioning that RF uses weak learners

Hello guys,

The following reference states that Random Forests use weak learners:
- https://blog.citizennet.com/blog/2012/11/10/random-forests-ensembles-and-performance-metrics#:~:text=The%20random%20forest%20starts%20with,corresponds%20to%20our%20weak%20learner.=Thus%2C%20in%20ensemble%20terms%2C%20the,forest%20is%20a%20strong%20learner

> The random forest starts with a standard machine learning technique called a “decision tree” which, in ensemble terms, corresponds to our weak learner.
>
> Thus, in ensemble terms, the trees are weak learners and the random forest is a strong learner.

I completely disagree with that statement. But I would like the opinion of the community to double-check whether I am missing something.
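Guillaume's distinction (shallow, underfitting trees in boosting versus fully grown trees in forests) can be checked directly against scikit-learn's defaults. A minimal sketch on synthetic data:

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import AdaBoostClassifier, RandomForestClassifier

X, y = make_classification(n_samples=500, n_features=10, random_state=0)

# AdaBoost's default base estimator is a depth-1 decision stump (a weak learner)
ada = AdaBoostClassifier(n_estimators=10, random_state=0).fit(X, y)
ada_depths = [est.get_depth() for est in ada.estimators_]

# A random forest grows its trees fully by default (max_depth=None): deep trees
rf = RandomForestClassifier(n_estimators=10, random_state=0).fit(X, y)
rf_depths = [tree.get_depth() for tree in rf.estimators_]
```

Inspecting `ada_depths` gives stumps of depth 1 throughout, while `rf_depths` shows much deeper trees, matching the two uses of "weak" discussed in the thread.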
Re: [scikit-learn] Opinion on reference mentioning that RF uses weak learners
Hi,

What are you wondering? The individual tree is weakened by design (it accepts more errors), so indeed, the individual trees are weak learners and the combination of them (the forest) becomes the strong learner. You can have a strong tree as well (deeper, with more parameters), but that's not what is sought in a random forest.

Cheers,

Matthieu

Le dim. 16 août 2020 à 19:06, Fernando Marcos Wittmann a écrit :

> Hello guys,
>
> The following reference states that Random Forests use weak learners:
> - https://blog.citizennet.com/blog/2012/11/10/random-forests-ensembles-and-performance-metrics#:~:text=The%20random%20forest%20starts%20with,corresponds%20to%20our%20weak%20learner.=Thus%2C%20in%20ensemble%20terms%2C%20the,forest%20is%20a%20strong%20learner
>
>> The random forest starts with a standard machine learning technique called a “decision tree” which, in ensemble terms, corresponds to our weak learner.
>>
>> ...
>>
>> Thus, in ensemble terms, the trees are weak learners and the random forest is a strong learner.
>
> I completely disagree with that statement. But I would like the opinion of the community to double-check whether I am missing something.

--
Quantitative researcher, Ph.D.
Blog: http://blog.audio-tk.com/
LinkedIn: http://www.linkedin.com/in/matthieubrucher
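The "single strong tree versus forest" comparison is easy to run. A small sketch on synthetic data (sizes and parameters are arbitrary, chosen only so the effect is visible): even when the lone tree is fully grown, the forest of equally deep but randomized trees typically generalizes better.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

X, y = make_classification(n_samples=2000, n_features=20, n_informative=8,
                           random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

# One fully grown ("strong") tree
tree_acc = DecisionTreeClassifier(random_state=0).fit(X_tr, y_tr).score(X_te, y_te)

# A forest of fully grown but randomized trees
forest_acc = RandomForestClassifier(n_estimators=200, random_state=0).fit(
    X_tr, y_tr).score(X_te, y_te)
```

This supports the point made earlier in the thread: each tree overfits on its own, yet averaging the decorrelated trees reduces variance, so the ensemble outperforms any single tree.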
[scikit-learn] Opinion on reference mentioning that RF uses weak learners
Hello guys,

The following reference states that Random Forests use weak learners:
- https://blog.citizennet.com/blog/2012/11/10/random-forests-ensembles-and-performance-metrics#:~:text=The%20random%20forest%20starts%20with,corresponds%20to%20our%20weak%20learner.=Thus%2C%20in%20ensemble%20terms%2C%20the,forest%20is%20a%20strong%20learner

> The random forest starts with a standard machine learning technique called a “decision tree” which, in ensemble terms, corresponds to our weak learner.
>
> ...
>
> Thus, in ensemble terms, the trees are weak learners and the random forest is a strong learner.

I completely disagree with that statement. But I would like the opinion of the community to double-check whether I am missing something.