Re: [scikit-learn] Opinion on reference mentioning that RF uses weak learners

2020-08-16 Thread Brown J.B. via scikit-learn
> As previously mentioned, a "weak learner" is just a learner that barely
> performs better than random.

To follow up on what the definition of a random learner refers to, does it
mean the following, depending on context?
(1) Classification: a learner which uniformly samples one of the N endpoint
values (classes) present in the training data (e.g., the set of unique
values in the response vector "y").
(2) Regression: a learner which uniformly samples from the range of values
in the endpoint/response vector (e.g., uniform sampling from [min(y),
max(y)]).

Should even more context be declared explicitly (e.g., not just uniform
sampling, but sampling from an arbitrary distribution)?
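
For concreteness, here is a quick sketch of the two interpretations above,
using scikit-learn's DummyClassifier for (1) and a hand-rolled uniform
sampler for (2), since as far as I know DummyRegressor has no
uniform-sampling strategy (synthetic data, purely illustrative):

import numpy as np
from sklearn.dummy import DummyClassifier

rng = np.random.default_rng(0)

# (1) Classification: predictions drawn uniformly from the unique labels in y.
X = rng.normal(size=(100, 5))
y_clf = rng.integers(0, 3, size=100)  # three arbitrary classes
baseline_clf = DummyClassifier(strategy="uniform", random_state=0).fit(X, y_clf)
print(baseline_clf.predict(X[:10]))   # expected accuracy ~ 1/3 here

# (2) Regression: predictions drawn uniformly from [min(y), max(y)].
y_reg = rng.normal(size=100)
baseline_pred = rng.uniform(y_reg.min(), y_reg.max(), size=10)
print(baseline_pred)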

J.B.
___
scikit-learn mailing list
scikit-learn@python.org
https://mail.python.org/mailman/listinfo/scikit-learn


Re: [scikit-learn] Opinion on reference mentioning that RF uses weak learners

2020-08-16 Thread Fernando Marcos Wittmann
In my opinion, the reference is distorting a concept that has a consolidated
definition in the community. I am also familiar with the definition of a WL
as "an estimator slightly better than guessing", mostly decision stumps
(https://en.m.wikipedia.org/wiki/Decision_stump), which are not a component
of RFs.
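
For reference, a minimal sketch of what I mean by a decision stump as a
boosting weak learner (synthetic data, purely illustrative; AdaBoostClassifier's
default base learner is a depth-1 tree):

from sklearn.datasets import make_classification
from sklearn.ensemble import AdaBoostClassifier
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

X, y = make_classification(n_samples=1000, n_features=20, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

# A single stump (depth-1 tree): the classic "slightly better than guessing" learner.
stump = DecisionTreeClassifier(max_depth=1).fit(X_tr, y_tr)
print("stump test accuracy:", round(stump.score(X_te, y_te), 3))

# Boosting many stumps gives a much stronger ensemble.
boosted = AdaBoostClassifier(n_estimators=100, random_state=0).fit(X_tr, y_tr)
print("AdaBoost test accuracy:", round(boosted.score(X_te, y_te), 3))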

On Sun, Aug 16, 2020, 16:22 Nicolas Hug  wrote:

> As previously mentioned, a "weak learner" is just a learner that barely
> performs better than random. It's more common in the context of boosting,
> but I think weak learning predates boosting, and the original RF paper by
> Breiman does make reference to "weak learners":
>
> It's interesting that Forest-RI could produce error rates not far above
> the Bayes error rate. The individual classifiers are weak. For F=1, the
> average tree error rate is 80%; for F=10, it is 65%; and for F=25, it is
> 60%. Forests seem to have the ability to work with very weak classifiers
> as long as their correlation is low.
>
> Nicolas
>
>
> On 8/16/20 2:29 PM, Guillaume Lemaître wrote:
>
> One needs to define what is meant by a weak learner.
>
> In boosting, if I recall the literature well, a weak learner refers to a
> learner that underfits, performing only slightly better than a random
> learner. In this regard, a shallow tree is a weak learner and is what is
> used in AdaBoost or gradient boosting.
>
> However, in a random forest the trees used are trees that overfit (deep
> trees), so they are not weak in that sense. Still, one will never be able
> to do with a single tree what a forest does; in this regard, a single tree
> is weaker than the forest. That said, I have never read the term "weak
> learner" used in the context of random forests.
>
> Sent from my phone - sorry for being brief and for potential misspellings.
> *From:* fernando.wittm...@gmail.com
> *Sent:* 16 August 2020 20:06
> *To:* scikit-learn@python.org
> *Reply to:* scikit-learn@python.org
> *Subject:* [scikit-learn] Opinion on reference mentioning that RF uses
> weak learners
>
> Hello guys,
>
> The following reference states that Random Forests use weak learners:
> -
> https://blog.citizennet.com/blog/2012/11/10/random-forests-ensembles-and-performance-metrics#:~:text=The%20random%20forest%20starts%20with,corresponds%20to%20our%20weak%20learner.=Thus%2C%20in%20ensemble%20terms%2C%20the,forest%20is%20a%20strong%20learner
>
>> The random forest starts with a standard machine learning technique called
>> a “decision tree” which, in ensemble terms, corresponds to our weak learner.
>>
>> ...
>>
>> Thus, in ensemble terms, the trees are weak learners and the random
>> forest is a strong learner.
>
>
> I completely disagree with that statement, but I would like the opinion of
> the community to double-check that I am not missing something.
>
>
> ___
> scikit-learn mailing list
> scikit-learn@python.org
> https://mail.python.org/mailman/listinfo/scikit-learn
>
> ___
> scikit-learn mailing list
> scikit-learn@python.org
> https://mail.python.org/mailman/listinfo/scikit-learn
>
___
scikit-learn mailing list
scikit-learn@python.org
https://mail.python.org/mailman/listinfo/scikit-learn


Re: [scikit-learn] Opinion on reference mentioning that RF uses weak learners

2020-08-16 Thread Nicolas Hug
As previously mentioned, a "weak learner" is just a learner that barely 
performs better than random. It's more common in the context of 
boosting, but I think weak learning predates boosting, and the original 
RF paper by Breiman does make reference to "weak learners":


It's interesting that Forest-RI could produce error rates not far
above the Bayes error rate. The individual classifiers are weak. For
F=1, the average tree error rate is 80%; for F=10, it is 65%; and for
F=25, it is 60%. Forests seem to have the ability to work with very
weak classifiers as long as their correlation is low.
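
For anyone who wants to see this pattern numerically: a rough sketch on
synthetic data, where Breiman's F corresponds to max_features in
scikit-learn. The exact numbers won't match the paper, but the contrast
between weak-ish individual trees and a strong forest should show up:

import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

X, y = make_classification(n_samples=3000, n_features=30, n_informative=10,
                           random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

for F in (1, 10, 25):  # F = number of features tried at each split
    rf = RandomForestClassifier(n_estimators=100, max_features=F,
                                random_state=0).fit(X_tr, y_tr)
    # Average accuracy of the individual trees vs. the whole forest.
    tree_acc = np.mean([t.score(X_te, y_te) for t in rf.estimators_])
    print(f"F={F}: mean single-tree acc={tree_acc:.2f}, "
          f"forest acc={rf.score(X_te, y_te):.2f}")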


Nicolas


On 8/16/20 2:29 PM, Guillaume Lemaître wrote:

One needs to define what is meant by a weak learner.

In boosting, if I recall the literature well, a weak learner refers to a
learner that underfits, performing only slightly better than a random
learner. In this regard, a shallow tree is a weak learner and is what is
used in AdaBoost or gradient boosting.

However, in a random forest the trees used are trees that overfit (deep
trees), so they are not weak in that sense. Still, one will never be able
to do with a single tree what a forest does; in this regard, a single tree
is weaker than the forest. That said, I have never read the term "weak
learner" used in the context of random forests.


Sent from my phone - sorry for being brief and for potential misspellings.

*From:* fernando.wittm...@gmail.com
*Sent:* 16 August 2020 20:06
*To:* scikit-learn@python.org
*Reply to:* scikit-learn@python.org
*Subject:* [scikit-learn] Opinion on reference mentioning that RF uses 
weak learners



Hello guys,

The following reference states that Random Forests use weak learners:
- 
https://blog.citizennet.com/blog/2012/11/10/random-forests-ensembles-and-performance-metrics#:~:text=The%20random%20forest%20starts%20with,corresponds%20to%20our%20weak%20learner.=Thus%2C%20in%20ensemble%20terms%2C%20the,forest%20is%20a%20strong%20learner


The random forest starts with a standard machine learning
technique called a “decision tree” which, in ensemble terms,
corresponds to our weak learner.

... 


Thus, in ensemble terms, the trees are weak learners and the
random forest is a strong learner.


I completely disagree with that statement, but I would like the
opinion of the community to double-check that I am not missing something.


___
scikit-learn mailing list
scikit-learn@python.org
https://mail.python.org/mailman/listinfo/scikit-learn
___
scikit-learn mailing list
scikit-learn@python.org
https://mail.python.org/mailman/listinfo/scikit-learn


Re: [scikit-learn] Opinion on reference mentioning that RF uses weak learners

2020-08-16 Thread Guillaume Lemaître
One needs to define what is meant by a weak learner.

In boosting, if I recall the literature well, a weak learner refers to a
learner that underfits, performing only slightly better than a random
learner. In this regard, a shallow tree is a weak learner and is what is
used in AdaBoost or gradient boosting.

However, in a random forest the trees used are trees that overfit (deep
trees), so they are not weak in that sense. Still, one will never be able
to do with a single tree what a forest does; in this regard, a single tree
is weaker than the forest. That said, I have never read the term "weak
learner" used in the context of random forests.

Sent from my phone - sorry for being brief and for potential misspellings.

From: fernando.wittm...@gmail.com
Sent: 16 August 2020 20:06
To: scikit-learn@python.org
Reply to: scikit-learn@python.org
Subject: [scikit-learn] Opinion on reference mentioning that RF uses weak learners

Hello guys,

The following reference states that Random Forests use weak learners:
- https://blog.citizennet.com/blog/2012/11/10/random-forests-ensembles-and-performance-metrics#:~:text=The%20random%20forest%20starts%20with,corresponds%20to%20our%20weak%20learner.=Thus%2C%20in%20ensemble%20terms%2C%20the,forest%20is%20a%20strong%20learner

> The random forest starts with a standard machine learning technique called
> a “decision tree” which, in ensemble terms, corresponds to our weak learner.
>
> ...
>
> Thus, in ensemble terms, the trees are weak learners and the random forest
> is a strong learner.

I completely disagree with that statement, but I would like the opinion of
the community to double-check that I am not missing something.
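
To illustrate the shallow-vs-deep point above, a quick sketch (synthetic
data, my own illustration, not from any of the references): a shallow tree
underfits on its own, a deep tree overfits on its own, and a forest of deep
trees beats the single deep tree on held-out data.

from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

X, y = make_classification(n_samples=2000, n_features=20, n_informative=8,
                           random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

shallow = DecisionTreeClassifier(max_depth=1).fit(X_tr, y_tr)  # boosting-style weak learner
deep = DecisionTreeClassifier().fit(X_tr, y_tr)                # RF-style tree: (nearly) memorizes the training set
forest = RandomForestClassifier(n_estimators=200, random_state=0).fit(X_tr, y_tr)

for name, est in [("shallow tree", shallow), ("deep tree", deep), ("forest", forest)]:
    print(f"{name:12s} train={est.score(X_tr, y_tr):.3f} test={est.score(X_te, y_te):.3f}")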
___
scikit-learn mailing list
scikit-learn@python.org
https://mail.python.org/mailman/listinfo/scikit-learn


Re: [scikit-learn] Opinion on reference mentioning that RF uses weak learners

2020-08-16 Thread Matthieu Brucher
Hi,

What are you wondering?
Each individual tree is weakened by design (it accepts more errors), so
indeed the individual trees are weak learners, and the combination of
them (the forest) becomes the strong learner.
You can build a strong tree as well (deeper, with more parameters), but
that is not what is sought in a random forest.
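
A tiny sketch of that trade-off (synthetic data, purely illustrative): a
single deep tree trained with every feature available is already a fairly
strong learner, yet the forest of deliberately weakened trees (bootstrap
samples plus random feature subsets) still comes out ahead on held-out data.

from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

X, y = make_classification(n_samples=3000, n_features=20, n_informative=8,
                           random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

# One "strong" tree: unrestricted depth, every feature considered at each split.
strong_tree = DecisionTreeClassifier(random_state=0).fit(X_tr, y_tr)
# The forest: each tree weakened by bagging and max_features="sqrt" (the current default).
forest = RandomForestClassifier(n_estimators=200, random_state=0).fit(X_tr, y_tr)

print("single deep tree test accuracy:", round(strong_tree.score(X_te, y_te), 3))
print("random forest    test accuracy:", round(forest.score(X_te, y_te), 3))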

Cheers,

Matthieu

On Sun. 16 Aug 2020 at 19:06, Fernando Marcos Wittmann
 wrote:
>
> Hello guys,
>
> The following reference states that Random Forests use weak learners:
> - 
> https://blog.citizennet.com/blog/2012/11/10/random-forests-ensembles-and-performance-metrics#:~:text=The%20random%20forest%20starts%20with,corresponds%20to%20our%20weak%20learner.=Thus%2C%20in%20ensemble%20terms%2C%20the,forest%20is%20a%20strong%20learner
>
>> The random forest starts with a standard machine learning technique called a 
>> “decision tree” which, in ensemble terms, corresponds to our weak learner.
>>
>> ...
>>
>>  Thus, in ensemble terms, the trees are weak learners and the random forest 
>> is a strong learner.
>
>
> I completely disagree with that statement, but I would like the opinion of
> the community to double-check that I am not missing something.
>
> ___
> scikit-learn mailing list
> scikit-learn@python.org
> https://mail.python.org/mailman/listinfo/scikit-learn



-- 
Quantitative researcher, Ph.D.
Blog: http://blog.audio-tk.com/
LinkedIn: http://www.linkedin.com/in/matthieubrucher
___
scikit-learn mailing list
scikit-learn@python.org
https://mail.python.org/mailman/listinfo/scikit-learn


[scikit-learn] Opinion on reference mentioning that RF uses weak learners

2020-08-16 Thread Fernando Marcos Wittmann
Hello guys,

The following reference states that Random Forests use weak learners:
-
https://blog.citizennet.com/blog/2012/11/10/random-forests-ensembles-and-performance-metrics#:~:text=The%20random%20forest%20starts%20with,corresponds%20to%20our%20weak%20learner.=Thus%2C%20in%20ensemble%20terms%2C%20the,forest%20is%20a%20strong%20learner

> The random forest starts with a standard machine learning technique called
> a “decision tree” which, in ensemble terms, corresponds to our weak learner.
>
> ...
>
> Thus, in ensemble terms, the trees are weak learners and the random forest
> is a strong learner.


I completely disagree with that statement, but I would like the opinion of
the community to double-check that I am not missing something.
___
scikit-learn mailing list
scikit-learn@python.org
https://mail.python.org/mailman/listinfo/scikit-learn