Re: [Scikit-learn-general] sample weights for RandomForestClassifier to compute cross_val_score with roc_auc metric

2015-04-26 Thread Arnaud Joly
If you set sample_weight[i] = 2 for the i-th sample, that sample is counted twice in the tree-growing procedure (impurity computation, leaf labelling, …). Best regards, Arnaud > On 26 Apr 2015, at 16:00, Luca Puggini wrote: > > Ok thanks a lot, a last question
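A minimal sketch of the equivalence described above, assuming a current scikit-learn install and using a single DecisionTreeClassifier on a made-up one-feature dataset (the data and variable names are illustrative, not from the thread). It fits one tree with sample_weight = 2 on the first sample and another on data where that row is physically duplicated, then compares the leaf class probabilities.

```python
import numpy as np
from sklearn.tree import DecisionTreeClassifier

# Toy data (hypothetical): one feature, conflicting labels at each value,
# so the leaf probabilities depend directly on the sample weights.
X = np.array([[0.0], [0.0], [1.0], [1.0], [1.0]])
y = np.array([0, 1, 0, 1, 1])

# Variant 1: give the first sample a weight of 2.
weights = np.array([2.0, 1.0, 1.0, 1.0, 1.0])
tree_weighted = DecisionTreeClassifier(random_state=0)
tree_weighted.fit(X, y, sample_weight=weights)

# Variant 2: duplicate the first row instead of weighting it.
X_dup = np.vstack([X[:1], X])
y_dup = np.concatenate([y[:1], y])
tree_duplicated = DecisionTreeClassifier(random_state=0)
tree_duplicated.fit(X_dup, y_dup)

# Class probabilities at the two distinct feature values.
query = np.array([[0.0], [1.0]])
proba_weighted = tree_weighted.predict_proba(query)
proba_duplicated = tree_duplicated.predict_proba(query)
print(proba_weighted)
print(proba_duplicated)
```

On this toy problem both trees should report the same probabilities. On larger datasets, count-based settings such as min_samples_split or bootstrap resampling can make the two variants differ.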

Re: [Scikit-learn-general] sample weights for RandomForestClassifier to compute cross_val_score with roc_auc metric

2015-04-26 Thread Luca Puggini
Ok thanks a lot, a last question. What is the role of sample_weight if I use ExtraTreesClassifier with bootstrap=False (this is the default)? Are the weights used during the splitting process? On Sat, Apr 25, 2015 at 10:04 PM, Andy wrote: > On 04/25/2015 09:18 AM, Luca Puggini wrote: > > I think it

Re: [Scikit-learn-general] sample weights for RandomForestClassifier to compute cross_val_score with roc_auc metric

2015-04-25 Thread Andy
On 04/25/2015 09:18 AM, Luca Puggini wrote: I think it depends on the role of the sample weights during the construction of the forest. If I set sample_weight = 2 for one of my samples, is this equivalent to duplicating the row in the data? During fitting, yes; during evaluation, currently not. On Fr

Re: [Scikit-learn-general] sample weights for RandomForestClassifier to compute cross_val_score with roc_auc metric

2015-04-25 Thread Luca Puggini
I think it depends on the role of the sample weights during the construction of the forest. If I set sample_weight = 2 for one of my samples, is this equivalent to duplicating the row in the data? On Fri, Apr 24, 2015 at 10:25 PM, Andreas Mueller wrote: > The roc_auc will not take sample_weights into a

Re: [Scikit-learn-general] sample weights for RandomForestClassifier to compute cross_val_score with roc_auc metric

2015-04-24 Thread Andreas Mueller
The roc_auc will not take sample_weights into account if using cross_val_score. Thinking about it, I'm not sure if this is a bug or a feature. Not sure if that was discussed before, so I opened an issue: https://github.com/scikit-learn/scikit-learn/issues/4632 On 04/24/2015 12:29 PM, Luca Puggini wrot
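One possible workaround for the limitation described above is to write the cross-validation loop by hand, so the weights reach both fit and the metric. The sketch below is only an illustration under stated assumptions: it uses the modern sklearn.model_selection module (not the module layout of 2015), synthetic data from make_classification, and arbitrary per-sample weights invented for the example.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import StratifiedKFold

# Synthetic unbalanced binary problem (hypothetical data).
X, y = make_classification(n_samples=300, n_features=8,
                           weights=[0.85, 0.15], random_state=0)

# Arbitrary per-sample weights, made up for illustration.
rng = np.random.default_rng(0)
sample_weight = rng.uniform(0.5, 2.0, size=len(y))

cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=0)
weighted_auc = []
for train_idx, test_idx in cv.split(X, y):
    clf = RandomForestClassifier(n_estimators=50, random_state=0)
    # Weights of the training fold are used while growing the trees.
    clf.fit(X[train_idx], y[train_idx],
            sample_weight=sample_weight[train_idx])
    # Score of the positive class for each held-out sample.
    scores = clf.predict_proba(X[test_idx])[:, 1]
    # Weights of the test fold are passed to the metric explicitly.
    auc = roc_auc_score(y[test_idx], scores,
                        sample_weight=sample_weight[test_idx])
    weighted_auc.append(auc)

print(np.mean(weighted_auc))
```

Note that weights which are constant within each class leave the ROC AUC unchanged, so weighting the metric only matters when weights vary among samples of the same class.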

[Scikit-learn-general] sample weights for RandomForestClassifier to compute cross_val_score with roc_auc metric

2015-04-24 Thread Luca Puggini
Dear all, I am quite new to {0,1} classification problems. I have an unbalanced dataset and I am using a RandomForestMethod on it. To evaluate the performance of my estimator I am using the cross_val_score function with the roc_auc metric. My understanding is that to deal with unbalanced p

Re: [Scikit-learn-general] Sample weights

2015-02-11 Thread Michael Eickenberg
On Wednesday, February 11, 2015, Michael Eickenberg < michael.eickenb...@gmail.com> wrote: > > > On Wednesday, February 11, 2015, Carlos Pita > wrote: > >> Hi all, >> >> I'm trying to port to sklearn some R code that does WLS and I noticed >> that the fit method for some classes will accept a sam

Re: [Scikit-learn-general] Sample weights

2015-02-11 Thread Michael Eickenberg
On Wednesday, February 11, 2015, Carlos Pita wrote: > Hi all, > > I'm trying to port to sklearn some R code that does WLS and I noticed > that the fit method for some classes will accept a sample_weight > parameter (e.g. SGDRegressor) while for other classes it won't (e.g. > LinearRegression). Is

[Scikit-learn-general] Sample weights

2015-02-11 Thread Carlos Pita
Hi all, I'm trying to port to sklearn some R code that does WLS and I noticed that the fit method for some classes will accept a sample_weight parameter (e.g. SGDRegressor) while for other classes it won't (e.g. LinearRegression). Is this just inconsistent, or is there a rationale behind it? Maybe in so
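For readers porting WLS code, here is a minimal sketch of one standard way to get weighted least squares from an estimator that has no sample_weight parameter: rescale each row of the design matrix and the target by the square root of its weight, then run ordinary least squares. This is not taken from the thread; it uses plain NumPy, and the data, weights, and variable names are made up for the example. With integer weights the result should match physically repeating the rows.

```python
import numpy as np

# Hypothetical regression data with integer weights between 1 and 3.
rng = np.random.default_rng(0)
X = rng.normal(size=(20, 2))
y = X @ np.array([1.5, -2.0]) + 0.5 + rng.normal(scale=0.1, size=20)
w = rng.integers(1, 4, size=20)

# Design matrix with an explicit intercept column.
A = np.column_stack([np.ones(len(X)), X])

# WLS via rescaling: minimising sum_i w_i * (y_i - a_i @ beta)**2 is the
# same as ordinary least squares on rows multiplied by sqrt(w_i).
sqrt_w = np.sqrt(w)
beta_wls, *_ = np.linalg.lstsq(A * sqrt_w[:, None], y * sqrt_w, rcond=None)

# Reference: repeat each row w_i times and solve unweighted least squares.
A_rep = np.repeat(A, w, axis=0)
y_rep = np.repeat(y, w)
beta_rep, *_ = np.linalg.lstsq(A_rep, y_rep, rcond=None)

print(beta_wls)
print(beta_rep)
```

The intercept column is rescaled along with the features, which is why it is built explicitly here instead of being left to an estimator's own intercept handling.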