Downsampling the negatives only shifts the intercept term in logistic
regression.  It shouldn't affect the other coefficients, so the model
itself is not skewed.
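
To make the intercept point concrete: if every positive is kept and each
negative is kept with probability r, the odds learned from the sampled data
are inflated by a factor of 1/r. Adding log(r) to the fitted intercept, or
equivalently rescaling the predicted odds by r, puts the probabilities back
on the original scale. A minimal Java sketch under that sampling assumption;
the class and method names are illustrative and are not part of Mahout:

```java
/** Illustrative helper (not a Mahout class) for undoing negative downsampling. */
final class DownsampleCorrection {
    private DownsampleCorrection() {}

    /**
     * Intercept on the original data scale, given the intercept fitted on data
     * where negatives were kept with probability keepRate and positives were all kept.
     */
    static double correctIntercept(double sampledIntercept, double keepRate) {
        checkRate(keepRate);
        // log(keepRate) <= 0, so the corrected intercept is never larger.
        return sampledIntercept + Math.log(keepRate);
    }

    /** Same correction applied directly to a predicted probability. */
    static double correctProbability(double sampledProbability, double keepRate) {
        checkRate(keepRate);
        double p = sampledProbability;
        if (p < 0.0 || p > 1.0) {
            throw new IllegalArgumentException("probability must be in [0, 1]");
        }
        // odds_original = keepRate * odds_sampled, rewritten to avoid dividing by (1 - p).
        return keepRate * p / (keepRate * p + (1.0 - p));
    }

    private static void checkRate(double keepRate) {
        if (!(keepRate > 0.0 && keepRate <= 1.0)) {
            throw new IllegalArgumentException("keepRate must be in (0, 1]");
        }
    }
}
```

For example, if one negative in ten was kept (r = 0.1), a score of 0.5 from
the downsampled model corresponds to odds of 0.1 on the original data.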

As far as the threshold is concerned, just get the score yourself
(classifyScalar) and apply your own cutoff.
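
A rough sketch of what applying your own cutoff looks like. To keep it
self-contained, the scorer here is a plain function standing in for the
model's classifyScalar call; ThresholdClassifier and its constructor
arguments are hypothetical names, not Mahout API:

```java
import java.util.function.ToDoubleFunction;

/** Illustrative wrapper (not a Mahout class) that applies a custom cutoff to a score. */
final class ThresholdClassifier {
    private final ToDoubleFunction<double[]> scorer;
    private final double threshold;

    /**
     * @param scorer    stand-in for the model's scoring call; returns the
     *                  estimated probability of the positive class
     * @param threshold cutoff in [0, 1]; scores at or above it are labeled 1
     */
    ThresholdClassifier(ToDoubleFunction<double[]> scorer, double threshold) {
        if (scorer == null) {
            throw new IllegalArgumentException("scorer must not be null");
        }
        if (!(threshold >= 0.0 && threshold <= 1.0)) {
            throw new IllegalArgumentException("threshold must be in [0, 1]");
        }
        this.scorer = scorer;
        this.threshold = threshold;
    }

    /** Returns the raw score so callers can rank or calibrate instead of labeling. */
    double score(double[] features) {
        return scorer.applyAsDouble(features);
    }

    /** Returns 1 if the score reaches the cutoff, otherwise 0. */
    int classify(double[] features) {
        return score(features) >= threshold ? 1 : 0;
    }
}
```

With a real model, the scorer would wrap whatever call returns the score
for an encoded feature vector. The cutoff is then a decision made entirely
outside the model, so it can be tuned on held-out data without retraining.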

On Tue, Feb 21, 2012 at 7:09 PM, Sagar Sharma <[email protected]> wrote:

> But don’t you think that down sampling the negative outcomes would skew
> the model?
>
> By threshold, I mean the cut off value for classification. I think it is
> 0.5 by default. But I want to change it for my model.
>
>
> -----Original Message-----
> From: Ted Dunning [mailto:[email protected]]
> Sent: Tuesday, February 21, 2012 1:57 PM
> To: [email protected]
> Subject: Re: Regression Algorithm
>
> Bigger is always better.
>
> But you may be happier if you downsample the negative cases since they
> will be providing very little value in this model.
>
> Can you say what you mean by threshold?  There is no threshold in Mahout's
> logistic regression.
>
> On Tue, Feb 21, 2012 at 5:44 PM, Sagar Sharma <[email protected]>
> wrote:
>
> > Hello friends,
> >
> >
> >
> > I am trying to test and implement a binary logistic regression
> > algorithm for Click Through analysis for my website. The dependent
> > variable has two
> > outcomes: 1 and 0. But in my dataset the ratio of two outcome is
> > 1:1500 on an average, i.e. 1 positive outcome for every 1500 negative
> > outcome. I would like to know what should be the optimum size of
> > training dataset so that I can get best possible predicted
> > probabilities. Also, I would like to change the threshold value for
> > logistic regression in Mahout.
> >
> >
> >
> > Please help me if anyone has done a similar task before.
> >
> >
> >
> > Thanks,
> >
> >
> >
> > Sagar Sharma
> >
>
>
