Hi,
  Thanks for your response. The class that I am using is
org.apache.mahout.classifier.sgd.TrainLogistic.
Each line in the input file is of the form
Targetvalue, predictor1value, predictor2value, ..., predictor20value
e.g.
1, 1.4, 1.9, 2.3, 0, ..., 1.0
0, 1.2, 0, 3.4, ..., 0.0
...

This is the file (the first line has the headers) that I load into R to run
the logistic regression, and it is the same file that I use as input to Mahout.
The command-line call is something like:
java .... org.apache.mahout.classifier.sgd.TrainLogistic --input
<inputfilename> --output <outputfilename> --target <TargetVariablename>
--categories 2 --predictors predictor1 predictor2 ..... --types numeric
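
To make the layout and the options above concrete, here is a minimal Python sketch that writes a file in that shape (header line first, target in the first column, then the 20 numeric predictors) and assembles the matching argument list. The column names "target" and "predictor1" to "predictor20", the file names, and both helper functions are assumptions for illustration only; the real header names are not given in this thread. Only the options already mentioned in this message are used.

```python
import csv
import io

# Hypothetical column names: the actual header names are not given in this thread.
TARGET = "target"
PREDICTORS = ["predictor%d" % i for i in range(1, 21)]


def write_training_csv(rows, out):
    """Write a header line, then one line per row: target first, then 20 predictors."""
    writer = csv.writer(out, lineterminator="\n")
    writer.writerow([TARGET] + PREDICTORS)
    for target, values in rows:
        if target not in (0, 1):
            raise ValueError("target must be 0 or 1")
        if len(values) != len(PREDICTORS):
            raise ValueError("expected %d predictor values" % len(PREDICTORS))
        writer.writerow([target] + list(values))


def trainlogistic_args(input_path, output_path):
    """Assemble only the TrainLogistic options mentioned in this message."""
    return (["--input", input_path, "--output", output_path,
             "--target", TARGET, "--categories", "2",
             "--predictors"] + PREDICTORS + ["--types", "numeric"])


if __name__ == "__main__":
    buf = io.StringIO()
    # Two made-up rows, purely to show the shape of the file.
    write_training_csv([(1, [1.5] * 20), (0, [0.0] * 20)], buf)
    print(buf.getvalue())
    print(" ".join(trainlogistic_args("train.csv", "model.out")))
```

The point of the sketch is only that the names passed to --target and --predictors have to match the header line of the CSV, and that every data line carries the target value followed by exactly 20 predictor values.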

Thanks
Prabhu

-----Original Message-----
From: Ted Dunning [mailto:[email protected]] 
Sent: 31 January 2013 01:32
To: [email protected]
Subject: Re: Logistic Regression in Mahout

What classes are you using and how are you using them?

How are you producing the training vectors?

On Wed, Jan 30, 2013 at 4:12 AM, Prabhu <[email protected]> wrote:

> Hi all,
>
>     I am trying to use Mahout to run logistic regression analysis on 
> some data. The data is about 7 Million rows, with about 20 predictor 
> variables (all of them numeric).  The target variable is Boolean - 0 or 1.
>
> I run a logistic regression with this data in R and I get sensible
> coefficients. But when I run a logistic regression on the exact same
> data using Mahout, I get coefficients that don't make sense. For a
> start, all coefficients are negative. The interesting thing is that the
> most important variable (the one with the highest coefficient in R) has
> the least negative value in Mahout. Can someone please help me
> understand what the cause of the problem is?
>
>
>
> Thanks
>
> Prabhu
>
>
>
>
