Ted Dunning <ted.dunning <at> gmail.com> writes:

> 
> This is pretty confusing.
> 
> What has happened is that you have encoded a single categorical variable
> that has four states as four numerical variables.  Unfortunately, Mahout
> has gotten the message that you are using four categorical variables each
> with two states.
> 
> Can you say a bit about how you ran the command?
> 
> Also, take a look at chapters after 13 for more examples that don't just
> use numerical attributes.
> 
> On Tue, Dec 13, 2011 at 3:30 PM, magicalo <magica1980 <at> yahoo.com> wrote:
> 
> > Hello,
> >
> > I am trying to understand the output of TrainLogistic. Mahout in Action only
> > has an example of running the classifier on numeric predictor variables.
> > However, my model uses only categorical predictor variables (a,b,c,d), and
> > each can hold only a value of 0 or 1. The output I get is this:
> >
> > myslassifier ~ -63.081*a=0 + -15.640*a=1 +
> >               -47.268*b=0 + -31.457*b=1 +
> >               -47.269*c=0 + -23.541*c=1 +
> >               -23.635*d=0 + -15.729*d=1 + .... +
> >               -23.635*Intercept Term
> >             a=0 -63.08073
> >             a=1 -15.63973
> >             b=0 -47.26775
> >             b=1 -31.45702
> >             c=0 -47.26934
> >             c=1 -23.54059
> >             d=0 -23.63467
> >             d=1 -15.72897
> >
> > How should I interpret this? Why do I have 2 weights for the same predictor
> > variable (e.g., a=0 and a=1), and why are all of the weights negative? I
> > have not been able to find any documentation on this.
> >
> > Thanks a lot!
> >
> >
> 

Thanks for the response, Ted.
However, I am still not seeing what I am doing wrong here.

This is what my file looks like:
"myslassifier","a","b","c","d","e","f","g","h","i","j","k","l","m","n","o",
"p","q","r","s","t","u","v","w","x","y","z","aa","bb","cc","dd","ee","ff",
"gg","hh","ii","jj","kk","ll","mm","nn","oo","pp","qq","rr","ss","tt","uu",
"vv"
1,0,1,0,0,0,0,1,0,0,0,0,0,0,1,1,0,0,1,1,0,0,0,1,0,0,1,0,0,1,0,1,0,1,0,1,0,0,
0,0,0,0,0,1,1,0,0,0,1
1,0,1,0,1,0,0,1,0,1,0,0,1,1,0,1,1,0,0,0,0,0,1,0,0,0,1,0,0,0,1,0,1,0,0,0,1,0,
0,0,0,0,0,0,1,0,0,0,0
1,0,0,0,0,0,0,1,0,0,0,0,1,1,0,1,0,1,0,1,0,0,1,0,0,0,0,1,0,0,0,1,0,0,0,1,0,0,
0,0,0,0,0,0,1,0,0,0,0
…
0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,1,0,0,1,1,1,0,1,1,0,1,0,1,0,0,
0,0,0,0,0,0,0,0,0,0,0
0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,
0,0,0,0,0,0,0,0,0,0,0


And this is how I run it:

mahout org.apache.mahout.classifier.sgd.TrainLogistic \
  --input /tmp/train.csv \
  --output /tmp/my.model \
  --target myslassifier --categories 2 --types word --features 50 \
  --passes 2 \
  --rate 50 \
  --predictors a b c d e f g h i j k l m n o p q r s t u v w x y z aa bb \
  cc dd ee ff gg hh ii jj kk ll mm nn oo pp qq rr ss tt uu vv
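
A minimal sketch, in plain Python rather than Mahout code, of how the two weights per predictor in the output quoted above could arise. It assumes that `--types word` makes each predictor's value a categorical token, so a 0/1 column expands into two indicator features (a=0 and a=1). The helper names `indicator_features` and `net_effect` are hypothetical; the weights are copied from the quoted output, and the example row uses the first four predictor values of the first data row.

```python
# Illustrative only: plain Python, not Mahout code.
# Assumption: with --types word, every predictor value is treated as a
# categorical token, so a 0/1 column yields two indicator features.

def indicator_features(row):
    """Map {'a': '0', 'b': '1'} to {'a=0': 1.0, 'b=1': 1.0}."""
    return {f"{name}={value}": 1.0 for name, value in row.items()}

# Weights copied from the model output quoted earlier in this thread.
weights = {
    "a=0": -63.08073, "a=1": -15.63973,
    "b=0": -47.26775, "b=1": -31.45702,
    "c=0": -47.26934, "c=1": -23.54059,
    "d=0": -23.63467, "d=1": -15.72897,
}

def net_effect(name):
    """Change in the linear score when predictor `name` goes from 0 to 1."""
    return weights[f"{name}=1"] - weights[f"{name}=0"]

# First four predictor values (a, b, c, d) of the first data row.
features = indicator_features({"a": "0", "b": "1", "c": "0", "d": "0"})
effects = {name: round(net_effect(name), 5) for name in "abcd"}

print(sorted(features))
print(effects)
```

Under that assumption, exactly one of the two indicators for a predictor is active in any row, so only the difference between the two weights changes the score from row to row; the part they share acts as a constant offset.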


Thanks


