Ted Dunning <ted.dunning <at> gmail.com> writes:

> This is pretty confusing.
>
> What has happened is that you have encoded a single categorical variable
> that has four states as four numerical variables. Unfortunately, Mahout
> has gotten the message that you are using four categorical variables each
> with two states.
>
> Can you say a bit about how you ran the command?
>
> Also, take a look at chapters after 13 for more examples that don't just
> use numerical attributes.
>
> On Tue, Dec 13, 2011 at 3:30 PM, magicalo <magica1980 <at> yahoo.com> wrote:
>
> > Hello,
> >
> > I am trying to understand the output of TrainLogic. Mahout in Action only
> > has an example of running the classifier on numeric predictor variables.
> > However, my model uses categorical predictor variables only (a,b,c,d) and
> > each can only hold a value of 0 or 1 only. The output I get is this:
> >
> > myslassifier ~ -63.081*a=0 + -15.640*a=1 +
> > -47.268*b=0 + -31.457*b=1 +
> > -47.269*c=0 + -23.541*c=1 +
> > -23.635*d=0 + -15.729*d=1 + .... +
> > -23.635*Intercept Term
> > a=0 -63.08073
> > a=1 -15.63973
> > b=0 -47.26775
> > b=1 -31.45702
> > c=0 -47.26934
> > c=1 -23.54059
> > d=0 -23.63467
> > d=1 -15.72897
> >
> > How should I interpret this? Why do I have 2 weights for the same predictor
> > variable (e.g.: a=0 and a=1) and why are all of the weights negative? I
> > have not been able to find any documentation on this. Thanks!
> >
> > Thanks a lot!

Thanks for the response, Ted. However, I am still not seeing what I am doing
wrong here. This is what my file looks like:

"myslassifier","a","b","c","d","e","f","g","h","i","j","k","l","m","n","o",
"p","q","r","s","t","u","v","w","x","y","z","aa","bb","cc","dd","ee","ff",
"gg","hh","ii","jj","kk","ll","mm","nn","oo","pp","qq","rr","ss","tt","uu",
"vv"
1,0,1,0,0,0,0,1,0,0,0,0,0,0,1,1,0,0,1,1,0,0,0,1,0,0,1,0,0,1,0,1,0,1,0,1,0,0,
0,0,0,0,0,1,1,0,0,0,1
1,0,1,0,1,0,0,1,0,1,0,0,1,1,0,1,1,0,0,0,0,0,1,0,0,0,1,0,0,0,1,0,1,0,0,0,1,0,
0,0,0,0,0,0,1,0,0,0,0
1,0,0,0,0,0,0,1,0,0,0,0,1,1,0,1,0,1,0,1,0,0,1,0,0,0,0,1,0,0,0,1,0,0,0,1,0,0,
0,0,0,0,0,0,1,0,0,0,0
…
0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,1,0,0,1,1,1,0,1,1,0,1,0,1,0,0,
0,0,0,0,0,0,0,0,0,0,0
0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,
0,0,0,0,0,0,0,0,0,0,0

And this is how I run it:

mahout org.apache.mahout.classifier.sgd.TrainLogistic \
  --input /tmp/train.csv \
  --output /tmp/my.model \
  --target myslassifier --categories 2 --types word --features 50 \
  --passes 2 \
  --rate 50 \
  --predictors a b c d e f g h i j k l m n o p q r s t u v w x y z aa bb \
  cc dd ee ff gg hh ii jj kk ll mm nn oo pp qq rr ss tt uu vv

Thanks
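
For what it is worth, here is a rough sketch of one way the paired weights in the quoted output could be read. It assumes that each predictor declared with --types word contributes exactly one of its two level weights per row, so only the gap between the "=1" and "=0" weights would distinguish the two levels. It also ignores any hash collisions that --features 50 may introduce, so the numbers are illustrative only. The names `weights` and `level_gaps` are made up for this example and are not part of Mahout.

```python
# Weights copied from the TrainLogistic output quoted in this thread.
weights = {
    "a": {"0": -63.08073, "1": -15.63973},
    "b": {"0": -47.26775, "1": -31.45702},
    "c": {"0": -47.26934, "1": -23.54059},
    "d": {"0": -23.63467, "1": -15.72897},
}


def level_gaps(weights):
    """Return, per predictor, weight(level "1") minus weight(level "0").

    Assumption: exactly one of the two level weights is active for any
    given row, so this difference is the change in the linear score when
    the predictor flips from 0 to 1.
    """
    return {name: levels["1"] - levels["0"] for name, levels in weights.items()}


gaps = level_gaps(weights)
for name, gap in gaps.items():
    print(f"{name}: {gap:+.5f}")
```

Under that assumption, all weights being negative says little on its own, since a constant shared by both levels of a predictor can be folded into the intercept; the sign of each gap is what would indicate the direction of the effect.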
