I don't think the model dissector will work easily with NaiveBayes. The model dissector assumes that the classifier exposes a model matrix compatible with logistic regression, and the Naive Bayes model does not obviously provide one.
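To make that assumption concrete, here is a minimal sketch in plain Java. It is not Mahout code and does not use ModelDissector; the class name, the `hashIndex` helper, and the `beta` layout (one row per category, one column per hashed feature) are all hypothetical, chosen only to show the kind of matrix lookup a dissector-style tool relies on.

```java
import java.util.*;

// Hypothetical illustration only: not part of Mahout.
class DissectorSketch {
    // Stand-in for a hashed encoder: maps a feature name to a column index.
    static int hashIndex(String name, int numFeatures) {
        return Math.floorMod(name.hashCode(), numFeatures);
    }

    // Given a weight matrix (rows = categories, columns = hashed features),
    // collect the per-category weight stored at each traced feature's column.
    static Map<String, double[]> dissect(double[][] beta, List<String> names) {
        int numFeatures = beta[0].length;
        Map<String, double[]> out = new LinkedHashMap<>();
        for (String name : names) {
            int j = hashIndex(name, numFeatures);
            double[] weights = new double[beta.length];
            for (int c = 0; c < beta.length; c++) {
                weights[c] = beta[c][j];
            }
            out.put(name, weights);
        }
        return out;
    }

    public static void main(String[] args) {
        // Two categories, eight hashed columns, one made-up non-zero weight.
        double[][] beta = new double[2][8];
        beta[1][hashIndex("value=a", 8)] = 0.5;
        Map<String, double[]> report = dissect(beta, Arrays.asList("value=a", "value=b"));
        for (Map.Entry<String, double[]> e : report.entrySet()) {
            System.out.println(e.getKey() + " -> " + Arrays.toString(e.getValue()));
        }
    }
}
```

If a model cannot hand back something shaped like `beta`, a lookup of this kind has nothing to read, which is the difficulty with Naive Bayes here.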
The easy way to get everything to work is to simply use a single categorical variable that can have four values. Pretend this variable is text. If you use hashed vector encoding, you should be able to continue, but you really need to use StaticWordEncoder (name is approximate).

Also, with a tiny example, NB will give unreasonably pessimistic results.

On Wed, Dec 14, 2011 at 6:01 AM, Daniele Volpi <[email protected]> wrote:

> The version is 0.6-SNAPSHOT
> From terminal both commands trainclassifier and testclassifier work.
> Actually my real purpose is to use the TrainNaiveBayesJob in order to
> obtain a StandardNaiveBayesClassifier that i can use with the
> ModelDissector class similiar to chapter 15 in Mahout In Action, maybe the
> procedure is completely wrong.
> Thank you
>
> On 14 December 2011 01:24, Ted Dunning <[email protected]> wrote:
>
> > Which version of Mahout?
> >
> > And what happens when you train the classifier from the command line?
> >
> > On Tue, Dec 13, 2011 at 2:27 PM, Daniele Volpi <[email protected]> wrote:
> >
> > > First of all i've converted the train files in the format:
> > > target[\t]terms
> > > through the BayesFileFormatter class.
> > > Then i've converted these files (one per category) in SequenceFile using
> > > the seqdirectory program.
> > > After that I ran this code:
> > >
> > > TrainNaiveBayesJob trainer = new TrainNaiveBayesJob();
> > > trainer.setConf(new Configuration());
> > >
> > > String[] params = {"-i" + inputPath, "-o" + outputPath, "-ow", "-el"};
> > > trainer.run(params);
> > >
> > > Here's the error message:
> > >
> > > java.lang.ClassCastException: org.apache.hadoop.io.Text cannot be cast to
> > > org.apache.mahout.math.VectorWritable
> > > at org.apache.mahout.classifier.naivebayes.training.IndexInstancesMapper.map(IndexInstancesMapper.java:1)
> > > at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:144)
> > > at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:764)
> > > at org.apache.hadoop.mapred.MapTask.run(MapTask.java:370)
> > > at org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:212)
> > >
> > > On 13 December 2011 19:52, Grant Ingersoll <[email protected]> wrote:
> > >
> > > > What steps have you done?
> > > >
> > > > On Dec 13, 2011, at 12:29 PM, Daniele Volpi wrote:
> > > >
> > > > > Hi everyone,
> > > > > I'm trying to implement the Naive Bayes classifier through the
> > > > > TrainNaiveBayesJob class.
> > > > > After convert the text files in the required sequencefile for the "run"
> > > > > method through the seqdirectory program i get this error:
> > > > >
> > > > > java.lang.ClassCastException: org.apache.hadoop.io.Text cannot be cast to
> > > > > org.apache.mahout.math.VectorWritable
> > > > >
> > > > > Do you have some hints on the right usage of this class?
> > > > >
> > > > > Thanks,
> > > > > Daniele Volpi
> > > >
> > > > --------------------------------------------
> > > > Grant Ingersoll
> > > > http://www.lucidimagination.com
