OK, I was thinking I could easily use the ModelDissector class, because it requires an AbstractVectorClassifier and the StandardNaiveBayesClassifier in the naivebayes package extends that class.
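
For reference, the "single categorical variable with four values, hashed" idea from Ted's message quoted below can be sketched in plain Java. This is only an illustration of hashed encoding under my own assumptions: HashedCategoryDemo, slot, encode, the variable name "category", the values "a" to "d" and the vector size 16 are all hypothetical, and none of this is Mahout's encoder API.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical stand-in for a hashed word encoder; not a Mahout class.
public class HashedCategoryDemo {

    // Map a (variable name, value) pair to a slot in a fixed-size vector.
    public static int slot(String name, String value, int numFeatures) {
        return Math.floorMod((name + ":" + value).hashCode(), numFeatures);
    }

    // Encode one categorical value as a vector with a single non-zero entry,
    // and record which slot the value landed in so weights can be named later.
    public static double[] encode(String name, String value, int numFeatures,
                                  Map<String, Set<Integer>> trace) {
        double[] vector = new double[numFeatures];
        int index = slot(name, value, numFeatures);
        vector[index] += 1.0;
        trace.computeIfAbsent(name + ":" + value, k -> new TreeSet<>()).add(index);
        return vector;
    }

    public static void main(String[] args) {
        Map<String, Set<Integer>> trace = new HashMap<>();
        // Assumed example: one variable that can take four values.
        for (String value : new String[] {"a", "b", "c", "d"}) {
            double[] v = encode("category", value, 16, trace);
            System.out.println(value + " -> " + Arrays.toString(v));
        }
        System.out.println("trace: " + trace);
    }
}
```

Two different values can hash to the same slot, so a real setup would pick a vector size large enough to make collisions unlikely.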
On 14 December 2011 14:42, Ted Dunning <[email protected]> wrote:
>
> I think that using the model dissector with NaiveBayes will not work
> easily. The assumption inside the model dissector is that there is a model
> matrix compatible with logistic regression to be had.
>
> The easy way to get everything to work is to simply use a single
> categorical variable that can have four values. Pretend this variable is
> text. If you use hashed vector encoding, you should be able to continue,
> but you really need to use StaticWordEncoder (name is approximate).
>
> Also, with a tiny example, NB will give unreasonably pessimistic results.
>
> On Wed, Dec 14, 2011 at 6:01 AM, Daniele Volpi <[email protected]> wrote:
>
> > The version is 0.6-SNAPSHOT.
> > From the terminal, both commands trainclassifier and testclassifier work.
> > Actually my real purpose is to use the TrainNaiveBayesJob in order to
> > obtain a StandardNaiveBayesClassifier that I can use with the
> > ModelDissector class, similar to chapter 15 in Mahout In Action; maybe the
> > procedure is completely wrong.
> > Thank you
> >
> > On 14 December 2011 01:24, Ted Dunning <[email protected]> wrote:
> >
> > > Which version of Mahout?
> > >
> > > And what happens when you train the classifier from the command line?
> > >
> > > On Tue, Dec 13, 2011 at 2:27 PM, Daniele Volpi <[email protected]> wrote:
> > >
> > > > First of all I've converted the train files to the format:
> > > > target[\t]terms
> > > > through the BayesFileFormatter class.
> > > > Then I've converted these files (one per category) into SequenceFiles
> > > > using the seqdirectory program.
> > > > After that I ran this code:
> > > >
> > > > TrainNaiveBayesJob trainer = new TrainNaiveBayesJob();
> > > > trainer.setConf(new Configuration());
> > > >
> > > > String[] params = {"-i" + inputPath, "-o" + outputPath, "-ow", "-el"};
> > > > trainer.run(params);
> > > >
> > > > Here's the error message:
> > > >
> > > > java.lang.ClassCastException: org.apache.hadoop.io.Text cannot be cast to org.apache.mahout.math.VectorWritable
> > > > at org.apache.mahout.classifier.naivebayes.training.IndexInstancesMapper.map(IndexInstancesMapper.java:1)
> > > > at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:144)
> > > > at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:764)
> > > > at org.apache.hadoop.mapred.MapTask.run(MapTask.java:370)
> > > > at org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:212)
> > > >
> > > > On 13 December 2011 19:52, Grant Ingersoll <[email protected]> wrote:
> > > >
> > > > > What steps have you done?
> > > > >
> > > > > On Dec 13, 2011, at 12:29 PM, Daniele Volpi wrote:
> > > > >
> > > > > > Hi everyone,
> > > > > > I'm trying to implement the Naive Bayes classifier through the
> > > > > > TrainNaiveBayesJob class.
> > > > > > After converting the text files into the required SequenceFile for
> > > > > > the "run" method through the seqdirectory program, I get this error:
> > > > > >
> > > > > > java.lang.ClassCastException: org.apache.hadoop.io.Text cannot be cast to
> > > > > > org.apache.mahout.math.VectorWritable
> > > > > >
> > > > > > Do you have some hints on the right usage of this class?
> > > > > >
> > > > > > Thanks,
> > > > > > Daniele Volpi
> > > > >
> > > > > --------------------------------------------
> > > > > Grant Ingersoll
> > > > > http://www.lucidimagination.com
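
The ClassCastException in the trace above says the mapper was handed a Text value where it expected a VectorWritable. My reading, which is an assumption and not something confirmed in this thread, is that the seqdirectory output still holds raw text and has to go through a vectorization step (seq2sparse, as far as I know) before training. The mismatch itself can be reproduced in a few lines of plain Java; TextValue, VectorValue and dimension are hypothetical stand-ins, not Hadoop or Mahout types.

```java
// Hypothetical stand-ins for the two value types; not Hadoop or Mahout classes.
public class ValueTypeMismatchDemo {

    static final class TextValue {
        final String text;
        TextValue(String text) { this.text = text; }
    }

    static final class VectorValue {
        final double[] values;
        VectorValue(double[] values) { this.values = values; }
    }

    // Stand-in for a mapper that is written to consume vector values only.
    // The cast fails at runtime when the stored record is text.
    public static int dimension(Object rawValue) {
        VectorValue vector = (VectorValue) rawValue;
        return vector.values.length;
    }

    public static void main(String[] args) {
        // A vectorized record is accepted.
        System.out.println(dimension(new VectorValue(new double[] {1.0, 0.0, 2.0})));

        // A raw text record triggers the same kind of failure as in the trace.
        try {
            dimension(new TextValue("some terms"));
        } catch (ClassCastException e) {
            System.out.println("value type mismatch: " + e.getMessage());
        }
    }
}
```

The point of the sketch is only that the failure comes from the type of the stored values, not from the training parameters.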
