Yes, Grant, that was the point of my first question. Now I'll take a look at the vector implementation.
Thanks again,
Daniele

On 14 December 2011 23:44, Grant Ingersoll <[email protected]> wrote:

> While Ted answered the Dissector question, your original issue, I believe, is
> that Mahout currently has two different NB implementations.
> trainclassifier/testclassifier use the old, word-based package, which requires
> Text as input. The new package, which TrainNaiveBayesJob uses, requires
> VectorWritables. For the latter case, you don't use the BayesFileFormatter
> at all. See the asf-email-examples for how to use the vector-based approach.
> I realize this is confusing, but we haven't yet made the transition fully to
> the new vector-based approach.
>
> -Grant
>
> On Dec 14, 2011, at 3:01 AM, Daniele Volpi wrote:
>
>> The version is 0.6-SNAPSHOT.
>> From the terminal, both commands, trainclassifier and testclassifier, work.
>> Actually, my real purpose is to use the TrainNaiveBayesJob in order to
>> obtain a StandardNaiveBayesClassifier that I can use with the
>> ModelDissector class, similar to chapter 15 in Mahout In Action. Maybe the
>> procedure is completely wrong.
>> Thank you
>>
>> On 14 December 2011 01:24, Ted Dunning <[email protected]> wrote:
>>
>>> Which version of Mahout?
>>>
>>> And what happens when you train the classifier from the command line?
>>>
>>> On Tue, Dec 13, 2011 at 2:27 PM, Daniele Volpi <[email protected]> wrote:
>>>
>>>> First of all, I've converted the training files into the format:
>>>> target[\t]terms
>>>> through the BayesFileFormatter class.
>>>> Then I've converted these files (one per category) into SequenceFiles using
>>>> the seqdirectory program.
>>>> After that I ran this code:
>>>>
>>>> TrainNaiveBayesJob trainer = new TrainNaiveBayesJob();
>>>> trainer.setConf(new Configuration());
>>>>
>>>> String[] params = {"-i" + inputPath, "-o" + outputPath, "-ow", "-el"};
>>>> trainer.run(params);
>>>>
>>>> Here's the error message:
>>>>
>>>> java.lang.ClassCastException: org.apache.hadoop.io.Text cannot be cast to org.apache.mahout.math.VectorWritable
>>>>   at org.apache.mahout.classifier.naivebayes.training.IndexInstancesMapper.map(IndexInstancesMapper.java:1)
>>>>   at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:144)
>>>>   at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:764)
>>>>   at org.apache.hadoop.mapred.MapTask.run(MapTask.java:370)
>>>>   at org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:212)
>>>>
>>>> On 13 December 2011 19:52, Grant Ingersoll <[email protected]> wrote:
>>>>
>>>>> What steps have you done?
>>>>>
>>>>> On Dec 13, 2011, at 12:29 PM, Daniele Volpi wrote:
>>>>>
>>>>>> Hi everyone,
>>>>>> I'm trying to implement the Naive Bayes classifier through the
>>>>>> TrainNaiveBayesJob class.
>>>>>> After converting the text files into the required SequenceFile for the
>>>>>> "run" method through the seqdirectory program, I get this error:
>>>>>>
>>>>>> java.lang.ClassCastException: org.apache.hadoop.io.Text cannot be cast to
>>>>>> org.apache.mahout.math.VectorWritable
>>>>>>
>>>>>> Do you have some hints on the right usage of this class?
>>>>>>
>>>>>> Thanks,
>>>>>> Daniele Volpi
>>>>>
>>>>> --------------------------------------------
>>>>> Grant Ingersoll
>>>>> http://www.lucidimagination.com
>
> --------------------------------------------
> Grant Ingersoll
> http://www.lucidimagination.com
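
For anyone finding this thread later, here is a rough sketch of why the cast in the stack trace above fails. It does not use the real Hadoop or Mahout classes: Text, VectorWritable, Mapper, VectorMapper and feed below are simplified, hypothetical stand-ins, and the only assumption is the one Grant describes, namely that the new trainer's mapper expects VectorWritable values while seqdirectory output carries Text values. Because Java generics are erased, the mismatch is only detected when the mapper is called.

```java
public class ValueTypeMismatchSketch {

    // Hypothetical stand-in for org.apache.hadoop.io.Text (not the real class).
    static final class Text {
        final String value;
        Text(String value) { this.value = value; }
    }

    // Hypothetical stand-in for org.apache.mahout.math.VectorWritable (not the real class).
    static final class VectorWritable {
        final double[] values;
        VectorWritable(double[] values) { this.values = values; }
    }

    // Hypothetical, simplified mapper contract: one value in, one description out.
    interface Mapper<V> {
        String map(V value);
    }

    // A mapper that, like the vector-based trainer, only understands vectors.
    static final class VectorMapper implements Mapper<VectorWritable> {
        @Override
        public String map(VectorWritable value) {
            return "vector of size " + value.values.length;
        }
    }

    // Plays the role of the framework: it hands the mapper whatever object was
    // read from the input file. The raw Mapper type mirrors the fact that the
    // value class is only known at runtime, so a wrong type fails inside map().
    @SuppressWarnings({"unchecked", "rawtypes"})
    static String feed(Mapper mapper, Object value) {
        try {
            return mapper.map(value);
        } catch (ClassCastException e) {
            return "ClassCastException";
        }
    }

    public static void main(String[] args) {
        Mapper<VectorWritable> mapper = new VectorMapper();
        // Vector input: what the new package expects.
        System.out.println(feed(mapper, new VectorWritable(new double[] {1.0, 0.0, 2.0})));
        // Text input: what seqdirectory output contains.
        System.out.println(feed(mapper, new Text("target\tsome terms")));
    }
}
```

The point of the sketch is only that the input has to be converted to vectors before training; the asf-email-examples Grant mentions show the actual Mahout steps.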
