Does the new approach do the same thing as the old approach?

On Thu, Dec 15, 2011 at 1:56 AM, Daniele Volpi <[email protected]> wrote:
> Yes Grant that was the point of my first question..
> Now I'll take a look at the vector implementation.
> Thanks again
> Daniele
>
> On 14 December 2011 23:44, Grant Ingersoll <[email protected]> wrote:
>> While Ted answered the Dissector question, your original issue, I believe,
>> is that Mahout currently has two different NB implementations.
>> trainclassifier/testclassifier use the old, word based package which
>> requires Text as input. The new package, which TrainNaiveBayesJob uses,
>> requires VectorWritables. For the latter case, you don't use the
>> BayesFileFormatter at all. See the asf-email-examples for how to use the
>> Vector based approach. I realize this is confusing, but we haven't yet made
>> the transition fully to the new vector based approach.
>>
>> -Grant
>>
>> On Dec 14, 2011, at 3:01 AM, Daniele Volpi wrote:
>>
>>> The version is 0.6-SNAPSHOT
>>> From terminal both commands trainclassifier and testclassifier work.
>>> Actually my real purpose is to use the TrainNaiveBayesJob in order to
>>> obtain a StandardNaiveBayesClassifier that i can use with the
>>> ModelDissector class similiar to chapter 15 in Mahout In Action, maybe the
>>> procedure is completely wrong.
>>> Thank you
>>>
>>>
>>> On 14 December 2011 01:24, Ted Dunning <[email protected]> wrote:
>>>
>>>> Which version of Mahout?
>>>>
>>>> And what happens when you train the classifier from the command line?
>>>>
>>>> On Tue, Dec 13, 2011 at 2:27 PM, Daniele Volpi <[email protected]> wrote:
>>>>
>>>>> First of all i've converted the train files in the format:
>>>>> target[\t]terms
>>>>> through the BayesFileFormatter class.
>>>>> Then i've converted these files (one per category) in SequenceFile using
>>>>> the seqdirectory program.
>>>>> After that I ran this code:
>>>>>
>>>>> TrainNaiveBayesJob trainer = new TrainNaiveBayesJob();
>>>>> trainer.setConf(new Configuration());
>>>>>
>>>>> String[] params = {"-i" + inputPath, "-o" + outputPath, "-ow", "-el"};
>>>>> trainer.run(params);
>>>>>
>>>>> Here's the error message:
>>>>>
>>>>> java.lang.ClassCastException: org.apache.hadoop.io.Text cannot be cast to
>>>>> org.apache.mahout.math.VectorWritable
>>>>> at org.apache.mahout.classifier.naivebayes.training.IndexInstancesMapper.map(IndexInstancesMapper.java:1)
>>>>> at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:144)
>>>>> at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:764)
>>>>> at org.apache.hadoop.mapred.MapTask.run(MapTask.java:370)
>>>>> at org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:212)
>>>>>
>>>>> On 13 December 2011 19:52, Grant Ingersoll <[email protected]> wrote:
>>>>>
>>>>>> What steps have you done?
>>>>>>
>>>>>> On Dec 13, 2011, at 12:29 PM, Daniele Volpi wrote:
>>>>>>
>>>>>>> Hi everyone,
>>>>>>> I'm trying to implement the Naive Bayes classifier through the
>>>>>>> TrainNaiveBayesJob class.
>>>>>>> After convert the text files in the required sequencefile for the "run"
>>>>>>> method through the seqdirectory program i get this error:
>>>>>>>
>>>>>>> java.lang.ClassCastException: org.apache.hadoop.io.Text cannot be cast to
>>>>>>> org.apache.mahout.math.VectorWritable
>>>>>>>
>>>>>>> Do you have some hints on the right usage of this class?
>>>>>>>
>>>>>>> Thanks,
>>>>>>> Daniele Volpi
>>>>>>
>>>>>> --------------------------------------------
>>>>>> Grant Ingersoll
>>>>>> http://www.lucidimagination.com
>>>>>>
>>
>> --------------------------------------------
>> Grant Ingersoll
>> http://www.lucidimagination.com
>>
-- Lance Norskog [email protected]
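
For readers wondering why the mismatch quoted above shows up as a ClassCastException inside the mapper rather than as a compile error, here is a minimal, self-contained sketch. It does not use Hadoop or Mahout: TextValue, VectorValue, ValueMapper, VectorMapper and runJob are hypothetical stand-ins invented for illustration. The assumption is only that a framework hands the mapper whatever value type is stored in the input file, so the mapper's type parameter is checked at runtime and not by the compiler.

```java
import java.util.Arrays;
import java.util.List;

public class ValueTypeMismatchDemo {
    // Hypothetical stand-ins for org.apache.hadoop.io.Text and
    // org.apache.mahout.math.VectorWritable; these are NOT the real classes.
    static class TextValue {
        final String text;
        TextValue(String text) { this.text = text; }
    }

    static class VectorValue {
        final double[] values;
        VectorValue(double[] values) { this.values = values; }
    }

    // A mapper typed on its input value, loosely in the spirit of a Hadoop Mapper.
    interface ValueMapper<V> {
        int map(V value);
    }

    // Expects vectors, the way the vector-based trainer's mapper does.
    static class VectorMapper implements ValueMapper<VectorValue> {
        @Override
        public int map(VectorValue value) {
            return value.values.length;
        }
    }

    // The "framework": record types are only known at runtime, so the
    // mapper is invoked through an unchecked view of its interface.
    @SuppressWarnings("unchecked")
    static String runJob(ValueMapper<?> mapper, List<Object> records) {
        ValueMapper<Object> untyped = (ValueMapper<Object>) mapper;
        try {
            int total = 0;
            for (Object record : records) {
                // The compiler-generated bridge method casts the record to
                // VectorValue here; a TextValue record fails that cast.
                total += untyped.map(record);
            }
            return "ok:" + total;
        } catch (ClassCastException e) {
            return "ClassCastException";
        }
    }

    public static void main(String[] args) {
        List<Object> vectors = Arrays.<Object>asList(new VectorValue(new double[] {1.0, 2.0}));
        List<Object> texts = Arrays.<Object>asList(new TextValue("target\tterms"));

        System.out.println(runJob(new VectorMapper(), vectors)); // prints "ok:2"
        System.out.println(runJob(new VectorMapper(), texts));   // prints "ClassCastException"
    }
}
```

In the thread's terms, the second call corresponds to feeding seqdirectory output (Text values) to a trainer whose mapper expects VectorWritable values. The remedy Grant describes is to vectorize the input first, not to change the mapper.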
