Does the new approach do the same thing as the old approach?

On Thu, Dec 15, 2011 at 1:56 AM, Daniele Volpi <[email protected]> wrote:
> Yes, Grant, that was the point of my first question.
> Now I'll take a look at the vector implementation.
> Thanks again
> Daniele
>
> On 14 December 2011 23:44, Grant Ingersoll <[email protected]> wrote:
>> While Ted answered the Dissector question, your original issue, I believe, 
>> is that Mahout currently has two different NB implementations.  
>> trainclassifier/testclassifier use the old, word based package which 
>> requires Text as input.  The new package, which TrainNaiveBayesJob uses, 
>> requires VectorWritables.  For the latter case, you don't use the 
>> BayesFileFormatter at all.  See the asf-email-examples for how to use the 
>> Vector based approach.  I realize this is confusing, but we haven't yet made 
>> the transition fully to the new vector based approach.
>>
>> -Grant
>>
>> On Dec 14, 2011, at 3:01 AM, Daniele Volpi wrote:
>>
>>> The version is 0.6-SNAPSHOT.
>>> From the terminal, both the trainclassifier and testclassifier commands work.
>>> My real purpose is to use TrainNaiveBayesJob to obtain a
>>> StandardNaiveBayesClassifier that I can use with the ModelDissector class,
>>> similar to chapter 15 of Mahout in Action. Maybe the procedure is
>>> completely wrong.
>>> Thank you
>>>
>>>
>>> On 14 December 2011 01:24, Ted Dunning <[email protected]> wrote:
>>>
>>>> Which version of Mahout?
>>>>
>>>> And what happens when you train the classifier from the command line?
>>>>
>>>> On Tue, Dec 13, 2011 at 2:27 PM, Daniele Volpi <[email protected]> wrote:
>>>>
>>>>> First of all, I converted the training files to the format:
>>>>> target[\t]terms
>>>>> using the BayesFileFormatter class.
>>>>> Then I converted these files (one per category) to SequenceFiles using
>>>>> the seqdirectory program.
>>>>> After that I ran this code:
>>>>>
>>>>> TrainNaiveBayesJob trainer = new TrainNaiveBayesJob();
>>>>> trainer.setConf(new Configuration());
>>>>>
>>>>> String[] params = {"-i" + inputPath, "-o" + outputPath, "-ow", "-el"};
>>>>> trainer.run(params);
>>>>>
>>>>> Here's the error message:
>>>>>
>>>>> java.lang.ClassCastException: org.apache.hadoop.io.Text cannot be cast to org.apache.mahout.math.VectorWritable
>>>>> at org.apache.mahout.classifier.naivebayes.training.IndexInstancesMapper.map(IndexInstancesMapper.java:1)
>>>>> at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:144)
>>>>> at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:764)
>>>>> at org.apache.hadoop.mapred.MapTask.run(MapTask.java:370)
>>>>> at org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:212)
>>>>>
>>>>> On 13 December 2011 19:52, Grant Ingersoll <[email protected]> wrote:
>>>>>
>>>>>> What steps have you done?
>>>>>>
>>>>>> On Dec 13, 2011, at 12:29 PM, Daniele Volpi wrote:
>>>>>>
>>>>>>> Hi everyone,
>>>>>>> I'm trying to implement the Naive Bayes classifier through the
>>>>>>> TrainNaiveBayesJob class.
>>>>>>> After converting the text files into the SequenceFile required by the
>>>>>>> "run" method, using the seqdirectory program, I get this error:
>>>>>>>
>>>>>>> java.lang.ClassCastException: org.apache.hadoop.io.Text cannot be cast to org.apache.mahout.math.VectorWritable
>>>>>>>
>>>>>>> Do you have some hints on the right usage of this class?
>>>>>>>
>>>>>>> Thanks,
>>>>>>> Daniele Volpi
>>>>>>
>>>>>> --------------------------------------------
>>>>>> Grant Ingersoll
>>>>>> http://www.lucidimagination.com
>>>>>>
>>
>> --------------------------------------------
>> Grant Ingersoll
>> http://www.lucidimagination.com



-- 
Lance Norskog
[email protected]
