Ted answered the Dissector question, but I believe your original issue is that 
Mahout currently has two different Naive Bayes implementations.  
trainclassifier/testclassifier use the old, word-based package, which requires 
Text as input.  The new package, which TrainNaiveBayesJob uses, requires 
VectorWritables, which is why feeding it the Text output of seqdirectory gives 
you the ClassCastException.  For the new package, you don't use the 
BayesFileFormatter at all.  See the asf-email-examples for how to use the 
vector-based approach.  I realize this is confusing, but we haven't yet fully 
made the transition to the new vector-based approach.
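
To make the mismatch concrete, here is a minimal plain-Java sketch (not Mahout 
code).  The names VectorInputSketch, sumWeights, buildDictionary and vectorize 
are made up for illustration, and a double[] stands in for a VectorWritable.  
It shows the same kind of ClassCastException you get when a consumer that 
expects vectors is handed raw text, plus a toy term-count vectorization of the 
sort that has to happen before training.  In Mahout itself that conversion is 
done by the vectorization tools (seq2sparse, as far as I recall), not by hand.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Illustration only: none of these are Mahout or Hadoop classes.
class VectorInputSketch {

    // Hypothetical stand-in for a mapper that expects vector values.
    // The cast throws ClassCastException if the value is still text.
    static double sumWeights(Object value) {
        double[] vector = (double[]) value;
        double sum = 0.0;
        for (double weight : vector) {
            sum += weight;
        }
        return sum;
    }

    // Assign each distinct lower-cased term a column index, in order seen.
    static Map<String, Integer> buildDictionary(List<String> documents) {
        Map<String, Integer> dictionary = new LinkedHashMap<>();
        for (String document : documents) {
            for (String term : document.toLowerCase().split("\\s+")) {
                if (!term.isEmpty() && !dictionary.containsKey(term)) {
                    dictionary.put(term, dictionary.size());
                }
            }
        }
        return dictionary;
    }

    // Turn a document into a term-count vector; unknown terms are skipped.
    static double[] vectorize(String document, Map<String, Integer> dictionary) {
        double[] vector = new double[dictionary.size()];
        for (String term : document.toLowerCase().split("\\s+")) {
            Integer index = dictionary.get(term);
            if (index != null) {
                vector[index] += 1.0;
            }
        }
        return vector;
    }

    public static void main(String[] args) {
        List<String> docs = Arrays.asList(
                "mahout trains naive bayes",
                "naive bayes needs vectors");
        Map<String, Integer> dictionary = buildDictionary(docs);

        // Passing text where a vector is expected fails at the cast.
        try {
            sumWeights(docs.get(0));
        } catch (ClassCastException e) {
            System.out.println("Raw text rejected: " + e.getMessage());
        }

        // After vectorizing, the same consumer accepts the input.
        double[] vector = vectorize(docs.get(0), dictionary);
        System.out.println(Arrays.toString(vector) + " sum=" + sumWeights(vector));
    }
}
```

The point is only that the input type has to change before the trainer sees 
it; the real pipeline in the asf-email-examples is the thing to copy.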

-Grant

On Dec 14, 2011, at 3:01 AM, Daniele Volpi wrote:

> The version is 0.6-SNAPSHOT
> From the terminal, both the trainclassifier and testclassifier commands work.
> Actually, my real purpose is to use the TrainNaiveBayesJob in order to
> obtain a StandardNaiveBayesClassifier that I can use with the
> ModelDissector class, similar to chapter 15 in Mahout In Action. Maybe the
> procedure is completely wrong.
> Thank you
> 
> 
> On 14 December 2011 01:24, Ted Dunning <[email protected]> wrote:
> 
>> Which version of Mahout?
>> 
>> And what happens when you train the classifier from the command line?
>> 
>> On Tue, Dec 13, 2011 at 2:27 PM, Daniele Volpi <[email protected]> wrote:
>> 
>>> First of all, I converted the training files into the format:
>>> target[\t]terms
>>> using the BayesFileFormatter class.
>>> Then I converted these files (one per category) into SequenceFiles using
>>> the seqdirectory program.
>>> After that I ran this code:
>>> 
>>> TrainNaiveBayesJob trainer = new TrainNaiveBayesJob();
>>> trainer.setConf(new Configuration());
>>> 
>>> String[] params = {"-i" + inputPath, "-o" + outputPath, "-ow", "-el"};
>>> trainer.run(params);
>>> 
>>> Here's the error message:
>>> 
>>> java.lang.ClassCastException: org.apache.hadoop.io.Text cannot be cast to org.apache.mahout.math.VectorWritable
>>> at org.apache.mahout.classifier.naivebayes.training.IndexInstancesMapper.map(IndexInstancesMapper.java:1)
>>> at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:144)
>>> at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:764)
>>> at org.apache.hadoop.mapred.MapTask.run(MapTask.java:370)
>>> at org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:212)
>>> 
>>> On 13 December 2011 19:52, Grant Ingersoll <[email protected]> wrote:
>>> 
>>>> What steps have you done?
>>>> 
>>>> On Dec 13, 2011, at 12:29 PM, Daniele Volpi wrote:
>>>> 
>>>>> Hi everyone,
>>>>> I'm trying to implement the Naive Bayes classifier through the
>>>>> TrainNaiveBayesJob class.
>>>>> After converting the text files into the sequence file required by the
>>>>> "run" method, using the seqdirectory program, I get this error:
>>>>> 
>>>>> java.lang.ClassCastException: org.apache.hadoop.io.Text cannot be cast to
>>>>> org.apache.mahout.math.VectorWritable
>>>>> 
>>>>> Do you have some hints on the right usage of this class?
>>>>> 
>>>>> Thanks,
>>>>> Daniele Volpi
>>>> 
>>>> --------------------------------------------
>>>> Grant Ingersoll
>>>> http://www.lucidimagination.com
>>>> 
>>>> 
>>>> 
>>>> 
>>> 
>> 

--------------------------------------------
Grant Ingersoll
http://www.lucidimagination.com


