I'm a little confused about the proper way to format the data for training a
Naive Bayes classifier. Is it possible to give the classifier tf-idf vectors
generated by seq2sparse? I have a sequence file in which each key is the
target variable and each value is a tf-idf vector. When I use this file as
the input to trainclassifier, I get the following error:

Running on hadoop, using HADOOP_HOME=/home/kevin/Hadoop/hadoop-0.20.2/
No HADOOP_CONF_DIR set, using /home/kevin/Hadoop/hadoop-0.20.2//src/conf
11/07/12 09:27:13 WARN driver.MahoutDriver: No trainclassifier.props found on classpath, will use command-line arguments only
11/07/12 09:27:13 INFO bayes.TrainClassifier: Training Bayes Classifier
11/07/12 09:27:13 INFO bayes.BayesDriver: Reading features...
11/07/12 09:27:13 WARN mapred.JobClient: Use GenericOptionsParser for parsing the arguments. Applications should implement Tool for the same.
11/07/12 09:27:14 INFO mapred.FileInputFormat: Total input paths to process : 1
11/07/12 09:27:14 INFO mapred.JobClient: Running job: job_201107120921_0001
11/07/12 09:27:15 INFO mapred.JobClient:  map 0% reduce 0%
11/07/12 09:27:24 INFO mapred.JobClient: Task Id : attempt_201107120921_0001_m_000000_0, Status : FAILED
java.lang.RuntimeException: Error in configuring object
        at org.apache.hadoop.util.ReflectionUtils.setJobConf(ReflectionUtils.java:93)
        at org.apache.hadoop.util.ReflectionUtils.setConf(ReflectionUtils.java:64)
        at org.apache.hadoop.util.ReflectionUtils.newInstance(ReflectionUtils.java:117)
        at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:354)
        at org.apache.hadoop.mapred.MapTask.run(MapTask.java:307)
        at org.apache.hadoop.mapred.Child.main(Child.java:170)
Caused by: java.lang.reflect.InvocationTargetException
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
        at java.lang.reflect.Method.invoke(Method.java:597)
        at org.apache.hadoop.util.ReflectionUtils.setJobConf(ReflectionUtils.java:88)
        ... 5 more


Thanks

--
View this message in context: 
http://lucene.472066.n3.nabble.com/Using-tf-idf-vectors-to-train-Naive-Bayes-tp3162590p3162590.html
Sent from the Mahout User List mailing list archive at Nabble.com.
