I just realized that there is a mismatch between the trainclassifier command and the TrainClassifier.java program.
Here is why:

1. I run the trainclassifier command with the source set to hdfs and provide input and output locations on my HDFS. Using the training files, the program generates the model and puts it on my HDFS, as expected. When I instead provide input and output locations on my local fs while the source is still set to hdfs, it complains with an "Input path does not exist" error, which is understandable.

2. However, when I run the TrainClassifier.java program with the source set to hdfs and the input and output set to locations on my local fs, it accepts the arguments with no complaints and generates the model on my local fs (instead of HDFS).

In addition, the models generated by these two programs are slightly different as far as the log files go.

Is this a known case, or am I missing something?

--
View this message in context: http://lucene.472066.n3.nabble.com/trainclassifier-as-a-command-vs-TrainClassifier-java-tp3508652p3508652.html
Sent from the Mahout User List mailing list archive at Nabble.com.
