I just realized that there is a mismatch between the trainclassifier command
and the TrainClassifier.java program.

Here is why:

1. I run the trainclassifier program with source:hdfs, so I provide input and
output locations on my hdfs. Using the training files, the program generates
the model and puts it on my hdfs, as expected. When I instead provide input and
output locations on my local fs while the source is still set to hdfs, it fails
with an "Input path does not exist" error, which is understandable.

2. However, when I run the TrainClassifier.java program with the source set to
hdfs and the input and output set to locations on my local fs, it accepts the
arguments without complaint and generates the model on my local fs (instead of
hdfs).
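
One guess on my part, which I have not confirmed against the Mahout source: a
path given without a scheme is resolved against Hadoop's default filesystem
(fs.default.name, or fs.defaultFS in newer releases), and that default is
file:/// when no cluster configuration is on the classpath. If the two programs
run with different configurations, the same path string would land on different
filesystems. The sketch below is only a toy model of that resolution rule using
java.net.URI. It does not call Hadoop or Mahout, and the class name, host name,
port and path are all made up for illustration.

```java
import java.net.URI;

// Toy model only: mimics how a scheme-less path could be qualified against a
// default filesystem URI. It does NOT use the Hadoop or Mahout APIs.
class FsResolveSketch {

    // Returns the path unchanged if it already names a filesystem
    // (e.g. hdfs://... or file:///...), otherwise resolves it against
    // the given default filesystem URI.
    static URI resolve(String defaultFs, String path) {
        URI p = URI.create(path);
        if (p.getScheme() != null) {
            return p;
        }
        return URI.create(defaultFs).resolve(p);
    }

    public static void main(String[] args) {
        String path = "/user/me/input"; // hypothetical input location

        // Assumed: cluster configuration present, default fs points at hdfs.
        System.out.println(resolve("hdfs://namenode:8020/", path));

        // Assumed: no cluster configuration, default fs is the local one.
        System.out.println(resolve("file:///", path));
    }
}
```

If this guess is right, passing fully qualified paths (with an explicit hdfs://
or file:// prefix) to both programs should make them behave the same way.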

In addition, judging by the log files, the models generated by these two
programs are slightly different.

Is this a known issue, or am I missing something?
