[ https://issues.apache.org/jira/browse/MAHOUT-509?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Joe Prasanna Kumar updated MAHOUT-509:
--------------------------------------
Attachment: MAHOUT-509-fix-TestClassifier.patch
I have fixed TestClassifier by making the ngram, classifierType and datasource
parameters optional, and verified that this works. To test a classifier (for
example, with the wikipedia example), the command would be:
{code}
$MAHOUT_HOME/bin/mahout testclassifier -m wikipediamodel -d wikipediainput -method mapreduce
{code}
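As a rough illustration of what "optional with a default" means for these three parameters, here is a minimal, self-contained sketch. It is not the code from the attached patch: the class `ClassifierOptionDefaults` and its `Map`-based input are hypothetical, the "hdfs" and "bayes" defaults come from the issue description quoted below, and the n-gram default of 1 is an assumption made only for the example.

```java
import java.util.Map;

// Hypothetical helper, not part of Mahout: resolves optional command-line
// values against defaults instead of requiring the user to pass them.
final class ClassifierOptionDefaults {
    // Defaults named in the issue description: datasource "hdfs", classifierType "bayes".
    static final String DEFAULT_DATA_SOURCE = "hdfs";
    static final String DEFAULT_CLASSIFIER_TYPE = "bayes";
    // Assumed value for illustration only; the issue does not state the n-gram default.
    static final int DEFAULT_NGRAM = 1;

    private ClassifierOptionDefaults() {}

    // Only an explicit "hbase" overrides the default; a missing value (null) yields "hdfs".
    static String dataSource(Map<String, String> parsed) {
        String value = parsed.get("datasource");
        return "hbase".equalsIgnoreCase(value) ? "hbase" : DEFAULT_DATA_SOURCE;
    }

    // Only an explicit "cbayes" overrides the default; a missing value (null) yields "bayes".
    static String classifierType(Map<String, String> parsed) {
        String value = parsed.get("classifierType");
        return "cbayes".equalsIgnoreCase(value) ? "cbayes" : DEFAULT_CLASSIFIER_TYPE;
    }

    // A missing n-gram size falls back to the assumed default.
    static int ngram(Map<String, String> parsed) {
        String value = parsed.get("ngram");
        return value == null ? DEFAULT_NGRAM : Integer.parseInt(value);
    }
}
```

With an empty map, which corresponds to the short command above where none of the three options is passed, the helper resolves to "hdfs", "bayes" and the assumed n-gram size.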
> Options in Bayes TrainClassifier and TestClassifier
> ---------------------------------------------------
>
> Key: MAHOUT-509
> URL: https://issues.apache.org/jira/browse/MAHOUT-509
> Project: Mahout
> Issue Type: Bug
> Components: Classification
> Reporter: Joe Prasanna Kumar
> Assignee: Robin Anil
> Priority: Minor
> Fix For: 0.4
>
> Attachments: MAHOUT-509-fix-TestClassifier.patch, MAHOUT-509.patch,
> MAHOUT-509_1.patch
>
>
> Hi all,
> As I was going through the wikipedia example, I ran into a situation with
> TrainClassifier in which some of the options that have default values are
> actually mandatory.
> The documentation / command line help says that:
> The default source (--datasource) is hdfs, but TrainClassifier calls
> withRequired(true) while building the --datasource option. The code checks
> whether dataSourceType is hbase and otherwise sets it to hdfs, so
> withRequired should be set to false.
> The default --classifierType is bayes, but withRequired is set to true and we
> have code like
>   if ("bayes".equalsIgnoreCase(classifierType)) {
>     log.info("Training Bayes Classifier");
>     trainNaiveBayes(inputPath, outputPath, params);
>
>   } else if ("cbayes".equalsIgnoreCase(classifierType)) {
>     log.info("Training Complementary Bayes Classifier");
>     // setup the HDFS and copy the files there, then run the trainer
>     trainCNaiveBayes(inputPath, outputPath, params);
>   }
> which should be changed to
>   if ("cbayes".equalsIgnoreCase(classifierType)) {
>     log.info("Training Complementary Bayes Classifier");
>     trainCNaiveBayes(inputPath, outputPath, params);
>
>   } else {
>     log.info("Training Bayes Classifier");
>     // setup the HDFS and copy the files there, then run the trainer
>     trainNaiveBayes(inputPath, outputPath, params);
>   }
> Please let me know if this looks valid and I'll submit a patch for a JIRA
> issue.
> Regards,
> Joe.
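To make the difference between the two dispatch shapes in the description concrete, here is a small sketch under stated assumptions. The class `ClassifierDispatchSketch` is hypothetical, and the real trainer calls (`trainNaiveBayes`, `trainCNaiveBayes`) are replaced by returned labels so the control flow can be checked in isolation.

```java
// Hypothetical stand-in for the dispatch in TrainClassifier; trainer calls
// are replaced by returned labels so only the branching is exercised.
final class ClassifierDispatchSketch {
    private ClassifierDispatchSketch() {}

    // Original shape: both branches test the value, so a missing option
    // (null) or an unknown type matches neither and no trainer is selected.
    static String originalDispatch(String classifierType) {
        if ("bayes".equalsIgnoreCase(classifierType)) {
            return "bayes";
        } else if ("cbayes".equalsIgnoreCase(classifierType)) {
            return "cbayes";
        }
        return "none";
    }

    // Proposed shape: only "cbayes" is tested; everything else, including
    // a missing option (null), falls through to plain Bayes.
    static String proposedDispatch(String classifierType) {
        if ("cbayes".equalsIgnoreCase(classifierType)) {
            return "cbayes";
        }
        return "bayes";
    }
}
```

One consequence of the proposed shape worth keeping in mind: a mistyped value such as "cbaye" would also fall through to plain Bayes rather than being rejected, so the fallback trades strict validation for a working default.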
--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.