RowSimilarityJob should exit immediately if an invalid similarity measure is
specified, and it would be nice to have an --overwrite option for the
RowSimilarityJob CLI
-------------------------------------------------------------------------------------------------------------------------------------------------------------------
Key: MAHOUT-964
URL: https://issues.apache.org/jira/browse/MAHOUT-964
Project: Mahout
Issue Type: Improvement
Components: Math
Affects Versions: 0.6
Environment: Mahout 0.6 snapshot from trunk
Reporter: Suneel Marthi
1. If an invalid similarity measure is passed to RowSimilarityJob, the job
currently throws a ClassCastException but still goes on to execute all of the
subsequent tasks: VectorNormalizer, Cooccurrences Mapper and UnSymmetrify
Mapper. The process should exit early instead of invoking these tasks, all of
which fail anyway.
2. It would be nice to have an --overwrite option for the command line
interface that deletes the temp and output paths at the beginning of
RowSimilarityJob execution, similar to what is done in seq2sparse and
seqdirectory. When running RowSimilarityJob repeatedly with different
similarity measures, I should not have to delete the temp and output paths
by hand before each invocation.
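
A rough sketch of both requests, to make the intended behaviour concrete. It
is not the RowSimilarityJob code: the Measure enum, the class name
RowSimilarityCliSketch and the run(...) signature are made up for
illustration, and the cleanup uses java.nio on the local file system, whereas
the real job would presumably go through the Hadoop file system API. The idea
is simply to resolve the measure first, return a non-zero code if that fails,
and only then (optionally) clear the temp and output paths.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.FileVisitResult;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.SimpleFileVisitor;
import java.nio.file.attribute.BasicFileAttributes;
import java.util.Arrays;
import java.util.Locale;

// Hypothetical stand-in for the supported measures; not Mahout's actual enum.
enum Measure { COSINE, TANIMOTO, LOGLIKELIHOOD }

final class RowSimilarityCliSketch {

  // Point 1: resolve the measure name up front and reject unknown names.
  static Measure resolve(String name) {
    if (name == null || name.trim().isEmpty()) {
      throw new IllegalArgumentException("No similarity measure specified");
    }
    try {
      return Measure.valueOf(name.trim().toUpperCase(Locale.ROOT));
    } catch (IllegalArgumentException e) {
      throw new IllegalArgumentException("Unknown similarity measure '" + name
          + "', expected one of " + Arrays.toString(Measure.values()), e);
    }
  }

  // Point 2: remove a directory tree if it exists (local file system only).
  static void deleteRecursively(Path root) {
    if (!Files.exists(root)) {
      return;
    }
    try {
      Files.walkFileTree(root, new SimpleFileVisitor<Path>() {
        @Override
        public FileVisitResult visitFile(Path file, BasicFileAttributes attrs)
            throws IOException {
          Files.delete(file);
          return FileVisitResult.CONTINUE;
        }

        @Override
        public FileVisitResult postVisitDirectory(Path dir, IOException exc)
            throws IOException {
          if (exc != null) {
            throw exc;
          }
          Files.delete(dir); // directory is empty by now
          return FileVisitResult.CONTINUE;
        }
      });
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
  }

  // Returns 0 if the phases could be started, -1 if the arguments were rejected.
  static int run(String measureName, boolean overwrite, Path tempDir, Path outputDir) {
    final Measure measure;
    try {
      measure = resolve(measureName);
    } catch (IllegalArgumentException e) {
      System.err.println(e.getMessage());
      return -1; // fail fast: nothing is deleted and no phase is started
    }
    if (overwrite) {
      deleteRecursively(tempDir);
      deleteRecursively(outputDir);
    }
    System.out.println("Would start the similarity phases with " + measure);
    return 0;
  }
}
```

Validating before deleting is deliberate in this sketch: a typo in the measure
name should not wipe the output of a previous successful run.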