[ https://issues.apache.org/jira/browse/MAHOUT-964?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Suneel Marthi updated MAHOUT-964:
---------------------------------

    Description: 
1. If an invalid similarity measure is specified as input to 
RowSimilarityJob, it currently throws a ClassNotFoundException but still 
proceeds to execute all of the subsequent tasks: VectorNormalizer, 
Cooccurrences Mapper and UnSymmetrify Mapper. The job should exit early 
instead of invoking the subsequent tasks, all of which fail anyway.

2. It would be nice to have an --overwrite option for the command-line 
interface that deletes the temp and output paths at the beginning of 
RowSimilarityJob execution, similar to what seq2sparse and seqdirectory do. 
If I run RowSimilarityJob repeatedly with different similarity measures, I 
should not have to delete my temp and output paths before each run.

  was:
1. If an invalid similarity measure is specified as input to 
RowSimilarityJob, it currently throws a ClassCastException but still 
proceeds to execute all of the subsequent tasks: VectorNormalizer, 
Cooccurrences Mapper and UnSymmetrify Mapper. The job should exit early 
instead of invoking the subsequent tasks, all of which fail anyway.

2. It would be nice to have an --overwrite option for the command-line 
interface that deletes the temp and output paths at the beginning of 
RowSimilarityJob execution, similar to what seq2sparse and seqdirectory do. 
If I run RowSimilarityJob repeatedly with different similarity measures, I 
should not have to delete my temp and output paths before each run.

    
> RowSimilarityJob should exit immediately if an invalid similarity measure 
> is specified and it would be nice to have an --overwrite option for the 
> RowSimilarityJob CLI
> -------------------------------------------------------------------------------------------------------------------------------------------------------------------
>
>                 Key: MAHOUT-964
>                 URL: https://issues.apache.org/jira/browse/MAHOUT-964
>             Project: Mahout
>          Issue Type: Improvement
>          Components: Math
>    Affects Versions: 0.6
>         Environment: Mahout 0.6 snapshot from trunk
>            Reporter: Suneel Marthi
>         Attachments: Mahout-964.patch
>
>
> 1. If an invalid similarity measure is specified as input to 
> RowSimilarityJob, it currently throws a ClassNotFoundException but still 
> proceeds to execute all of the subsequent tasks: VectorNormalizer, 
> Cooccurrences Mapper and UnSymmetrify Mapper. The job should exit early 
> instead of invoking the subsequent tasks, all of which fail anyway.
> 2. It would be nice to have an --overwrite option for the command-line 
> interface that deletes the temp and output paths at the beginning of 
> RowSimilarityJob execution, similar to what seq2sparse and seqdirectory do. 
> If I run RowSimilarityJob repeatedly with different similarity measures, I 
> should not have to delete my temp and output paths before each run.
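
To make the two requests concrete, here is a minimal sketch of the intended 
control flow: resolve the similarity measure class before any phase is 
launched, and clear the temp and output paths only when overwrite is 
requested. It is not the attached patch and does not use Mahout or Hadoop 
APIs. The class RowSimilarityPreflight, its methods, and the use of local 
java.io.File paths in place of HDFS paths are all assumptions made for 
illustration.

```java
import java.io.File;

// Hypothetical illustration only; none of these names come from Mahout.
final class RowSimilarityPreflight {

    // Fail fast: try to load the measure class before any job phase starts.
    static Class<?> resolveMeasure(String className) {
        try {
            return Class.forName(className);
        } catch (ClassNotFoundException e) {
            throw new IllegalArgumentException(
                "Unknown similarity measure: " + className, e);
        }
    }

    // Recursively delete a local path; does nothing if the path is absent.
    static void deleteRecursively(File path) {
        File[] children = path.listFiles(); // null for plain files and missing paths
        if (children != null) {
            for (File child : children) {
                deleteRecursively(child);
            }
        }
        if (path.exists() && !path.delete()) {
            throw new IllegalStateException("Could not delete " + path);
        }
    }

    // Returns 0 if the (omitted) phases may run, -1 on an early exit.
    static int run(String measureClass, boolean overwrite, File tempDir, File outputDir) {
        try {
            resolveMeasure(measureClass);
        } catch (IllegalArgumentException e) {
            System.err.println(e.getMessage());
            return -1; // exit before touching any path or starting any phase
        }
        if (overwrite) {
            deleteRecursively(tempDir);
            deleteRecursively(outputDir);
        }
        // The normalization, co-occurrence and unsymmetrify phases would start here.
        return 0;
    }

    public static void main(String[] args) {
        File temp = new File("temp");
        File out = new File("output");
        // java.lang.String stands in for a loadable measure class in this sketch.
        System.out.println(run("no.such.SimilarityMeasure", true, temp, out));
        System.out.println(run("java.lang.String", true, temp, out));
    }
}
```

Validating before deleting means a mistyped measure name leaves the results 
of a previous run in place. In a real Hadoop job the deletion would go 
through the Hadoop file system API rather than java.io.File.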

