Should compress output of Job pairwiseSimilarity and Job asMatrix
-----------------------------------------------------------------
Key: MAHOUT-474
URL: https://issues.apache.org/jira/browse/MAHOUT-474
Project: Mahout
Issue Type: Improvement
Reporter: Han Hui Wen
!https://issues.apache.org/jira/secure/thumbnail/12451985/12451985_RowSimilarityJob-CooccurrencesMapper-SimilarityReducer.jpg!
The picture above shows that the output of pairwiseSimilarity is very
large, so we should compress it:
SequenceFileOutputFormat.setOutputCompressionType(job, style);
SequenceFileOutputFormat.setCompressOutput(job, compress);
SequenceFileOutputFormat.setOutputCompressorClass(job, codecClass);
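
As a minimal sketch of how the three calls above might be filled in, assuming the job uses the new-API org.apache.hadoop.mapreduce.Job: the class name CompressedOutputConfig and the method enableBlockCompression are hypothetical, and the choice of BLOCK compression with DefaultCodec is only an illustration, not what Mahout necessarily uses.

```java
import org.apache.hadoop.io.SequenceFile.CompressionType;
import org.apache.hadoop.io.compress.DefaultCodec;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.lib.output.SequenceFileOutputFormat;

// Hypothetical helper, not part of Mahout.
public final class CompressedOutputConfig {

  private CompressedOutputConfig() {}

  // Turns on compressed SequenceFile output for the given job.
  static void enableBlockCompression(Job job) {
    job.setOutputFormatClass(SequenceFileOutputFormat.class);
    // Corresponds to setCompressOutput(job, compress) with compress = true.
    SequenceFileOutputFormat.setCompressOutput(job, true);
    // Corresponds to setOutputCompressionType(job, style); BLOCK is an assumed choice.
    SequenceFileOutputFormat.setOutputCompressionType(job, CompressionType.BLOCK);
    // Corresponds to setOutputCompressorClass(job, codecClass); DefaultCodec is an assumed choice.
    SequenceFileOutputFormat.setOutputCompressorClass(job, DefaultCodec.class);
  }
}
```

The helper would be called once per job (for example on the pairwiseSimilarity and asMatrix jobs) before submission. BLOCK compression compresses batches of records together, which tends to suit outputs with many small, similar records; RECORD is the alternative if per-record access matters more.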