Just to make sure I understand correctly: the docs for "clean SVD" at 
https://cwiki.apache.org/confluence/display/MAHOUT/Dimensional+Reduction are 
not correct, right?

Looking at the code, the SVD command requires --Dmapred.input.dir (soon to 
be --input like everything else, see MAHOUT-461), a --tempDir, and 
--Dmapred.output.dir (soon to be --output).  Then, in the cleansvd command, 
--eigenInput should actually refer to the SVD output directory, not the 
tempDir as the docs suggest, right?
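
To make the question concrete, here is roughly how I picture the two 
invocations.  This is only a sketch: the paths are placeholders I made up, the 
way values are attached to the flags is my assumption, and every other option 
the commands need is left out.  The commands are echoed rather than run, so 
nothing here depends on a Mahout install.

```shell
#!/bin/sh
# Placeholder paths (hypothetical, not from the docs)
INPUT=/path/to/corpus
TMP=/path/to/svd-tmp
OUTPUT=/path/to/svd-output

# Step 1: svd, with input, temp and output directories as I read the code.
# Other required options are omitted on purpose.
SVD_CMD="bin/mahout svd --Dmapred.input.dir $INPUT --tempDir $TMP --Dmapred.output.dir $OUTPUT"

# Step 2: cleansvd, where --eigenInput points at the svd OUTPUT directory,
# not at the tempDir as the wiki page suggests.
CLEAN_CMD="bin/mahout cleansvd --eigenInput $OUTPUT"

# Dry run: print the commands instead of executing them
echo "$SVD_CMD"
echo "$CLEAN_CMD"
```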

Also, any recommendations on setting maxError and minEigenValue?  What are the 
tradeoffs I'm making there?  I mean, I suppose maxError is some measure of 
convergence and minEigenValue is just as it sounds, but what are the practical 
implications of those settings?  Are the values in the example good starting 
points?

Thanks,
Grant
