Just to make sure I understand: the docs for "clean SVD" at https://cwiki.apache.org/confluence/display/MAHOUT/Dimensional+Reduction are not correct, right?
Looking at the code, the svd command requires --Dmapred.input.dir (soon to be --input like everything else, see MAHOUT-461), a --tempDir, and --Dmapred.output.dir (soon to be --output). Then, in the cleansvd command, --eigenInput should actually refer to the output directory, not the tempDir as the docs suggest, right?

Also, any recommendations on setting maxError and minEigenValue? What tradeoffs am I making there? I suppose maxError is some measure of convergence and minEigenValue is just what it sounds like, but what are the practical implications of those settings? Are the values in the example good starting points?

Thanks,
Grant
