Also, cleansvd appears to be spewing a bunch of numbers (something about 
largestCleanEigens) to the log, which makes it almost completely unreadable.  
Any objection to my demoting that output to debug level at a minimum?  Or can 
it be removed?

On Aug 8, 2010, at 3:35 PM, Grant Ingersoll wrote:

> Just to make sure I'm understanding, the docs for "clean SVD" at 
> https://cwiki.apache.org/confluence/display/MAHOUT/Dimensional+Reduction are 
> not correct, right?
> 
> In looking at the code, the SVD command requires --Dmapred.input.dir (soon to 
> be --input like everything else, see MAHOUT-461), a --tempDir, and 
> --Dmapred.output.dir (soon to be --output).  Then, in the cleansvd command, 
> the --eigenInput should actually refer to the output directory, not the 
> tempDir as the docs suggest, right?
> 
> Also, any recommendations on setting maxError and minEigenValue?  What are 
> the tradeoffs I'm making there?  I mean, I suppose maxError is some measure 
> of convergence and minEigenValue is just as it sounds, but what are the 
> practical implications of those settings?  Are the values in the example good 
> starting points?
> 
> Thanks,
> Grant

