Can you update the wiki? -Grant
On Sep 14, 2010, at 1:44 PM, Jeff Eastman wrote:

> Here's the new set of mahout svd arguments. Entries --cleansvd, --maxError,
> --minEigenvalue and --inMemory have been added in r997007. See the new tests
> in TestDistributedLanczosSolverCLI for examples of both forms:
>
>   --input (-i) input                     Path to job input directory.
>   --output (-o) output                   The directory pathname for output.
>   --numRows (-nr) numRows                Number of rows of the input matrix
>   --numCols (-nc) numCols                Number of columns of the input matrix
>   --rank (-r) rank                       Desired decomposition rank (note:
>                                          only roughly 1/4 to 1/3 of these will
>                                          have the top portion of the spectrum)
>   --symmetric (-sym) symmetric           Is the input matrix square and
>                                          symmetric?
>   --cleansvd (-cl) cleansvd              Run the EigenVerificationJob to clean
>                                          the eigenvectors after SVD
>   --maxError (-err) maxError             Maximum acceptable error
>   --minEigenvalue (-mev) minEigenvalue   Minimum eigenvalue to keep the vector
>                                          for
>   --inMemory (-mem) inMemory             Buffer eigen matrix into memory (if
>                                          you have enough!)
>   --help (-h)                            Print out help
>   --tempDir tempDir                      Intermediate output directory
>   --startPhase startPhase                First phase to run
>   --endPhase endPhase                    Last phase to run
>
> On 9/14/10 6:55 AM, Jake Mannix wrote:
>> I guess the main thing I'd want to happen in combining EVJ and DLS is to
>> make sure that the final output (changing the semantics of the CLI param is
>> ok) is clear, with it either being the output of EVJ (if that is used), or
>> DLS (if EVJ is not used). If that can be done, go for it!
>>
>> -jake
>>
>> On Tue, Sep 14, 2010 at 6:30 AM, Jeff Eastman <[email protected]> wrote:
>>
>>> Jake, I see you are on line. I'm inclined to push forward on this despite
>>> the adjustments to DLS --output semantics. Agreed?
>>>
>>>
>>> On 9/13/10 10:34 AM, Jeff Eastman wrote:
>>>
>>>> r996599 completed the first part. Several additional arguments to EVJ.run
>>>> need to be added to DLS (maxError, minEigenValue, inMemory, also the
>>>> --cleansvd flag itself). Also DLS interprets --output as the
>>>> outputEigenVectorPath and not as the generic output directory so DLS.run()
>>>> will need another argument too. Still want to do this?
>>>>
>>>> On 9/12/10 2:19 PM, Jake Mannix wrote:
>>>>
>>>>>> +1 on folding EigenVerificationJob into DistributedLanczosSolver. Or, at
>>>>>> least implement a job() method on EVJ.
>>>>>>
>>>>> +1 for having the latter, with a boolean flag in DLS to optionally call
>>>>> EVJ after it's done.
>>>>>
>>>>
>

--------------------------
Grant Ingersoll
http://lucenerevolution.org
Apache Lucene/Solr Conference, Boston Oct 7-8
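
For the wiki page, the option list quoted above could be illustrated with a small helper that assembles a `mahout svd` argument list. This is only a sketch under stated assumptions: the function name `build_svd_args`, the example paths, and the `true`/`false` spelling of the flag values are hypothetical and not taken from the thread; only the option names come from Jeff's listing. It builds the list and does not invoke Mahout.

```python
def build_svd_args(input_dir, output_dir, num_rows, num_cols, rank,
                   symmetric=False, clean_svd=False,
                   max_error=None, min_eigenvalue=None, in_memory=False):
    """Assemble an argument list for `mahout svd` (hypothetical helper).

    Option names follow the listing in the thread. The "true"/"false"
    value spelling is an assumption; check TestDistributedLanczosSolverCLI
    for the forms actually accepted.
    """
    args = [
        "svd",
        "--input", input_dir,
        "--output", output_dir,
        "--numRows", str(num_rows),
        "--numCols", str(num_cols),
        "--rank", str(rank),
        "--symmetric", str(symmetric).lower(),
    ]
    if clean_svd:
        # Assumption: the verification options are only passed along when
        # the EigenVerificationJob clean-up step is requested.
        args += ["--cleansvd", "true"]
        if max_error is not None:
            args += ["--maxError", str(max_error)]
        if min_eigenvalue is not None:
            args += ["--minEigenvalue", str(min_eigenvalue)]
        if in_memory:
            args += ["--inMemory", "true"]
    return args


if __name__ == "__main__":
    # Example paths are placeholders.
    example = build_svd_args("/tmp/matrix", "/tmp/svd-out", 10, 5, 10,
                             clean_svd=True, max_error=0.05)
    print(" ".join(["mahout"] + example))
```

Calling the helper without `clean_svd=True` yields the plain Lanczos form; with it, the list also carries the clean-up options, which matches the "both forms" Jeff mentions.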
