Hi all,

I recently ran Mahout's SVD on a large text corpus, following the helpful
example written here: https://cwiki.apache.org/MAHOUT/dimensionalreduction.html

Just a few questions about how I should best interpret the output:

* I chose to calculate 200 singular vectors. As the driver was finishing up, it
printed out the eigenvalues, and I was surprised to see them in ascending order.
The first singular vector had an eigenvalue of zero, there was an elbow at
around dimension 180, and a sharp incline towards an eigenvalue of 1.0 at
dimension 199. I was expecting these to be in descending order. Did I do
something wrong?

* Usually when choosing the number of dimensions I'd chop off at the elbow, but 
cleansvd seems to have a number of more specific options.  Assuming my first 
run has gone correctly, are there rules of thumb I should follow for picking 
the min eigenvalue and max error?
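
To make the second question concrete, here is a rough Python sketch of the
elbow cut I have in mind. It is not Mahout code: the function names are made
up for illustration, and the "farthest from the chord" rule is just one common
heuristic for locating an elbow, assumed here rather than taken from cleansvd.

```python
def sort_descending(eigenvalues):
    """Reorder eigenvalues from largest to smallest (the driver printed them ascending)."""
    return sorted(eigenvalues, reverse=True)


def elbow_index(values):
    """Index of the elbow in a descending list of eigenvalues.

    Both axes are rescaled to [0, 1], so the chord joining the first and
    last points is the line x + y = 1. The elbow is taken to be the point
    farthest from that chord. Hypothetical helper, not part of Mahout.
    """
    n = len(values)
    hi, lo = max(values), min(values)
    if n < 3 or hi == lo:
        return 0  # too few points or a flat spectrum: no elbow to find
    best_i, best_d = 0, -1.0
    for i, v in enumerate(values):
        x = i / (n - 1)
        y = (v - lo) / (hi - lo)
        d = abs(x + y - 1.0)  # proportional to the distance from the chord
        if d > best_d:
            best_i, best_d = i, d
    return best_i


def components_to_keep(eigenvalues):
    """Number of leading components up to and including the elbow."""
    return elbow_index(sort_descending(eigenvalues)) + 1
```

With my output I would pass the 200 printed eigenvalues to
components_to_keep() and keep that many of the largest ones. What I can't
tell is how a cut like this should map onto the min eigenvalue and max error
options.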

Thanks,

Erik
