Hi all,

I recently ran Mahout's SVD on a large text corpus, following the helpful example written here: https://cwiki.apache.org/MAHOUT/dimensionalreduction.html
Just a few questions about how I should best interpret the output:

* I chose to calculate 200 singular vectors - as the driver was finishing up it printed out the eigenvalues and I was surprised to see them in ascending order. The first singular vector had an eigenvalue of zero, there was an elbow at ~dimension 180, and a sharp incline towards an eigenvalue of 1.0 at dimension 199. I was expecting these to be in declining order. Did I do something wrong?

* Usually when choosing the number of dimensions I'd chop off at the elbow, but cleansvd seems to have a number of more specific options. Assuming my first run has gone correctly, are there rules of thumb I should follow for picking the min eigenvalue and max error?

Thanks,
Erik
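
P.S. To make "chop off at the elbow" concrete, here is a minimal sketch of what I mean, written in plain Python rather than anything Mahout-specific. The function name `elbow_index` is hypothetical and the eigenvalues are made-up illustrative numbers, not my actual output. It sorts the values into descending order (since the driver printed them ascending) and picks the point farthest below the straight line joining the largest and smallest values, which is only one of several possible elbow heuristics.

```python
def elbow_index(values):
    """Return the position of the elbow in the descending-sorted values.

    Heuristic (an assumption, not a Mahout feature): the elbow is the point
    with the largest vertical distance from the chord that joins the first
    and last values of the sorted spectrum.
    """
    desc = sorted(values, reverse=True)  # input order does not matter
    n = len(desc)
    if n < 3:
        return 0  # too few points to have an elbow
    first, last = desc[0], desc[-1]
    best_i, best_d = 0, -1.0
    for i, v in enumerate(desc):
        # Height of the chord at position i
        chord = first + (last - first) * i / (n - 1)
        d = abs(v - chord)
        if d > best_d:
            best_i, best_d = i, d
    return best_i


# Illustrative values only, listed in ascending order like the driver output
eigenvalues = [0.0, 0.01, 0.02, 0.03, 0.05, 0.4, 1.0]
k = elbow_index(eigenvalues)
kept = sorted(eigenvalues, reverse=True)[:k]  # values above the elbow
print(k, kept)
```

Whether a cutoff like this maps sensibly onto the min eigenvalue option of cleansvd is exactly what I'm unsure about.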
