Sorry for the basic question. I've been reading about this for a few hours, but I'm still confused. I want to use ssvd to reduce the dimensionality of some tf-idf vectors so that I can cluster the result.
Among many other things, I've read:

https://cwiki.apache.org/MAHOUT/dimensional-reduction.html

which states the process for svd is:

    bin/mahout svd (original -> svdOut)
    bin/mahout cleansvd ...
    bin/mahout transpose svdOut -> svdT
    bin/mahout transpose original -> originalT
    bin/mahout matrixmult originalT svdT -> newMatrix
    bin/mahout kmeans newMatrix

I know you don't need to do cleansvd with ssvd output. My main question is which of the three outputs of ssvd should I be transposing and multiplying with the original tfidf-matrix? I'm having trouble understanding the math that's going on.

ssvd outputs U, V, and sigma, and despite reading a bunch, I'm still confused on which of these outputs I should be using, and how. Could anyone spell it out for me?

Thanks for any help,
Matt
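
To make the math in question concrete, here is a minimal NumPy sketch. It is not Mahout code: the matrix A, its size, and the rank k are made-up assumptions for illustration. It relies only on the standard identity A = U * Sigma * V^T, from which projecting the rows of A onto the top-k right singular vectors (A * V_k) gives the same reduced rows as U_k * Sigma_k. Whether the ssvd output files correspond exactly to these names and orientations should be checked against the Mahout documentation.

```python
import numpy as np

# Hypothetical stand-in for a tf-idf matrix: 6 documents (rows) x 5 terms (columns).
rng = np.random.default_rng(0)
A = rng.random((6, 5))
k = 2  # assumed target dimensionality

# Thin SVD: A = U @ diag(s) @ Vt
U, s, Vt = np.linalg.svd(A, full_matrices=False)

# Keep only the top-k singular triplets.
U_k = U[:, :k]        # 6 x k
s_k = s[:k]           # k singular values
V_k = Vt[:k, :].T     # 5 x k

# Two equivalent ways to get k-dimensional document vectors:
reduced_via_V = A @ V_k     # project the original rows onto V_k
reduced_via_U = U_k * s_k   # scale the columns of U_k by the singular values

# Both give the same 6 x k matrix, up to floating-point error.
assert np.allclose(reduced_via_V, reduced_via_U)
```

In this sketch the rows of either reduced matrix are what would be handed to a clustering step such as k-means.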
