Github user rezazadeh commented on a diff in the pull request:
https://github.com/apache/spark/pull/964#discussion_r13900614
--- Diff: mllib/src/main/scala/org/apache/spark/mllib/linalg/distributed/RowMatrix.scala ---
@@ -220,16 +247,43 @@ class RowMatrix(
   }
   /**
-   * Computes the singular value decomposition of this matrix.
-   * Denote this matrix by A (m x n), this will compute matrices U, S, V such that A = U * S * V'.
+   * Computes the singular value decomposition of this matrix, using default tolerance (1e-9).
    *
-   * There is no restriction on m, but we require `n^2` doubles to fit in memory.
-   * Further, n should be less than m.
+   * @param k number of singular values to keep. We might return less than k if there are
+   *          numerically zero singular values. See rCond.
+   * @param computeU whether to compute U
+   * @param rCond the reciprocal condition number. All singular values smaller than rCond * sigma(0)
+   *              are treated as zero, where sigma(0) is the largest singular value.
+   * @return SingularValueDecomposition(U, s, V)
+   */
+  def computeSVD(
+      k: Int,
+      computeU: Boolean = false,
+      rCond: Double = 1e-9): SingularValueDecomposition[RowMatrix, Matrix] = {
+    if (numCols() < 100) {
--- End diff --
Please add a quick comment on why 100 is used; to an unfamiliar reader it
looks like a magic number.
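
One hypothetical way to address that (a sketch only; the names, placement, and
doc text are illustrative and not part of this PR) is to pull the threshold
into a named constant whose doc comment states the rationale, assuming the
intent of the branch is that for small n the n x n Gramian A^T A is cheap to
form and decompose locally on the driver:

    // Sketch only: the object name, constant name, and doc text are illustrative.
    object RowMatrixConstants {
      /**
       * Column-count threshold below which the n x n Gramian A^T A is small
       * (fewer than 100 * 100 = 10,000 doubles), so it can be formed and
       * decomposed locally on the driver rather than distributively.
       */
      val LocalGramianSVDThreshold = 100
    }

The branch would then read
`if (numCols() < RowMatrixConstants.LocalGramianSVDThreshold)`, or the constant
could simply live as a documented private val on RowMatrix.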
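
Separately, since the new scaladoc above documents a public API, here is a
minimal usage sketch of computeSVD as declared in the diff (assuming the
signature shown there and an existing SparkContext `sc`, e.g. in spark-shell;
not part of this PR):

    import org.apache.spark.mllib.linalg.Vectors
    import org.apache.spark.mllib.linalg.distributed.RowMatrix

    // Build a small distributed matrix from an RDD of dense rows.
    val rows = sc.parallelize(Seq(
      Vectors.dense(1.0, 2.0, 3.0),
      Vectors.dense(4.0, 5.0, 6.0),
      Vectors.dense(7.0, 8.0, 10.0)))
    val mat = new RowMatrix(rows)

    // Ask for the top 2 singular values and also compute U; fewer than k
    // values may be returned if some are numerically zero (see rCond).
    val svd = mat.computeSVD(2, computeU = true)
    val U = svd.U  // RowMatrix of left singular vectors (computeU = true)
    val s = svd.s  // singular values in descending order
    val V = svd.V  // local Matrix of right singular vectors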