Github user brkyvz commented on a diff in the pull request:
https://github.com/apache/spark/pull/2451#discussion_r17765187
--- Diff: mllib/src/main/scala/org/apache/spark/mllib/linalg/BLAS.scala ---
@@ -197,4 +201,452 @@ private[mllib] object BLAS extends Serializable {
throw new IllegalArgumentException(s"scal doesn't support vector type ${x.getClass}.")
}
}
+
+ // For level-3 routines, we use the native BLAS.
+ private def nativeBLAS: NetlibBLAS = {
+ if (_nativeBLAS == null) {
+ _nativeBLAS = NativeBLAS
+ }
+ _nativeBLAS
+ }
+
+ /**
+ * C := alpha * A * B + beta * C
+ * @param transA whether to use the transpose of matrix A (true), or A itself (false).
+ * @param transB whether to use the transpose of matrix B (true), or B itself (false).
+ * @param alpha a scalar to scale the multiplication A * B.
+ * @param A the matrix A that will be left multiplied to B. Size of m x k.
+ * @param B the matrix B that will be left multiplied by A. Size of k x n.
+ * @param beta a scalar that can be used to scale matrix C.
+ * @param C the resulting matrix C. Size of m x n.
+ */
+ def gemm(
+ transA: Boolean,
+ transB: Boolean,
+ alpha: Double,
+ A: Matrix,
+ B: Matrix,
+ beta: Double,
+ C: DenseMatrix): Unit = {
+ if (alpha == 0.0) {
+ logDebug("gemm: alpha is equal to 0. Returning C.")
+ } else {
+ A match {
+ case sparse: SparseMatrix =>
+ B match {
+ case dB: DenseMatrix => gemm(transA, transB, alpha, sparse, dB, beta, C)
+ case sB: SparseMatrix =>
+ throw new IllegalArgumentException(s"gemm doesn't support sparse-sparse matrix " +
+ s"multiplication")
+ case _ =>
+ throw new IllegalArgumentException(s"gemm doesn't support matrix type ${B.getClass}.")
+ }
+ case dense: DenseMatrix =>
+ B match {
+ case dB: DenseMatrix => gemm(transA, transB, alpha, dense, dB, beta, C)
+ case sB: SparseMatrix => gemm(transA, transB, alpha, dense, sB, beta, C)
+ case _ =>
+ throw new IllegalArgumentException(s"gemm doesn't support matrix type ${B.getClass}.")
+ }
+ case _ =>
+ throw new IllegalArgumentException(s"gemm doesn't support matrix type ${A.getClass}.")
+ }
+ }
+ }
+
+ /**
+ * C := alpha * A * B + beta * C
+ *
+ * @param alpha a scalar to scale the multiplication A * B.
+ * @param A the matrix A that will be left multiplied to B. Size of m x k.
+ * @param B the matrix B that will be left multiplied by A. Size of k x n.
+ * @param beta a scalar that can be used to scale matrix C.
+ * @param C the resulting matrix C. Size of m x n.
+ */
+ def gemm(
+ alpha: Double,
+ A: Matrix,
+ B: Matrix,
+ beta: Double,
+ C: DenseMatrix): Unit = {
+ gemm(false, false, alpha, A, B, beta, C)
+ }
+
+ /**
+ * C := alpha * A * B + beta * C
+ * For `DenseMatrix` A.
+ */
+ private def gemm(
+ transA: Boolean,
+ transB: Boolean,
+ alpha: Double,
+ A: DenseMatrix,
+ B: DenseMatrix,
+ beta: Double,
+ C: DenseMatrix): Unit = {
+ val mA: Int = if (!transA) A.numRows else A.numCols
+ val nB: Int = if (!transB) B.numCols else B.numRows
+ val kA: Int = if (!transA) A.numCols else A.numRows
+ val kB: Int = if (!transB) B.numRows else B.numCols
+ val tAstr = if (!transA) "N" else "T"
+ val tBstr = if (!transB) "N" else "T"
+
+ require(kA == kB, s"The columns of A don't match the rows of B. A: $kA, B: $kB")
+ require(mA == C.numRows, s"The rows of C don't match the rows of A. C: ${C.numRows}, A: $mA")
+ require(nB == C.numCols,
+ s"The columns of C don't match the columns of B. C: ${C.numCols}, A:
$nB")
+
+ nativeBLAS.dgemm(tAstr, tBstr, mA, nB, kA, alpha, A.values, A.numRows, B.values, B.numRows,
+ beta, C.values, C.numRows)
+ }
+
+ /**
+ * C := alpha * A * B + beta * C
+ * For `SparseMatrix` A.
+ */
+ private def gemm(
+ transA: Boolean,
+ transB: Boolean,
+ alpha: Double,
+ A: SparseMatrix,
+ B: DenseMatrix,
+ beta: Double,
+ C: DenseMatrix): Unit = {
+ val mA: Int = if (!transA) A.numRows else A.numCols
+ val nB: Int = if (!transB) B.numCols else B.numRows
+ val kA: Int = if (!transA) A.numCols else A.numRows
+ val kB: Int = if (!transB) B.numRows else B.numCols
+
+ require(kA == kB, s"The columns of A don't match the rows of B. A: $kA, B: $kB")
+ require(mA == C.numRows, s"The rows of C don't match the rows of A. C: ${C.numRows}, A: $mA")
+ require(nB == C.numCols,
+ s"The columns of C don't match the columns of B. C: ${C.numCols}, A:
$nB")
+
+ val Avals = A.values
+ val Arows = if (!transA) A.rowIndices else A.colPtrs
+ val Acols = if (!transA) A.colPtrs else A.rowIndices
+
+ // Slicing is easy in this case. This is the optimal multiplication setting for sparse matrices
+ if (transA){
+ var colCounterForB = 0
+ if (!transB) { // Expensive to put the check inside the loop
+ while (colCounterForB < nB) {
+ var rowCounterForA = 0
+ val Cstart = colCounterForB * mA
+ val Bstart = colCounterForB * kA
+ while (rowCounterForA < mA) {
+ var i = Arows(rowCounterForA)
+ val indEnd = Arows(rowCounterForA + 1)
+ var sum = 0.0
+ while (i < indEnd) {
+ sum += Avals(i) * B.values(Bstart + Acols(i))
+ i += 1
+ }
+ val Cindex = Cstart + rowCounterForA
+ C.values(Cindex) = beta * C.values(Cindex) + sum * alpha
+ rowCounterForA += 1
+ }
+ colCounterForB += 1
+ }
+ } else {
+ while (colCounterForB < nB) {
+ var rowCounter = 0
+ val Cstart = colCounterForB * mA
+ while (rowCounter < mA) {
+ var i = Arows(rowCounter)
+ val indEnd = Arows(rowCounter + 1)
+ var sum = 0.0
+ while (i < indEnd) {
+ sum += Avals(i) * B(colCounterForB, Acols(i))
+ i += 1
+ }
+ val Cindex = Cstart + rowCounter
+ C.values(Cindex) = beta * C.values(Cindex) + sum * alpha
+ rowCounter += 1
+ }
+ colCounterForB += 1
+ }
+ }
+ } else {
+ // Scale matrix first if `beta` is not equal to 0.0
--- End diff --
The matrix multiplication algorithms are different for the two cases. If A is transposed, the
multiplication is very easy: A is in CSC format, so when you multiply something with A', the
column indices of each row of A' come for free. Then it is just your high-school matrix
multiplication, restricted to the non-zero elements. Notice that each value of C is written
only once here, so you can apply the beta scaling inside the loop.
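Very roughly, the transposed case reduces to the stripped-down sketch below (not the code from
this PR; the name gemmTransA and the raw-array parameters are made up for illustration). Each
entry of C is one dot product between a column of A, i.e. a row of A', and a column of B, so
beta is folded into that single write:

```scala
// Hypothetical helper, transposed case only: C := alpha * A' * B + beta * C,
// with A in CSC form (colPtrs / rowIndices / values) and B, C column-major.
object TransposedGemmSketch {
  def gemmTransA(
      alpha: Double, beta: Double,
      numColsA: Int,                     // columns of A = rows of A' = rows of C
      colPtrsA: Array[Int], rowIndicesA: Array[Int], valuesA: Array[Double],
      b: Array[Double], numRowsB: Int, numColsB: Int,
      c: Array[Double]): Unit = {
    var j = 0
    while (j < numColsB) {               // for each column of B (and of C)
      val bStart = j * numRowsB
      val cStart = j * numColsA
      var i = 0
      while (i < numColsA) {             // row i of A' is column i of A; CSC gives its indices directly
        var p = colPtrsA(i)
        val end = colPtrsA(i + 1)
        var sum = 0.0
        while (p < end) {                // dot product over the non-zeros only
          sum += valuesA(p) * b(bStart + rowIndicesA(p))
          p += 1
        }
        // C(i, j) is written exactly once, so beta can be applied right here.
        c(cStart + i) = beta * c(cStart + i) + alpha * sum
        i += 1
      }
      j += 1
    }
  }
}
```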
When A is not transposed, you multiply each column of A with the corresponding row of B and
add the result to C. Every such column-times-row product has the same size as C, so the values
of C end up being updated multiple times inside the loop. Therefore, if you want an initial
scaling of C by beta, you have to put it before the loop.
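For comparison, here is a similarly simplified sketch of the non-transposed case (again
illustrative only, with made-up names and raw-array parameters): C is scaled by beta once up
front, and the loops afterwards only accumulate into it:

```scala
// Hypothetical helper, non-transposed case: C := alpha * A * B + beta * C,
// with A in CSC form. Each column k of A is scattered into C once per column of B,
// so C entries are touched many times and beta must be applied before the loops.
object GemmSketch {
  def gemmNoTrans(
      alpha: Double, beta: Double,
      numRowsA: Int, numColsA: Int,
      colPtrsA: Array[Int], rowIndicesA: Array[Int], valuesA: Array[Double],
      b: Array[Double], numColsB: Int,   // B is numColsA x numColsB, column-major
      c: Array[Double]): Unit = {        // C is numRowsA x numColsB, column-major
    // Scale C up front; the loops below only add to it.
    var idx = 0
    while (idx < c.length) { c(idx) *= beta; idx += 1 }
    var j = 0
    while (j < numColsB) {
      val cStart = j * numRowsA
      var k = 0
      while (k < numColsA) {             // column k of A pairs with B(k, j)
        val scaledB = alpha * b(j * numColsA + k)
        var p = colPtrsA(k)
        val end = colPtrsA(k + 1)
        while (p < end) {
          // The same C entry can be hit for several k, hence += and no beta here.
          c(cStart + rowIndicesA(p)) += valuesA(p) * scaledB
          p += 1
        }
        k += 1
      }
      j += 1
    }
  }
}
```

In both sketches the CSC arrays play the same role as A.colPtrs, A.rowIndices, and A.values in
the diff above.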