Github user mengxr commented on a diff in the pull request:

    https://github.com/apache/spark/pull/5081#discussion_r26664038
  
    --- Diff: mllib/src/main/scala/org/apache/spark/mllib/linalg/Matrices.scala ---
    @@ -100,6 +100,50 @@ sealed trait Matrix extends Serializable {
        *          corresponding value in the matrix with type `Double`.
        */
       private[spark] def foreachActive(f: (Int, Int, Double) => Unit)
    +
    +  override def hashCode(): Int = {
    +    var result: Int = numRows * numCols + 31
    +    this.foreachActive { case (rowInd, colInd, value) =>
    +      // ignore explicit 0 for comparison between sparse and dense
    +      if (value != 0) {
    +        result = 31 * result + rowInd + (numRows * colInd)
    +        // refer to {@link java.util.Arrays.equals} for hash algorithm
    +        val bits = java.lang.Double.doubleToLongBits(value)
    +        result = 31 * result + (bits ^ (bits >>> 32)).toInt
    +      }
    +    }
    +    result
    +  }
    +
    +  override def equals(other: Any): Boolean = {
    +    other match {
    +      case mat: Matrix =>
    +        if (mat.numRows != this.numRows || mat.numCols != this.numCols) return false
    +        (this, mat) match {
    +          case (dm1: DenseMatrix, dm2: DenseMatrix) =>
    +            Arrays.equals(dm1.toArray, dm2.toArray)
    +          case (sm1: SparseMatrix, sm2: SparseMatrix) =>
    +            // For the case in which one matrix is CSC and the other is CSR
    +            // the values, colPtrs and rowIndices need not be the same.
    +            // When both matrices are of the same type, it is sufficient to check that
    +            // the values, colPtrs and rowIndices are the same.
    +            if (sm1.isTransposed != sm2.isTransposed) {
    +              if (sm1.values.length != sm2.values.length) return false
    +              sm1.foreachActive {
    +                case (i, j, value) => if (value != sm2(i, j)) return false
    +              }
    +            } else {
    +                if (sm1.values != sm2.values) return false
    --- End diff --
    
    This is not correct because one matrix may contain explicit zeros, in which
    case the `values` arrays differ even though the matrices are equal. Please
    include this case in the unit test.
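
The explicit-zero case can be illustrated with a minimal sketch that does not depend on Spark. `Csc`, `nonZeros`, and `sameMatrix` are hypothetical names introduced only for this illustration; they stand in for a column-major sparse matrix and are not part of the MLlib API or of this pull request. The sketch assumes both matrices use the same (non-transposed) CSC layout.

```scala
object ExplicitZeroDemo {
  // Hypothetical minimal CSC matrix, standing in for a sparse matrix type.
  final case class Csc(
      numRows: Int,
      numCols: Int,
      colPtrs: Array[Int],
      rowIndices: Array[Int],
      values: Array[Double]) {

    // Stored entries keyed by (row, col), with explicit zeros dropped.
    def nonZeros: Map[(Int, Int), Double] = {
      val entries = for {
        j <- 0 until numCols
        k <- colPtrs(j) until colPtrs(j + 1)
        if values(k) != 0.0
      } yield ((rowIndices(k), j), values(k))
      entries.toMap
    }
  }

  // Two matrices are equal when their shapes and nonzero entries agree.
  def sameMatrix(a: Csc, b: Csc): Boolean =
    a.numRows == b.numRows && a.numCols == b.numCols && a.nonZeros == b.nonZeros

  // The 2 x 2 matrix [[1, 0], [0, 2]] with only its nonzeros stored.
  val a = Csc(2, 2, Array(0, 1, 2), Array(0, 1), Array(1.0, 2.0))
  // The same matrix, but with an explicit zero stored at (1, 0).
  val b = Csc(2, 2, Array(0, 2, 3), Array(0, 1, 1), Array(1.0, 0.0, 2.0))

  // Comparing the raw storage reports a difference...
  val arraysMatch: Boolean = java.util.Arrays.equals(a.values, b.values)
  // ...while comparing the nonzero entries does not.
  val matricesMatch: Boolean = sameMatrix(a, b)
}
```

Here `arraysMatch` is false while `matricesMatch` is true, which is the situation a unit test for `equals` would need to cover. Building a `Map` is chosen for clarity rather than speed; an implementation in the library would more likely walk the stored entries directly.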

