Github user WeichenXu123 commented on a diff in the pull request:

    https://github.com/apache/spark/pull/17968#discussion_r143465176
  
    --- Diff: python/pyspark/mllib/linalg/__init__.py ---
    @@ -1131,14 +1131,21 @@ def __getitem__(self, indices):
                 return self.values[i + j * self.numRows]
     
         def __eq__(self, other):
    +        def _eq(self, other):
    +            self_values = np.ravel(self.toArray(), order='F')
    +            other_values = np.ravel(other.toArray(), order='F')
    +
    +            return np.all(self_values == other_values)
    +
    +        if isinstance(other, SparseMatrix):
    +            return _eq(self, other)
    --- End diff --
    
    This implementation is incorrect here I think.
    You flatten 2D-arrays for both side, and then compare the two flattened 
array. it will result in `M*N` matrix equals `N*M` matrix when their values are 
the same.
    
    So you need move the judgement of
    ```
    if  (self.numRows != other.numRows or                                 
          self.numCols != other.numCols):
         return False
    ```
    into front.
    Thanks!


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org

Reply via email to