GitHub user MechCoder commented on the pull request:
https://github.com/apache/spark/pull/8278#issuecomment-132213926
@jkbradley @mengxr
I am running into problems with hashing when two otherwise equal matrices
differ in isTransposed. For DenseMatrices this is fine, because hashing the
same 64 rowIndices, colIndices and values in either layout is not difficult
(https://github.com/apache/spark/pull/8278/files#diff-3c938d5dd2a742291f0a17a87f4f9b58R291).
However, hashing SparseMatrices so that equal matrices get equal hashes is
not that easy. We need to hash the same 64 rowIndices, colIndices and values
for the same SparseMatrix whether isTransposed is true or false, so when
isTransposed is true the pseudocode goes like this:
val sm = new SparseMatrix(3, 2, Array(0, 1, 3, 5), Array(1, 0, 1, 0, 1),
  Array(3.0, 1.0, 4.0, 2.0, 5.0), true)
# Set colInd to zero.
# Search for all indices of colInd in rowIndices (which is (1, 3), when colInd is 0)
# For ptr in these indices ((1, 3))
#   Scan colPtrs for the two consecutive entries between which ptr lies
#   The position of the lower entry in colPtrs gives the rowInd
#   Hash rowInd
#   Hash colInd
# Increment colInd and repeat
Only this would guarantee that the hash is equal for the transposed and
non-transposed cases.
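
The steps above can be sketched in plain Scala as follows. This is only an
illustration and not code from the PR: the object TransposedHashSketch and
the helpers entriesCsc, entriesCsr and hashEntries are hypothetical names,
the arrays are passed directly instead of going through SparseMatrix, the
hash-combining formula is an arbitrary choice, and the cap of 64 entries is
assumed from the number mentioned above. It also assumes the indices inside
each column (or row) are stored in ascending order.

```scala
// Hypothetical sketch, not part of Spark: walk both storage layouts in the
// same column-major order so that equal matrices hash identically.
object TransposedHashSketch {

  /** (row, col, value) triples in column-major order; isTransposed = false. */
  def entriesCsc(numCols: Int, colPtrs: Array[Int], rowIndices: Array[Int],
      values: Array[Double]): Seq[(Int, Int, Double)] =
    for {
      col <- 0 until numCols
      ptr <- colPtrs(col) until colPtrs(col + 1)
    } yield (rowIndices(ptr), col, values(ptr))

  /** The same triples in the same order; isTransposed = true, so the
    * pointer array indexes rows and the index array holds column indices. */
  def entriesCsr(numCols: Int, rowPtrs: Array[Int], colIndices: Array[Int],
      values: Array[Double]): Seq[(Int, Int, Double)] =
    for {
      col <- 0 until numCols
      // Every position in the index array that stores this column.
      ptr <- colIndices.indices if colIndices(ptr) == col
    } yield {
      // The row is the position of the last pointer that is <= ptr.
      val row = rowPtrs.lastIndexWhere(_ <= ptr)
      (row, col, values(ptr))
    }

  /** Combine at most maxEntries triples into one hash (assumed scheme). */
  def hashEntries(entries: Seq[(Int, Int, Double)], maxEntries: Int = 64): Int =
    entries.take(maxEntries).foldLeft(17) { case (h, (r, c, v)) =>
      31 * (31 * (31 * h + r) + c) + java.lang.Double.hashCode(v)
    }

  def main(args: Array[String]): Unit = {
    // The 3 x 2 example above, stored row-wise (isTransposed = true).
    val csr = entriesCsr(2, Array(0, 1, 3, 5), Array(1, 0, 1, 0, 1),
      Array(3.0, 1.0, 4.0, 2.0, 5.0))
    // The same matrix stored column-wise (isTransposed = false).
    val csc = entriesCsc(2, Array(0, 2, 5), Array(1, 2, 0, 1, 2),
      Array(1.0, 2.0, 3.0, 4.0, 5.0))
    assert(csr == csc)
    assert(hashEntries(csr) == hashEntries(csc))
  }
}
```

Note that entriesCsr rescans the whole index array once per column, so the
cost of this sketch grows with numCols times the number of stored entries.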
Do you have any other suggestions?