GitHub user viirya commented on the issue:
https://github.com/apache/spark/pull/17459
> I considered having toBlockMatrix check if the rows of IndexedRowMatrix
> were dense or sparse, but there is no guarantee of consistency. Like, an
> IndexedRowMatrix could be a mix of Dense and Sparse Vectors. In that case, it
> would not be clear what type of BlockMatrix to create. A decent approximation
> of this would be to just decide the matrix type based on the first vector we
> look at in the iterator we get from groupByKey, creating a mix of Dense and
> Sparse matrices in a BlockMatrix, but I still think it's best to be explicit.
> Also, we currently have the description of toBlockMatrix promising to make a
> BlockMatrix backed by instances of SparseMatrix, so we have made promises to
> users about the composition of the BlockMatrix before.
I don't mean that we don't care about it. I meant that there is no guarantee
that a `BlockMatrix` consists purely of `DenseMatrix` or purely of
`SparseMatrix` blocks. It could be a mix of them.
Thus, we can have a `toBlockMatrix` that creates a `BlockMatrix` containing
a mix of `DenseMatrix` and `SparseMatrix` blocks. Each block in a `BlockMatrix`
can be either a `DenseMatrix` or a `SparseMatrix`, depending on the proportion
of nonzero values in that block. Yes, this is like the "decent approximation"
you mentioned.
For an `IndexedRowMatrix` consisting entirely of `DenseVector` rows, this
`toBlockMatrix` definitely returns a `BlockMatrix` backed by `DenseMatrix`. In
other cases, `DenseMatrix` might not be the best choice for every block in the
`BlockMatrix`, as many blocks will be sparse.
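
To make the per-block idea concrete, here is a minimal sketch in plain Python. It is not Spark code and not the implementation in this PR. The function name `to_blocks`, the dict-based row format, and the `DENSITY_THRESHOLD` cutoff of 0.5 are all assumptions made for illustration. It groups row entries into blocks and then chooses dense or sparse storage for each block from its fraction of nonzero entries.

```python
from collections import defaultdict

# Hypothetical cutoff (an assumption, not a value taken from Spark): a block
# whose fraction of nonzero entries exceeds this is stored densely.
DENSITY_THRESHOLD = 0.5


def to_blocks(rows, num_cols, rows_per_block, cols_per_block):
    """Group indexed rows into blocks and pick a storage format per block.

    rows: iterable of (row_index, {col_index: value}) pairs.
    Returns {(block_row, block_col): (kind, data)}, where kind is "dense"
    (data is a list of lists) or "sparse" (data is a {(i, j): value} dict
    in block-local coordinates). Blocks with no nonzero entries are omitted.
    """
    entries = defaultdict(dict)
    num_rows = 0
    for row_index, values in rows:
        num_rows = max(num_rows, row_index + 1)
        for col_index, value in values.items():
            if value != 0:
                key = (row_index // rows_per_block, col_index // cols_per_block)
                local = (row_index % rows_per_block, col_index % cols_per_block)
                entries[key][local] = value

    blocks = {}
    for (br, bc), nonzeros in entries.items():
        # Edge blocks may be smaller than the nominal block size.
        height = min(rows_per_block, num_rows - br * rows_per_block)
        width = min(cols_per_block, num_cols - bc * cols_per_block)
        density = len(nonzeros) / (height * width)
        if density > DENSITY_THRESHOLD:
            dense = [[nonzeros.get((i, j), 0.0) for j in range(width)]
                     for i in range(height)]
            blocks[(br, bc)] = ("dense", dense)
        else:
            blocks[(br, bc)] = ("sparse", dict(nonzeros))
    return blocks


# Two rows, four columns, 2x2 blocks: the left block is full, the right
# block holds a single nonzero entry.
rows = [
    (0, {0: 1.0, 1: 2.0}),
    (1, {0: 3.0, 1: 4.0, 3: 5.0}),
]
blocks = to_blocks(rows, num_cols=4, rows_per_block=2, cols_per_block=2)
```

In this example the left block comes out as "dense" and the right block as "sparse", so one result holds both kinds of block. Where the cutoff should sit is a separate question that this sketch does not try to answer.
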
As for the promise that `toBlockMatrix` makes a `BlockMatrix` backed by
instances of `SparseMatrix`: as I said, it is not explicitly bound at the API
level, so I don't think it is a big problem.