GitHub user johnc1231 commented on the issue:
https://github.com/apache/spark/pull/17459
I did some thinking about this. To make the API cleaner, we could deprecate
the regular `toBlockMatrix` method and add `toBlockMatrixSparse`. Until
`toBlockMatrix` is removed, it could simply call `toBlockMatrixSparse`. That
would be more explicit and make it clearer to users what kind of `BlockMatrix`
they're creating. After that, it should be easy enough for me to abstract out
the `DenseMatrix` creation step of `toBlockMatrixDense` and make it a
general-purpose helper method that both `toBlockMatrixDense` and
`toBlockMatrixSparse` can call to create the specific type of `BlockMatrix`
they want.
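
To make the shape of that refactor concrete, here is a rough sketch in plain Python. It is a toy model, not Spark code: rows are `(index, values)` tuples, a dense block is a list of lists, a sparse block is a dict of nonzero entries, and every name in it (`_to_blocks`, `to_block_matrix_dense`, and so on) is made up for illustration. Edge blocks are not trimmed, to keep it short.

```python
import warnings
from typing import Callable, Dict, List, Tuple

Row = Tuple[int, List[float]]    # (row index, row values)
Entry = Tuple[int, int, float]   # (row in block, col in block, value)
BlockKey = Tuple[int, int]       # (block row, block col)


def _to_blocks(rows: List[Row], rows_per_block: int, cols_per_block: int,
               make_block: Callable[[List[Entry], int, int], object]
               ) -> Dict[BlockKey, object]:
    """Shared helper: group entries by block, then let make_block build each one."""
    grouped: Dict[BlockKey, List[Entry]] = {}
    for i, values in rows:
        for j, v in enumerate(values):
            key = (i // rows_per_block, j // cols_per_block)
            grouped.setdefault(key, []).append(
                (i % rows_per_block, j % cols_per_block, v))
    return {key: make_block(entries, rows_per_block, cols_per_block)
            for key, entries in grouped.items()}


def _dense_block(entries: List[Entry], n_rows: int, n_cols: int) -> List[List[float]]:
    # Dense storage: every cell is materialized, zeros included.
    block = [[0.0] * n_cols for _ in range(n_rows)]
    for r, c, v in entries:
        block[r][c] = v
    return block


def _sparse_block(entries: List[Entry], n_rows: int, n_cols: int) -> Dict[Tuple[int, int], float]:
    # Sparse storage: only nonzero cells are kept.
    return {(r, c): v for r, c, v in entries if v != 0.0}


def to_block_matrix_dense(rows, rows_per_block, cols_per_block):
    return _to_blocks(rows, rows_per_block, cols_per_block, _dense_block)


def to_block_matrix_sparse(rows, rows_per_block, cols_per_block):
    return _to_blocks(rows, rows_per_block, cols_per_block, _sparse_block)


def to_block_matrix(rows, rows_per_block, cols_per_block):
    """Deprecated entry point: warns, then forwards to the sparse variant."""
    warnings.warn(
        "to_block_matrix is deprecated; call to_block_matrix_sparse "
        "or to_block_matrix_dense explicitly",
        DeprecationWarning,
        stacklevel=2,
    )
    return to_block_matrix_sparse(rows, rows_per_block, cols_per_block)
```

The only thing that differs between the two public methods is the block-building function passed to the helper, which is the part I'd expect to carry over to the real implementation.
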
I think this explicitness is important because a lot of users seem to create
a `BlockMatrix` through these methods rather than through a `BlockMatrix`
constructor, which is fairly hard to use. The official Spark docs also suggest
to users that calling `toBlockMatrix` is easier than attempting to use the
constructors:
https://spark.apache.org/docs/latest/mllib-data-types.html#blockmatrix
If we don't want to go the deprecation route, we could have `toBlockMatrix`
take an argument specifying whether the data is sparse or dense, but I think
that should be an explicitly required argument, since otherwise it's easy to
create the wrong kind of `BlockMatrix` by accident.
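
For the required-argument alternative, this is roughly what I have in mind, again as a plain-Python toy rather than Spark code. `BlockStorage` and `to_block` are hypothetical names, and using an enum instead of a boolean is just one possible choice. The point is that the argument is keyword-only with no default, so a caller cannot get a sparse or dense result without asking for it.

```python
from enum import Enum
from typing import Dict, List, Tuple, Union


class BlockStorage(Enum):
    DENSE = "dense"
    SPARSE = "sparse"


DenseBlock = List[List[float]]
SparseBlock = Dict[Tuple[int, int], float]


def to_block(values: List[List[float]], *,
             storage: BlockStorage) -> Union[DenseBlock, SparseBlock]:
    """Hypothetical converter; `storage` is keyword-only and has no default."""
    if storage is BlockStorage.DENSE:
        # Dense storage: copy every cell, zeros included.
        return [list(row) for row in values]
    if storage is BlockStorage.SPARSE:
        # Sparse storage: keep only the nonzero cells.
        return {(r, c): v
                for r, row in enumerate(values)
                for c, v in enumerate(row)
                if v != 0.0}
    raise ValueError(f"unknown storage: {storage!r}")
```

Calling `to_block(values)` without `storage=` fails with a `TypeError` instead of silently picking one representation.
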