GitHub user mengxr commented on the pull request:
https://github.com/apache/spark/pull/296#issuecomment-39881873
Detecting empty rows requires a join, which is quite expensive. Also, adding
empty rows will hurt performance if there are many of them. I believe that in
most cases, a user who computes covariance on an `IndexedRowMatrix` means the
covariance of the observed (non-empty) rows.
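
To make the difference concrete, here is a small sketch in plain Python (not Spark code) comparing the covariance of only the observed rows with the covariance after padding missing indices with zero rows. The data, the `covariance` helper, and `num_rows` are hypothetical, and the sample (n - 1) denominator is an assumption for illustration.

```python
def covariance(rows):
    """Sample covariance matrix (n - 1 denominator) of equal-length rows."""
    n = len(rows)
    d = len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(d)]
    return [
        [
            sum((r[i] - means[i]) * (r[j] - means[j]) for r in rows) / (n - 1)
            for j in range(d)
        ]
        for i in range(d)
    ]


# Hypothetical indexed rows: only indices 0, 3 and 4 carry data.
observed = {0: [1.0, 2.0], 3: [3.0, 6.0], 4: [5.0, 4.0]}
num_rows = 5  # assumed logical row count; indices 1 and 2 are empty

# Covariance over the observed rows only.
cov_observed = covariance(list(observed.values()))

# Covariance after filling the empty indices with zero rows.
padded = [observed.get(i, [0.0, 0.0]) for i in range(num_rows)]
cov_padded = covariance(padded)

print(cov_observed)
print(cov_padded)
```

The two results differ because the zero rows shift the column means and change the row count, so padding is not a neutral operation on top of its cost.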