Github user jkbradley commented on the pull request:
https://github.com/apache/spark/pull/3022#issuecomment-68312524
@tgaloppo Thanks for the updates, and thanks for all of your work in
getting this ready!
LGTM
CC: @mengxr
After this is merged, I'll make some JIRAs for the various items we've
discussed along the way, plus a few more. Let me know if I've missed anything here:
* Add parameters: seed, maxIterations
* Use sparse vectors more efficiently
* If numFeatures or k are large, distribute matrix inverses for Gaussian
initialization.
* Breeze pinv fails when the matrix is singular
  (https://github.com/scalanlp/breeze/issues/304). Use SVD instead.
* Make MultivariateGaussian public, and update GMM API
* Check for NaNs:
  * in computeSoftAssignments (if all pdfs = 0)
  * in values when constructing a GMM
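
To make the computeSoftAssignments item concrete, here is a minimal sketch of the failure mode and one possible guard. It is written in plain Python for illustration only and is not the MLlib code: the function name `soft_assignments`, the `eps` threshold, and the uniform fallback are all assumptions. On the JVM, 0.0 / 0.0 on doubles yields NaN, which is how the NaNs would show up when every pdf is 0.

```python
import math

def soft_assignments(weighted_pdfs, eps=1e-12):
    """Normalize weighted component densities into responsibilities.

    Hypothetical helper: if every density is (numerically) zero, the plain
    ratio p_i / sum(p) would be 0/0, so fall back to uniform weights.
    """
    # Reject NaN inputs up front so they cannot propagate silently.
    if any(math.isnan(p) for p in weighted_pdfs):
        raise ValueError("NaN in component densities")
    total = sum(weighted_pdfs)
    if total <= eps:
        # Assumed policy: treat the point as equally likely under each component.
        k = len(weighted_pdfs)
        return [1.0 / k] * k
    return [p / total for p in weighted_pdfs]
```

Whether a uniform fallback, an epsilon added to each pdf, or an explicit error is the right policy is open; the sketch only shows where the check would sit.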