Github user jkbradley commented on the pull request:

    https://github.com/apache/spark/pull/3022#issuecomment-68312524
  
    @tgaloppo  Thanks for the updates, and thanks for all of your work in 
getting this ready!
    
    LGTM
    
    CC: @mengxr 
    
    After this is merged, I'll make some JIRAs for the various items we've 
discussed along the way + a few more.  Let me know if I've missed anything here:
    * Add parameters: seed, maxIterations
    * Use sparse vectors more efficiently
    * If numFeatures or k are large, distribute matrix inverses for Gaussian 
initialization.
    * Breeze pinv fails when the matrix is singular: 
[https://github.com/scalanlp/breeze/issues/304]  Do SVD instead.
    * Make MultivariateGaussian public, and update GMM API
    * Check for NaNs:
     * in computeSoftAssignments (if all pdfs = 0)
     * in values when constructing a GMM
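
    To illustrate the NaN item for computeSoftAssignments: when every component pdf underflows to 0, the normalizing sum is 0 and the division yields NaN on the JVM (in Python it raises ZeroDivisionError instead). Below is a minimal sketch of one possible guard, not the code in this PR. The function name, its signature, and the choice to fall back to the mixing weights are all assumptions made for illustration.

```python
import math


def compute_soft_assignments(weights, pdfs):
    """Hypothetical helper: responsibilities r[i] = w[i]*pdf[i] / sum_j w[j]*pdf[j].

    `weights` are the mixing weights and `pdfs` the per-component density
    values for a single point. Both names are illustrative only.
    """
    if len(weights) != len(pdfs):
        raise ValueError("weights and pdfs must have the same length")

    weight_sum = math.fsum(weights)
    if not math.isfinite(weight_sum) or weight_sum <= 0.0:
        raise ValueError("mixing weights must have a positive, finite sum")

    weighted = [w * p for w, p in zip(weights, pdfs)]
    total = math.fsum(weighted)

    # Guard: if all pdfs are 0 (or the sum is not finite), dividing by
    # `total` would give NaN or an error. Assumed fallback: use the
    # normalized mixing weights, i.e. the prior over components.
    if not math.isfinite(total) or total <= 0.0:
        return [w / weight_sum for w in weights]

    return [x / total for x in weighted]
```

    Falling back to the prior is only one option. Adding a small epsilon to each pdf or working in log space would also avoid the zero sum.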



