GitHub user rnowling commented on the pull request:
https://github.com/apache/spark/pull/1248#issuecomment-47818425
Sean,
I updated the code to factor out common bits into a KMeansCommons file,
using traits for both the objects and classes. I updated the KMeansMiniBatch
tests so that they are specific to KMeansMiniBatch, don't duplicate testing of
common code, and account for the algorithm's stochastic nature by comparing
errors against an epsilon instead of comparing the floats directly. I also
realized that I had failed to implement a key part of the MiniBatch algorithm,
so that is now included.
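For illustration only, a tolerance-based comparison of the kind described
above might look like the following sketch. The names ApproxCompare,
approxEqual, and epsilon are hypothetical and are not taken from the pull
request; the actual tests may structure the check differently.

```scala
// Hypothetical helper, not from the pull request: compares two vectors
// element-wise and accepts them as equal if every pair of coordinates
// differs by at most `epsilon` (an absolute tolerance).
object ApproxCompare {
  def approxEqual(a: Array[Double], b: Array[Double], epsilon: Double): Boolean =
    a.length == b.length &&
      a.zip(b).forall { case (x, y) => math.abs(x - y) <= epsilon }

  def main(args: Array[String]): Unit = {
    val expected = Array(1.0, 2.0)
    // Stand-in for a cluster center produced by a stochastic run.
    val observed = Array(1.0004, 1.9997)

    // Passes with a tolerance that covers the run-to-run noise...
    assert(approxEqual(expected, observed, epsilon = 1e-3))
    // ...and fails when the tolerance is tighter than the noise.
    assert(!approxEqual(expected, observed, epsilon = 1e-5))
  }
}
```

An absolute tolerance is the simplest choice here; a relative tolerance may
be more appropriate when the compared values vary widely in magnitude.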
Please review again.