Github user yanboliang commented on the pull request: https://github.com/apache/spark/pull/7177#issuecomment-118370370 @mengxr @jkbradley Yes, the cluster assignments are deterministic up to numerical differences. The current tests are already deterministic, just like the test case in Scala. A random seed parameter is not needed: if the test data is sufficient, the result will be deterministic. The fixed doctest is similar to the one for [GaussianMixture](https://github.com/apache/spark/blob/master/python/pyspark/mllib/clustering.py#L178).
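The claim above (with sufficient, well-separated test data, cluster assignments come out the same regardless of the random seed) can be illustrated with a minimal sketch. This is not Spark's implementation, just a plain Lloyd's k-means iteration in NumPy on two well-separated clusters, initialized with two different seeds:

```python
import numpy as np

def kmeans(data, k, seed, iters=20):
    """Plain Lloyd's k-means; centers initialized from a seeded RNG."""
    rng = np.random.default_rng(seed)
    # initialize centers by sampling k distinct data points
    centers = data[rng.choice(len(data), size=k, replace=False)]
    for _ in range(iters):
        # assign each point to its nearest center
        labels = np.argmin(((data[:, None] - centers[None]) ** 2).sum(-1), axis=1)
        # recompute each center as the mean of its assigned points
        centers = np.array([data[labels == j].mean(axis=0) for j in range(k)])
    return labels

# two well-separated clusters of three points each
data = np.array([[0.0, 0.0], [0.1, 0.0], [0.0, 0.1],
                 [9.0, 9.0], [9.1, 9.0], [9.0, 9.1]])

a = kmeans(data, 2, seed=0)
b = kmeans(data, 2, seed=42)
# compare the two partitions, ignoring the arbitrary 0/1 label permutation
same = bool((a == b).all() or (a == 1 - b).all())
print(same)
```

Because the clusters are well separated, any initialization converges to the same partition, which is why such a doctest can assert on the assignments without fixing a seed.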