Yup, I'm hitting it with the included PySpark kmeans example (v0.8.1), so the code for reproducing it is simple. Note, though, that I only see it with a fairly large number of nodes (30 or more in our setup). You should be able to reproduce it by running KMeans on that many nodes against any fairly large data set with many iterations (e.g. 50GB, 20 iterations of kmeans, k=3).
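For anyone who wants the shape of the workload without opening the example, here is a rough single-machine sketch of what each k-means iteration does: assign every point to its nearest center, sum the points per center, then update the centers. This is not the bundled PySpark script and it uses no Spark at all; the names `closest_center` and `kmeans` and the fixed iteration count are my own assumptions for illustration. In the distributed job the same three steps run across the cluster once per iteration.

```python
import random


def closest_center(point, centers):
    """Return the index of the center nearest to point (squared Euclidean distance)."""
    best_index, best_dist = 0, float("inf")
    for i, center in enumerate(centers):
        dist = sum((a - b) ** 2 for a, b in zip(point, center))
        if dist < best_dist:
            best_index, best_dist = i, dist
    return best_index


def kmeans(points, k, iterations, seed=0):
    """Run a fixed number of k-means iterations and return the final centers."""
    rng = random.Random(seed)
    centers = rng.sample(points, k)  # pick k distinct points as starting centers
    for _ in range(iterations):
        # Assignment step: accumulate a running (vector sum, count) per center.
        sums = {}
        for p in points:
            idx = closest_center(p, centers)
            total, count = sums.get(idx, ([0.0] * len(p), 0))
            sums[idx] = ([t + x for t, x in zip(total, p)], count + 1)
        # Update step: move each center that received points to their mean.
        for idx, (total, count) in sums.items():
            centers[idx] = tuple(t / count for t in total)
    return centers


if __name__ == "__main__":
    # Tiny illustrative data set with two obvious clusters.
    pts = [(0.0, 0.0), (0.0, 1.0), (10.0, 10.0), (10.0, 11.0)]
    print(sorted(kmeans(pts, k=2, iterations=20)))
```

The part that matters for reproducing the stall is that every iteration repeats the full pass over the data, so 20 iterations means 20 rounds of the assign/sum/update cycle.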
Happy to try anything on our end to help debug...
