GitHub user mengxr commented on the issue:

    https://github.com/apache/spark/pull/22112
  
    Then it doesn't meet the requirements for those operations used by MLlib:
    * sampling
    * zipWithIndex, zipWithUniqueId
    * we also use zip, assuming the ordering from the source RDD is preserved, e.g.,
      https://github.com/apache/spark/blob/e50192494d1ae1bdaf845ddd388189998c1a2403/mllib/src/main/scala/org/apache/spark/mllib/clustering/KMeans.scala#L403
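
To illustrate why zip depends on the ordering being preserved, here is a minimal sketch in plain Python rather than Spark. Everything in it is an assumption made for illustration: `compute_partition` is a hypothetical stand-in for computing one partition, `attempt` models a recomputation (for example after a task retry) that returns the same records in a different order, and the squared value is a placeholder for any per-record result derived from the source.

```python
def compute_partition(records, attempt):
    """Hypothetical stand-in for computing one partition.

    The same records come back on every attempt, but rotated by
    `attempt` positions, modelling an order that is not stable
    across recomputations.
    """
    k = attempt % len(records)
    return records[k:] + records[:k]


def is_aligned(pairs):
    """True if every derived value still sits next to its own record."""
    return all(cost == point * point for point, cost in pairs)


points = [1.0, 2.0, 3.0, 4.0]

# First evaluation: derive one value per record from the source.
first = compute_partition(points, attempt=0)
costs = [p * p for p in first]

# Zipping against the same evaluation pairs each record with its own value.
aligned = list(zip(first, costs))

# If the source is recomputed and comes back in a different order,
# zipping by position silently pairs records with the wrong values.
recomputed = compute_partition(points, attempt=1)
misaligned = list(zip(recomputed, costs))

print(is_aligned(aligned))     # True
print(is_aligned(misaligned))  # False
```

The mismatch raises no error, since both sides still have the same number of elements. Position-based operations such as zipWithIndex would be affected in the same way, because the index assigned to a record depends on where it appears.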

