Github user jiangxb1987 commented on the issue:
https://github.com/apache/spark/pull/21698
IIUC the output produced by `rdd1.zip(rdd2).map(v => (computeKey(v._1,
v._2), computeValue(v._1, v._2)))` shall always have the same cardinality, no
matter how many tasks are retried, so where is the data loss issue?
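
A minimal sketch of the cardinality argument, using plain Scala collections as a stand-in for RDDs (the `computeKey`/`computeValue` bodies here are hypothetical placeholders, not from the PR):

```scala
object ZipCardinality {
  // Hypothetical stand-ins for the comment's computeKey/computeValue.
  def computeKey(a: Int, b: Int): Int = a + b
  def computeValue(a: Int, b: Int): Int = a * b

  def main(args: Array[String]): Unit = {
    val rdd1 = List(1, 2, 3)
    val rdd2 = List(10, 20, 30)
    // Mirrors rdd1.zip(rdd2).map(v => (computeKey(v._1, v._2), computeValue(v._1, v._2))).
    val out = rdd1.zip(rdd2).map(v => (computeKey(v._1, v._2), computeValue(v._1, v._2)))
    // zip pairs elements positionally and map is one-to-one, so the output
    // always has exactly as many elements as the zipped input.
    assert(out.size == rdd1.size)
    println(out)
  }
}
```

Even if the values produced per element differ across task retries (when the zipped inputs are nondeterministic), the element count itself is fixed by the zip, which is the point of the question above.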
