Github user jiangxb1987 commented on the issue:

    https://github.com/apache/spark/pull/21698
  
    IIUC the output produced by `rdd1.zip(rdd2).map(v => (computeKey(v._1, 
v._2), computeValue(v._1, v._2)))` should always have the same cardinality as 
the inputs, no matter how many tasks are retried, so where is the data loss 
issue?
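
    For illustration, the cardinality point can be sketched with plain Scala 
collections standing in for RDD partitions (`computeKey` and `computeValue` 
below are hypothetical placeholders, not Spark APIs; this is a sketch of the 
structural argument, not the distributed behavior under retries):

    ```scala
    // Hypothetical stand-ins for the key/value functions in the snippet above.
    def computeKey(a: Int, b: Int): Int = a + b
    def computeValue(a: Int, b: Int): Int = a * b

    val rdd1 = Seq(1, 2, 3) // stand-in for the contents of one partition of rdd1
    val rdd2 = Seq(4, 5, 6) // stand-in for the matching partition of rdd2

    // zip pairs elements positionally and map is one-to-one, so the output
    // has exactly as many elements as the inputs.
    val out = rdd1.zip(rdd2).map { case (a, b) =>
      (computeKey(a, b), computeValue(a, b))
    }

    assert(out.size == rdd1.size)
    ```

    The open question is whether positional pairing stays stable when a task 
is recomputed, not whether the element count changes.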

