[GitHub] spark issue #21698: [SPARK-23243][Core] Fix RDD.repartition() data correctne...

squito Mon, 13 Aug 2018 19:42:02 -0700

GitHub user squito commented on the issue:

    https://github.com/apache/spark/pull/21698
  
    I also think @tgravescs's solution of using the HashPartitioner is an 
acceptable one, though as you've noted it doesn't deal with skew (which may 
account for a lot of the existing uses of `repartition()`).  I think we'd 
probably see a bunch of users complain that their jobs started crashing on 
upgrading to 2.4 if that's the best we can offer, but IMO a crash is way better 
than silent data loss.
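
To illustrate the skew concern, here is a toy sketch in plain Python. It is not Spark code: the function names and the sample keys are made up for illustration, and `hash(key) % n` only approximates how a hash partitioner picks a partition. It compares partition sizes when records are placed by key hash versus by position (round-robin) for a dataset dominated by one key.

```python
from collections import Counter

def hash_partition_sizes(keys, num_partitions):
    """Records per partition when each record goes to hash(key) % n."""
    counts = Counter(hash(k) % num_partitions for k in keys)
    return [counts[p] for p in range(num_partitions)]

def round_robin_sizes(keys, num_partitions):
    """Records per partition when record i goes to i % n (position-based)."""
    counts = Counter(i % num_partitions for i, _ in enumerate(keys))
    return [counts[p] for p in range(num_partitions)]

# Hypothetical skewed input: 90 records share key 7, 10 records have keys 0..9.
keys = [7] * 90 + list(range(10))

print(hash_partition_sizes(keys, 4))  # [3, 3, 2, 92]
print(round_robin_sizes(keys, 4))     # [25, 25, 25, 25]
```

Hash placement is stable for a given key regardless of input order, but every record with the hot key lands in the same partition. Position-based placement balances sizes, yet the result depends on the order in which records arrive.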


