[GitHub] spark issue #21698: [SPARK-23243][Core] Fix RDD.repartition() data correctne...

squito Fri, 10 Aug 2018 09:19:13 -0700

Github user squito commented on the issue:

    https://github.com/apache/spark/pull/21698
  
    > What if the user doesn't provide a distributed file system path? E.g., you 
can read from Kafka and write back to Kafka, and such workloads don't need 
a distributed file system in standalone mode.
    
    Yeah, that is a good point. I think we want a solution that is correct 
without checkpointing (e.g., always sort), but that can perhaps leverage 
checkpointing when possible to avoid the overhead of the sort.
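
    To illustrate the "always sort" idea, here is a minimal sketch in plain 
Python (not Spark code, and not the implementation in this PR). It assumes the 
problem is a round-robin distribution whose result depends on the order in which 
records arrive; `round_robin`, `first_attempt` and `retried_attempt` are 
hypothetical names made up for the example.

```python
from typing import Iterable, List


def round_robin(records: Iterable[int], num_partitions: int, sort_first: bool) -> List[List[int]]:
    """Distribute records across partitions in round-robin order.

    With sort_first=True the records are put into a canonical order first, so
    the result no longer depends on the order in which the input arrived.
    """
    items = sorted(records) if sort_first else list(records)
    partitions: List[List[int]] = [[] for _ in range(num_partitions)]
    for i, item in enumerate(items):
        partitions[i % num_partitions].append(item)
    return partitions


# Two attempts of the same (hypothetical) upstream task that produce the same
# records, but in a different order.
first_attempt = [3, 1, 4, 5, 9, 2]
retried_attempt = [1, 3, 4, 5, 9, 2]

# Without sorting, the two attempts assign records to different partitions.
unsorted_differs = round_robin(first_attempt, 2, False) != round_robin(retried_attempt, 2, False)

# With sorting, both attempts produce the same assignment.
sorted_matches = round_robin(first_attempt, 2, True) == round_robin(retried_attempt, 2, True)
```

    The sort costs extra work on every run, which is the overhead a 
checkpoint-based approach could avoid when a checkpoint location is available.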



