Github user carsonwang commented on the issue:
https://github.com/apache/spark/pull/21754
    This LGTM as a fix. However, ideally we should also support reusing an
exchange across different joins. There is no need to shuffle-write the same
table twice; we just need to read it differently. For example, in one stage a
reducer may read partitions 0 to 2, while in another stage a reducer may read
partitions 0 to 3. We just need a different partitionStartIndices to form a
different ShuffledRowRDD, and then we can reuse the Exchange. I believe I have
addressed this in my new implementation of adaptive execution, @cloud-fan,
let's pay attention to it when reviewing that PR.