[GitHub] spark issue #21927: [SPARK-24820][Core] Fail fast when submitted job contain...

squito Tue, 31 Jul 2018 14:41:17 -0700

Github user squito commented on the issue:

    https://github.com/apache/spark/pull/21927
  
    > Second thought: PartitionPruningRDD is just an implementation of RDD. 
Every user / developer can implement a similar one. Also this doesn't handle 
the case mentioned by @felixcheung : a.union(b).barrier(). So I'm thinking 
about checking number of partitions instead of instances of PartitionPruningRDD 
in this PR. Basically, we check the input RDD and all its parents have the same 
number of partitions. If not, we throw an error message like "Barrier execution 
mode doesn't support partition union / pruning.". Thoughts?
    
    Yeah, that's a good point, but what about `coalesce()`? That should 
actually work, shouldn't it? Maybe you'd add an exception for `CoalescedRDD`, 
or add another property such as `processAllInputPartitions`, or something ...
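
    To make the concern concrete, here is a minimal sketch of the proposed partition-count check. It is a toy model in plain Python, not Spark code: `ToyRDD` and `same_partition_count` are hypothetical names, and the partition counts are made up for illustration. It only shows that a strict "same number of partitions as every parent" rule would reject a coalesce-style lineage along with union and pruning.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class ToyRDD:
    """Hypothetical stand-in for an RDD: a name, a partition count, and parents."""
    name: str
    num_partitions: int
    parents: List["ToyRDD"] = field(default_factory=list)


def same_partition_count(rdd: ToyRDD) -> bool:
    """Return True if every ancestor has the same partition count as `rdd`."""
    return all(
        p.num_partitions == rdd.num_partitions and same_partition_count(p)
        for p in rdd.parents
    )


a = ToyRDD("a", 4)
b = ToyRDD("b", 4)
mapped = ToyRDD("a.map(f)", 4, [a])          # one-to-one: count is unchanged
union = ToyRDD("a.union(b)", 8, [a, b])      # union concatenates partitions
pruned = ToyRDD("pruned(a)", 2, [a])         # pruning drops partitions
coalesced = ToyRDD("a.coalesce(2)", 2, [a])  # fewer partitions, all input still read

for rdd in (mapped, union, pruned, coalesced):
    print(rdd.name, same_partition_count(rdd))
```

    In this model the coalesced lineage fails the check just like the pruned one, even though it still consumes every input partition, which is why a plain count comparison might need an exception or an extra property.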



