Hi, I've heard a lot of complaints about Spark's "pull"-style shuffle. Is there any plan to support a "push"-style shuffle in the near future?
Currently, the shuffle phase must complete before the next stage starts. In Impala, by contrast, the shuffled data is said to be "streamed" to the next stage's handler, which saves a lot of time. Will Spark support this mechanism one day? Thanks.
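
To make the difference concrete, here is a toy, single-process sketch of the two models as I understand them. It is only an illustration, not Spark's or Impala's actual code or API: the function names (`map_records`, `pull_style_count`, `push_style_count`) are made up for this example, and a real shuffle moves data across the network between machines.

```python
from collections import defaultdict


def map_records(lines):
    """Map side (hypothetical): emit (word, 1) pairs one at a time."""
    for line in lines:
        for word in line.split():
            yield word, 1


def pull_style_count(lines):
    """Barrier model: all map output is materialized before reducing starts."""
    shuffled = defaultdict(list)
    for key, value in map_records(lines):
        shuffled[key].append(value)  # buffer the complete map output first
    # The "next stage" runs only after the loop above has finished.
    return {key: sum(values) for key, values in shuffled.items()}


def push_style_count(lines):
    """Streaming model: each record is consumed as soon as it is produced."""
    counts = defaultdict(int)
    for key, value in map_records(lines):
        counts[key] += value  # the consumer updates its state per record
    return dict(counts)


lines = ["a b a", "b c"]
assert pull_style_count(lines) == push_style_count(lines)
```

Both functions return the same counts. The only difference is that the first one buffers everything before the consumer starts, while the second overlaps producing and consuming, which is the time saving I am asking about.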