Bit of both.
The semantics of 1-1 or scatter-gather edges require sources to start before downstream to start. Semantics for broadcast allows downstreams to start as soon as upstreams can start. This is because scheduling prioritizes upstream work before downstream work. So ordering will be maintained and downstream can start asap and be able to pull the broadcasted data which is usually unordered. So a partial shard is good enough to start processing. While the same would be true for 1-1 and SG, this immediate start behavior is not preferred because 1-1 has locality constraints to try and run local to the sole parent and SG typically has ordered input and so reading shards from partial sources is not going to be good enough. This heuristic is optimized for quicker scheduling and low latency. However, what this does not take into account are transitive dependencies like the one in your graph where R8 may be blocked from starting due to its predecessor while R3 can start because it assumes R8 would have started. Thus we can have priority inversion in the scheduling which can temporarily stall your job is the cluster does not have enough resources to run all these tasks at the same time. At that point preemption will kick in a resolve the priority inversion. So the system will self-restore. We are actively working on improving this to minimize the preemption while still maintaining low latency. Bikas *From:* Grandl Robert [mailto:[email protected]] *Sent:* Friday, September 12, 2014 11:21 AM *To:* [email protected] *Subject:* ShuffleVertexManager looks only for input vertices with DataMovement being SCATTER_GATHER Hi guys, During some of my experiments, I realized that a vertex which is managed by a ShuffleVertexManager is looking for tasks who have finished just in parent vertices where data movement is SCATTER_GATHER. For example, in the attached DAG, Reducer 3 is able to start tasks looking just at Map_5 and Reducer_2, and such even if none of the tasks have finished on the branch with Reducer 8, Reducer 3 still starts. The main reason seems to be that a ShuffleVertexManager is looking for tasks finished just in bipartiteSources vertices, which seems to be only those which are SCATTER_GATHER. So a parent which is Broadcast, is completely ignored from this. Do I miss something, i.e. it is as designed or it is a bug ? Thanks, Robert -- CONFIDENTIALITY NOTICE NOTICE: This message is intended for the use of the individual or entity to which it is addressed and may contain information that is confidential, privileged and exempt from disclosure under applicable law. If the reader of this message is not the intended recipient, you are hereby notified that any printing, copying, dissemination, distribution, disclosure or forwarding of this communication is strictly prohibited. If you have received this communication in error, please contact the sender immediately and delete it from your system. Thank You.
