Hitesh,

With respect to the below comment: So a vertex will have a number of tasks, which is decided strictly based on the input data the vertex has to process? Also, is it guaranteed that every task will have the same input size (all except the last one, probably)?
Thanks,
Robert

On Monday, July 7, 2014 10:37 AM, Hitesh Shah <[email protected]> wrote:

Correct. The hierarchy is dag -> vertex -> task -> task attempt (each relationship being 1:N). A vertex defines a stage of common processing logic applied to a parallel data set. A task represents the processing of a subset of that data set.

thanks
-- Hitesh

On Jul 7, 2014, at 9:40 AM, Grandl Robert <[email protected]> wrote:

> Another dumb question: A vertex can have multiple tasks (not task attempts),
> for different input blocks, right? So a vertex entity is kind of a stage
> abstraction, not a task abstraction, right?
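
For readers following the thread, the dag -> vertex -> task -> task attempt hierarchy described above can be sketched as plain nested containers. This is only an illustration of the 1:N relationships, not the Tez API: the class names, the `input_split` field, the `run_attempt` and `build_example` helpers, and the three example blocks are all hypothetical, and the sketch makes no claim about how the number of tasks or their input sizes are actually chosen.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class TaskAttempt:
    """One execution try of a task (hypothetical model, not a Tez class)."""
    attempt_id: int
    succeeded: bool = False


@dataclass
class Task:
    """Processes one subset of the vertex's data set."""
    task_id: int
    input_split: str  # assumed label for the subset of data handled here
    attempts: List[TaskAttempt] = field(default_factory=list)

    def run_attempt(self, succeeded: bool) -> TaskAttempt:
        # Each call records one more attempt for this task (task -> attempt is 1:N).
        attempt = TaskAttempt(attempt_id=len(self.attempts), succeeded=succeeded)
        self.attempts.append(attempt)
        return attempt


@dataclass
class Vertex:
    """A stage of common processing logic; owns many tasks (1:N)."""
    name: str
    tasks: List[Task] = field(default_factory=list)


@dataclass
class Dag:
    """Top of the hierarchy; owns many vertices (1:N)."""
    name: str
    vertices: List[Vertex] = field(default_factory=list)


def build_example() -> Dag:
    # Three made-up input blocks, one task per block, purely for illustration.
    splits = ["block-0", "block-1", "block-2"]
    vertex = Vertex("map-stage", [Task(i, s) for i, s in enumerate(splits)])
    # Task 1 fails once and is retried: two attempts, still one task.
    vertex.tasks[1].run_attempt(succeeded=False)
    vertex.tasks[1].run_attempt(succeeded=True)
    return Dag("example-dag", [vertex])
```

In this model a retry adds an attempt to an existing task rather than a new task to the vertex, which matches the distinction drawn in the original question between tasks and task attempts.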
