Hey Gopal,

we use the task count for basically two things (in MR, for both the map stage and the reduce stage):

- Each task samples its output data up to a certain limit. That limit is the desired sample count divided by the number of tasks.
- In some scenarios we also use the task count to let the last task (of a stage or a vertex) run some extra logic. That works in combination with the task index.

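Roughly, in code. This is only a sketch to illustrate the two uses above: `TaskCountSketch` and its method names are made up for this mail and are not part of Tez or MapReduce, and both the integer division (remainder dropped) and the 0-based task index are assumptions.

```java
// Hypothetical helper for illustration only; not a Tez or MapReduce API.
final class TaskCountSketch {

    private TaskCountSketch() {
    }

    /**
     * Per-task sample limit: the desired sample count divided by the
     * number of tasks. Integer division is an assumption here, so any
     * remainder is simply dropped.
     */
    static int perTaskSampleLimit(int desiredSampleCount, int taskCount) {
        if (taskCount <= 0) {
            throw new IllegalArgumentException("taskCount must be positive");
        }
        return desiredSampleCount / taskCount;
    }

    /**
     * True if this task is the last one of its stage or vertex.
     * Assumes task indices run from 0 to taskCount - 1.
     */
    static boolean isLastTask(int taskIndex, int taskCount) {
        return taskIndex == taskCount - 1;
    }
}
```

Both methods need the task count, and the second one also needs the task index, which is why we need both values inside the task.
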
Looking at your patch, it looks like it will do the job for the map-like vertex but not for the aggregation vertex, right?

Also, which JIRA issue is that?

best
Johannes

On 24 Jul 2014, at 07:40, Gopal V <[email protected]> wrote:

> On 7/23/14, 6:07 PM, Johannes Zillmann wrote:
>> Hey Tez team,
>>
>> is there some way to get the task count within a vertex from within a task ?
>> Some equivalent to mapred.map.tasks and mapred.reduce.tasks for map-reduce ?
>
> Could you explain the use-case for this particular requirement?
>
> I intend to add the vertex parallelism to the task context as part of one of
> my WIP branches.
>
> I uploaded my base patch-set as is (including the TODO markers).
>
> https://issues.apache.org/jira/secure/attachment/12657536/TEZ-broadcast-shuffle%2Bvertex-parallelism.patch
>
> If you can explain what you are actually looking to do with this information,
> perhaps I can roll the two feature reqs together.
>
> Cheers,
> Gopal
>
