[ https://issues.apache.org/jira/browse/TEZ-145?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14118342#comment-14118342 ]
Bikas Saha commented on TEZ-145:
--------------------------------
Not sure what you mean by YARNRunner#createDAG with TaskLocationHint. It seems
like a VertexManager that does this at runtime (after being set up at compile
time) would be the way to go. It may clear things up if there is a doc
describing the solution so that we are all on the same page.

Gopal, it may be that the notion of combiners is composable. A combiner is a
commutative and associative function that can be applied in any order and any
number of times. The combiner could then be run just after the mapper
(in-proc), independently (in combiner tasks), or just before the reducer
(in-proc), trading off the pure combiner work against the overhead of running
it. Are we thinking of the same thing?
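
To make the composability point concrete, here is a minimal sketch in plain Java. It is not Tez API: the class CombinerSketch and the helpers mapOutput and combine are hypothetical names used only for illustration, and the combiner is assumed to be a per-key sum (which is commutative and associative). It shows that combining everything in one place and combining first in separate "rack-level" groups give the same result, regardless of grouping or order.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Illustration only: none of these names come from Tez or MapReduce.
final class CombinerSketch {

    // Hypothetical map task with an in-proc combine: counts each word it sees.
    static Map<String, Long> mapOutput(String... words) {
        Map<String, Long> out = new HashMap<>();
        for (String w : words) {
            out.merge(w, 1L, Long::sum);
        }
        return out;
    }

    // Hypothetical combiner: merges partial counts by summing per key.
    // Because addition is commutative and associative, this can be applied
    // to any grouping of partial results, in any order, any number of times.
    static Map<String, Long> combine(List<Map<String, Long>> partials) {
        Map<String, Long> out = new HashMap<>();
        for (Map<String, Long> partial : partials) {
            for (Map.Entry<String, Long> e : partial.entrySet()) {
                out.merge(e.getKey(), e.getValue(), Long::sum);
            }
        }
        return out;
    }

    public static void main(String[] args) {
        Map<String, Long> m1 = mapOutput("a", "b", "a");
        Map<String, Long> m2 = mapOutput("b", "c");
        Map<String, Long> m3 = mapOutput("a", "c", "c");

        // Combine once, just before the reducer.
        Map<String, Long> flat = combine(Arrays.asList(m1, m2, m3));

        // Combine in independent "rack-level" tasks first, then once more
        // before the reducer, with the inputs in a different order.
        Map<String, Long> rackA = combine(Arrays.asList(m1, m2));
        Map<String, Long> rackB = combine(Arrays.asList(m3));
        Map<String, Long> tree = combine(Arrays.asList(rackB, rackA));

        System.out.println(flat.equals(tree)); // prints true
    }
}
```

Where the extra combine step pays off would depend on how much it shrinks the data relative to the cost of scheduling and running the additional tasks; the sketch only shows that correctness does not depend on the placement.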
> Support a combiner processor that can run non-local to map/reduce nodes
> -----------------------------------------------------------------------
>
> Key: TEZ-145
> URL: https://issues.apache.org/jira/browse/TEZ-145
> Project: Apache Tez
> Issue Type: Bug
> Reporter: Hitesh Shah
> Assignee: Tsuyoshi OZAWA
> Labels: TEZ-1
>
> For aggregate operators that can benefit from running in multi-level trees,
> support for running a combiner in a non-local mode would allow performance
> gains from running a combiner at the rack level.
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)