> For instance I want to say that DataMovementEvent(s) from the tasks in the
> source vertex should be routed to the tasks in the destination vertex based
> on the fact whether the tasks are in the same rack or not (or for that matter
> use some other key to route events between the tasks in the two stages).

There was an attempt at scheduling a rack-local combiner task to speed up
dedup ops (by doing per-rack aggregates):
https://issues.apache.org/jira/browse/TEZ-145

I'm wondering if you're trying to do something similar.

> To do this I implemented my own EdgeManagerPluginOnDemand derivative but I
> see it has two APIs for routing the events:

I think you might not have overridden
routeCompositeDataMovementEventToDestination().

If you want to submit a patch that adds log lines to Edge.java in Tez, I think
that is one place which is under-logged for these cases.

Cheers,
Gopal
