> I have implemented/overridden routeCompositeDataMovementEventToDestination but
> it isn't getting called. I'm raising DataMovementEvents though (and not
> composite ones), so it might be expected?

I'm not sure whether you're raising composite DMEs or not, so I'm speaking purely from the point of view of Hive + a simple shuffle edge with auto-reducer parallelism.

> - Difference between the overloads of routeDataMovementEventDestination (is
> any of them deprecated?)

They're all called from different codepaths, so they are all in use; the newer additions probably just never went back and changed the existing calls. You should probably set a breakpoint in maybeAddTezEventForDestinationTask() and see where it goes.

There are people on this list who use these APIs more often than I do. Perhaps a bit of API housekeeping would help make things more consistent without having to break too many bits (since EdgeManagerPluginOnDemand is an abstract class and not an interface, it should be able to change its internals to chain the APIs together, calling them in a hierarchy to handle all override cases together).

> - Difference between EdgeManagerPlugin and EdgeManagerPluginOnDemand

https://github.com/apache/tez/blob/master/tez-dag/src/main/java/org/apache/tez/dag/app/dag/impl/Edge.java#L271

> - Are there any scale advantages of using CompositeDataMovementEvent vs
> DataMovementEvent? My naive understanding says that the former is more of a
> convenience thing and from a scale point of view there maybe no difference.

That really depends on whether you are using Composite events as-is - getEvents() returns an Iterable, so there is a definite scale advantage in sending composite events over the wire instead of sending 1000 copies of the same payload.

See the code in

https://github.com/apache/tez/blob/master/tez-dag/src/main/java/org/apache/tez/dag/app/dag/impl/Edge.java#L433
+
https://github.com/apache/tez/blob/master/tez-api/src/main/java/org/apache/tez/runtime/api/events/CompositeDataMovementEvent.java#L111

Cheers,
Gopal
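
To make the composite-event point above concrete, here is a minimal, self-contained sketch. It does not use the Tez classes: CompositeEventSketch, SingleEvent, CompositeEvent and both byte-count helpers are hypothetical stand-ins, and the size figures are a rough illustration that assumes a 4-byte index per event, not Tez's actual wire format. The idea it shows is one payload plus a (start, count) range, expanded lazily through an Iterable on the receiving side.

```java
import java.nio.ByteBuffer;
import java.util.Iterator;
import java.util.NoSuchElementException;

public class CompositeEventSketch {

    /** Hypothetical stand-in for a single data movement event. */
    static final class SingleEvent {
        final int sourceIndex;
        final ByteBuffer payload;

        SingleEvent(int sourceIndex, ByteBuffer payload) {
            this.sourceIndex = sourceIndex;
            this.payload = payload;
        }
    }

    /** Hypothetical stand-in for a composite event: one payload, many source indices. */
    static final class CompositeEvent {
        final int sourceIndexStart;
        final int count;
        final ByteBuffer payload;

        CompositeEvent(int sourceIndexStart, int count, ByteBuffer payload) {
            this.sourceIndexStart = sourceIndexStart;
            this.count = count;
            this.payload = payload;
        }

        /** Expands lazily; each SingleEvent gets a read-only view of the shared payload, not a copy. */
        Iterable<SingleEvent> getEvents() {
            return () -> new Iterator<SingleEvent>() {
                private int cursor = sourceIndexStart;

                @Override
                public boolean hasNext() {
                    return cursor < sourceIndexStart + count;
                }

                @Override
                public SingleEvent next() {
                    if (!hasNext()) {
                        throw new NoSuchElementException();
                    }
                    return new SingleEvent(cursor++, payload.asReadOnlyBuffer());
                }
            };
        }
    }

    /** Assumed cost of sending `count` separate events: one int index plus the payload, each time. */
    static long bytesAsIndividualEvents(int count, int payloadSize) {
        return (long) count * (Integer.BYTES + payloadSize);
    }

    /** Assumed cost of sending one composite event: start index, count, and the payload once. */
    static long bytesAsComposite(int payloadSize) {
        return 2L * Integer.BYTES + payloadSize;
    }

    public static void main(String[] args) {
        CompositeEvent composite = new CompositeEvent(0, 1000, ByteBuffer.wrap(new byte[64]));
        int expanded = 0;
        for (SingleEvent e : composite.getEvents()) {
            expanded++;
        }
        System.out.println("expanded events: " + expanded);
        System.out.println("individual bytes: " + bytesAsIndividualEvents(1000, 64));
        System.out.println("composite bytes: " + bytesAsComposite(64));
    }
}
```

Under these assumptions the saving only holds if the composite travels as one event and is expanded at the destination; expanding it before sending would put you back at one payload copy per event.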
