> I have implemented/overridden routeCompositeDataMovementEventToDestination but
> it isn't getting called. I'm raising DataMovementEvents though (and not 
> composite ones), so it might be expected?

I'm not sure if you're raising composite DMEs or not, so I'm speaking purely from
the point of view of Hive + a simple shuffle edge with auto-reducer parallelism.

> - Difference between the overloads of routeDataMovementEventDestination (is 
> any of them deprecated?)

They're all called from different codepaths, so they are all in use; the newer
overloads were probably added without going back to update the existing call
sites.

You should probably breakpoint in maybeAddTezEventForDestinationTask() and see 
where it goes.

There are people on this list who use these APIs more often than I do - perhaps
a bit of API housekeeping might help make them more consistent, without having
to break too many bits (since EdgeManagerPluginOnDemand is an abstract class and
not an interface, it should be able to change its internals to chain the APIs
together, calling them in hierarchy to handle all override cases together).
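
To illustrate what I mean by chaining, here is a minimal sketch with made-up
names (Router, routeSingle, routeComposite and ModuloRouter are all
hypothetical, not the Tez API). The abstract base class gives the range-based
method a default body that calls the per-event method, so a plugin that
overrides only one of them still behaves consistently:

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical names throughout; this is NOT the Tez EdgeManagerPluginOnDemand API.
final class ChainingSketch {

  abstract static class Router {
    /** The one hook every subclass must implement: returns the input index
     *  at the destination task, or -1 if the event is not meant for it. */
    abstract int routeSingle(int sourceOutputIndex, int destTaskIndex);

    /** Default implementation chains to routeSingle() for each index in the
     *  range; a subclass with a cheaper range-based answer can override it. */
    List<Integer> routeComposite(int startIndex, int count, int destTaskIndex) {
      List<Integer> targets = new ArrayList<>();
      for (int i = 0; i < count; i++) {
        int target = routeSingle(startIndex + i, destTaskIndex);
        if (target >= 0) {
          targets.add(target);
        }
      }
      return targets;
    }
  }

  /** Toy plugin: output i goes to destination task (i % numDestTasks). */
  static final class ModuloRouter extends Router {
    private final int numDestTasks;

    ModuloRouter(int numDestTasks) {
      this.numDestTasks = numDestTasks;
    }

    @Override
    int routeSingle(int sourceOutputIndex, int destTaskIndex) {
      return (sourceOutputIndex % numDestTasks == destTaskIndex)
          ? sourceOutputIndex / numDestTasks
          : -1;
    }
  }

  public static void main(String[] args) {
    Router router = new ModuloRouter(2);
    // Only routeSingle() was overridden, yet the range-based call still works.
    System.out.println(router.routeComposite(0, 4, 1));
  }
}
```

This only works because the base type is a class that can carry default
behaviour; the real method signatures and return types in Tez differ.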

> - Difference between EdgeManagerPlugin and EdgeManagerPluginOnDemand

https://github.com/apache/tez/blob/master/tez-dag/src/main/java/org/apache/tez/dag/app/dag/impl/Edge.java#L271

> - Are there any scale advantages of using CompositeDataMovementEvent vs 
> DataMovementEvent? My naive understanding says that the former is more of a 
> convenience thing and from a scale point of view there may be no difference.

That really depends on whether you are using composite events as-is - 
getEvents() returns an Iterable, so there is a definite scale advantage in 
sending one composite event over the wire instead of sending 1000 copies of the 
same payload.

See the code in

https://github.com/apache/tez/blob/master/tez-dag/src/main/java/org/apache/tez/dag/app/dag/impl/Edge.java#L433
+
https://github.com/apache/tez/blob/master/tez-api/src/main/java/org/apache/tez/runtime/api/events/CompositeDataMovementEvent.java#L111
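
As a rough picture of why that matters, here is a small self-contained sketch.
The classes below (Composite, Single) and the byte counts are hypothetical
stand-ins, not the Tez classes or the real serialized sizes; the assumption is
just that a composite event carries one payload plus an index range and is
expanded through an Iterable on the receiving side:

```java
import java.nio.charset.StandardCharsets;
import java.util.Iterator;
import java.util.NoSuchElementException;

// Hypothetical stand-ins, NOT the Tez event classes.
final class CompositeSketch {

  /** One expanded per-index event; it only references the shared payload. */
  static final class Single {
    final int sourceIndex;
    final byte[] payload;

    Single(int sourceIndex, byte[] payload) {
      this.sourceIndex = sourceIndex;
      this.payload = payload;
    }
  }

  /** Composite form: (startIndex, count, payload) is all that is sent. */
  static final class Composite implements Iterable<Single> {
    final int startIndex;
    final int count;
    final byte[] payload;

    Composite(int startIndex, int count, byte[] payload) {
      this.startIndex = startIndex;
      this.count = count;
      this.payload = payload;
    }

    /** Expands lazily: each Single is created only when the caller asks. */
    @Override
    public Iterator<Single> iterator() {
      return new Iterator<Single>() {
        private int next = 0;

        @Override
        public boolean hasNext() {
          return next < count;
        }

        @Override
        public Single next() {
          if (!hasNext()) {
            throw new NoSuchElementException();
          }
          return new Single(startIndex + next++, payload);
        }
      };
    }

    /** Rough size estimate: two ints plus a single copy of the payload. */
    long wireBytes() {
      return 2L * Integer.BYTES + payload.length;
    }
  }

  /** Rough size estimate if every index got its own event and payload copy. */
  static long expandedWireBytes(int count, int payloadLength) {
    return (long) count * (Integer.BYTES + payloadLength);
  }

  public static void main(String[] args) {
    byte[] payload = "host:port/path".getBytes(StandardCharsets.UTF_8);
    Composite composite = new Composite(0, 1000, payload);
    int expanded = 0;
    for (Single ignored : composite) {
      expanded++;
    }
    System.out.println("expanded events:  " + expanded);
    System.out.println("composite bytes:  " + composite.wireBytes());
    System.out.println("individual bytes: " + expandedWireBytes(1000, payload.length));
  }
}
```

If you expand the composite into individual events before sending, you lose
that advantage, which is the "as-is" caveat above.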

Cheers,
Gopal