[
https://issues.apache.org/jira/browse/TEZ-4170?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17118428#comment-17118428
]
Attila Magyar commented on TEZ-4170:
------------------------------------
[~rajesh.balamohan], thanks for the review.
*_1. handleInitializerEvents should be invoked after the init has been done._*
There are 2 steps, first is creating the initialize wrappers, the 2nd is to
call the initialize() on them.
When do we need to call handleInitializerEvents()? After the 2nd step? Earlier
it was called right after the 1st step (we didn't wait for the 2nd (async) step
to be fully executed).
*_2. runInitializerAndProcessResult does not make use of Future. Possible to
loose the results._*
It uses the result directly internally: eventHandler.handle(new
VertexEventRootInputInitialized(vertex.getVertexId(),
initializer.getInput().getName(), result));
Getting a Future + registering a callback could achieve the same, I don't see a
difference in a case like this.
*_3. Any reason for changing VertexImpl? (as in, removing the exception
handling there and bringing to RootInputInitManager)_*
runInputInitializers() is fully asynchronous now, it won't effect the caller's
thread if it fails, therefore the failure is handled internally on the
executor's thread. In other words the TezException will never propagate to the
caller.
*_4. On success or failure, we need to unregister from vertex/task updates.
setComplete() needs to be invoked accordingly._*
thanks, fixed.
> RootInputInitializerManager could make use of ThreadPool from appContext
> ------------------------------------------------------------------------
>
> Key: TEZ-4170
> URL: https://issues.apache.org/jira/browse/TEZ-4170
> Project: Apache Tez
> Issue Type: Improvement
> Reporter: Rajesh Balamohan
> Assignee: Attila Magyar
> Priority: Major
> Attachments: Screenshot 2020-05-06 at 6.26.34 AM.png,
> TEZ-4170.1.patch, TEZ-4170.2.patch, TEZ-4170.3.patch, TEZ-4170.4.patch,
> TEZ-4170.5.patch, TEZ-4170.6.patch, TEZ-4170.7.patch
>
>
> [https://github.com/apache/tez/blob/master/tez-dag/src/main/java/org/apache/tez/dag/app/dag/RootInputInitializerManager.java#L106]
>
> This could make use of executor from {{appContext}} instead of spinning one
> for every root input.
>
> Also, initialization part of InputInitializer could be moved inside this
> thread. For e.g, in certain cases like HiveSplitGenerator, it ends up with
> some heavy operations which can be offloaded from blocking central dispatcher
> thread (e.g unpacking payloads, running kryo deserialization)
>
> !Screenshot 2020-05-06 at 6.26.34 AM.png|width=972,height=740!
>
>
--
This message was sent by Atlassian Jira
(v8.3.4#803005)