[
https://issues.apache.org/jira/browse/TEZ-1547?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14189495#comment-14189495
]
Siddharth Seth commented on TEZ-1547:
-------------------------------------
Haven't looked at the unit tests yet; Comments on the patch.
- vertexReconfigurationPlanned - In the JavaDoc, mentioning that scheduleTasks
/ addInputInitilizerEvents does not qualify as re-configuration would be useful.
- public void doneReconfiguringVertex() - The same could be achieved by
assuming reconfiguration to be complete when setVertexParallelism is invoked -
since that can be done only once. Either works though, the method making it
implicit and users need to think about when to call it (which will typically be
right after setVertexParallelism).
- public void vertexManagerDone(); - Do we need to add this right now. It
isn't doing much at the moment, other than unregistering for notifications.
This, along with doneReconfiguringVertex can get confusing. Instead, we could
just expose unregisterForNotifications right now, and add this when it's
needed. e.g. multiple reconfigurations etc. doneReconfiguringVertex (or
setParallelism with intentToReconfigure) provides the necessary functionality
for VertexImpl to move forward. This one would indicate that it's possible to
reconfigure a vertex multiple times - which isn't possible today.
Just looking at the Hive VertexManager - knowing when to call this can be
complicated - all initializers completed.
- onVertexStateUpdated should throw Exception. This needs to be handled similar
to TEZ-1267 / TEZ-1703.
- COMPLETELY_CONFIGURED - This is only sent out after startVertex has been
invoked in VertexImpl (startTime set or within the startVertex() call). Calling
it STARTED/SCHEDULED would be better - since that is what it indicates. A
COMPLETELY_CONFIGURED event could otherwise go out the moment setParallelism is
called / the VMP indicates that it's done with configuration.
Haven't looked yet, but this may be the same as the RUNNING state update.
- ImmediateStartVertexManager. SourceInfo no longer required. One question I
had when reading the unpatched code was why SCATTER_GATHER edges were not
considered. The patch removes that limitation. Any insights on why ?
- VertexImpl "if (fromVertexManager && canInitVertex()) {" - this check should
ideally be after the if (parallelismSet) check
- Typo: defune
- Between donReconfiguringVertex and startVertex() - I think it's possible for
the COMPLETELY_CONFIGURED event to go out twice. From within startVertex, the
VM gets called with onVertexStarted() - which ends up calling
doneReconfiguringVertex, which sets this.vertexToBeReconfiguredByManager and
sends out the events. startVertex() will send it again.
- if (fromVertexManager && canInitVertex()) { | This could be considered as an
incompatible change. If a vertex is setup with the MRInputSplitDistributor and
the correct number of tasks up front (PIG used to do this; I'm not sure if they
still do). Don't think there's a good way to workaround this though, if this
functionality is being added.
- Nit: s/unregisterForVertexStatusUpdates/unregisterForVertexStateUpdates
- Nit: Bunch of changes like (isComplete.get() == true), instead of just
(isComplete.get())
- Does everything on the VertexManager need to be synchronized ? Are we
guaranteeing thread safety in case the plugin decides to start threads ?
- Didn't look in detail, but there's no changes on InputReadyVertexManager.
That is intentional ? Nothing required, since it designed to schedules based on
src completion.
> Make use of state change notifier in VertexManagerPlugins
> ---------------------------------------------------------
>
> Key: TEZ-1547
> URL: https://issues.apache.org/jira/browse/TEZ-1547
> Project: Apache Tez
> Issue Type: Improvement
> Reporter: Siddharth Seth
> Assignee: Bikas Saha
> Attachments: TEZ-1547.1.patch, TEZ-1547.3.patch, TEZ-1547.4.patch,
> TEZ-1547.5.patch, TEZ-1547.6.patch, TEZ-1547.7.patch
>
>
> Instead of the various APIs like onVertexStarted, simple notifications could
> be sent.
> Some existing APIs could end up being deprecated.
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)