[
https://issues.apache.org/jira/browse/TEZ-1539?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14126582#comment-14126582
]
Siddharth Seth commented on TEZ-1539:
-------------------------------------
Please see my first comment about introducing different events for the 'send
anytime' semantics. InputInitializerEvents, if generated, can be setup to
always follow ONCE_ON_SUCCESS semantics. A new event type - a sub class if
required, can be used for 'send anytime'. Technically, it is possible for the
same II/VMP to handle the two modes - in which case a registration API does not
work.
For now, the only use cases we have is ONCE_ON_SUCCESS. Lets keep things
simple, and support this. When the need arises, or in a later release - we can
support the SEND_ANYTIME option. The events, in this case, will also change
with version etc information.
Clarification please
By VM, do you mean the user code (VMP or the VertexManager itself(VM)).
Similarly by II, do you mean the user code (II) or the manager (IIM).
Terminology I'm using:
VM - VertexManager
VMP - User vertex manager plugin
IIM - Input Initializer Manager
II - User Input Initializer
bq. 1) Events from tasks can come at any time and there may be use cases for
sending and receiving these events without delay. However, for the above
simplification, asap delivery cannot be used since the task may fail after
delivering the first event (in case its supposed to deliver 3). That is why, to
support this simple mode, the events must be sent to the VM/II upon first
successful completion of the task, so that we can guarantee that all events are
delivered only a for a successful task.
I'm not sure if that's a statement, suggestion, something broken in the patch ?
The patch does exactly this - buffering events and sending them when the source
task completes.
bq. 2) In addition, there may be use cases where VMs would want to receive
events from the all attempts of a task. I dont think we should restrict that in
the framework internals.
Not touching VertexManagerEvents here.
bq. 3) For the same DAG, some VMs/IIs may want the simpler behavior and some
may not. So this behavior needs to be local to those VMs/IIs that want it.
Assuming you meant VMP/II. Please see comment above about separating the events
to deal with this.
bq. 1) Not make any changes in the core state machines for this simplification.
Events continue to come in and get routed to the desired VMs and IIs. If they
need to be saved for recovery and are not being saved right now, then thats
something that needs to be handled orthogonally under the recovery umbrella
jira. This allows all the above cases to continue to work.
VMs / IIMs may not be setup when events are received from a task. Delegating
all the work over to them was an option, but there are cases where events may
come in before the targeted vertex has INITIALIZED (is still in state NEW -
question about this later). (We could of-course change VertexImpl to setup the
VM, IIM while in state NEW - but that's a separate jira.) In such cases, the
events need to be cached within the bounds of the state machine. This can be
done in a VMM or IIMM (instead of within VertexImpl), but that would still run
within the state machine - or would need to get notifications when states
change.
bq. 2) IIContext and VMContext should expose a method that allows VMPlugins and
IIPlugins to enable sendEventsOnceOnSuccess mode. This method should ideally be
called during initialization of the VMPlugin/IIPlugin.
See comment above about a single VMP, II working with two types of events.
bq. 3) IIManager and VMManager (the framework wrappers around the plugin code)
implement this behavior. These components are Tez framework code and should be
able to understand and handle attempts/retries etc. When events come to them to
be handed over to the plugins then, today, they simply route them to the
plugin. If the sendEventsOnceOnSuccess mode is enabled then they can buffer
events until the task attempt has completed with success and only then deliver
those events. If the task attempt fails then they can drop the buffered events.
After delivering the events they can drop any further events routed to them.
See comment above about needing to be aware of the state.
On remaining in state NEW:
In a long chain - consider V1-V10 to be a chain, where V10 defines an
initializer. V10 does not move into INITED state until all upstream vertices
have inited.
However it's possible for V1 to complete, before this entire chain has moved
forward. (Start with V1_INIT, V1_STARTED in the dispatcher queue - and move
from there).
Is there a reason we don't send INIT events from DAGImpl in the order of the
depth of a Vertex, and instead rely on source vertices sending INIT events the
moment they receive adequate INIT events from their sources. [~hitesh] ?
There's some comments surrounding this code related to recovery, I'm not sure
that has anything to do with this.
> Allow a FIRE_ONCE_ON_SUCCESS model for events generated by user code
> --------------------------------------------------------------------
>
> Key: TEZ-1539
> URL: https://issues.apache.org/jira/browse/TEZ-1539
> Project: Apache Tez
> Issue Type: Improvement
> Reporter: Siddharth Seth
> Assignee: Siddharth Seth
> Attachments: TEZ-1539.1.wip.txt
>
>
> Specifically for InputInitalizerEvents and VertexManagerEvents.
> Pasting comment from TEZ-1447
> In a majority of cases, events generated by different attempts of the same
> task will be identical - in which case just making use of the event generated
> by the first successful attempt is adequate. Doing something like this manes
> that users don't worry about retries, indices etc - and can just rely on
> receiving a set of events which are to be processed once the vertex succeeds.
> If different attempts of the same workload generate different events -
> processing is likely to be incorrect, since it's very possible for all data
> to be processed (VERTEX successful), then a failure and retry - which
> generates a different event. The initializer doesn't even run at this point,
> since it's already done it's work and is complete. Handling such scenarios,
> likely involves re-running the entire initializer and re-starting the vertex
> which processed the event from scratch. In situations like this, where data
> generated may be different, the best bet is for speculation to be disabled
> (when it's supported), and max-attempts to be set to 1.
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)