[ https://issues.apache.org/jira/browse/TEZ-1170?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14030381#comment-14030381 ]
Bikas Saha commented on TEZ-1170: --------------------------------- bq. there's places where the transitions out of INITIALIZING and INITED are inconsistent. Every place is consistent to the extent that they all call VertexInitializedTransition.doTransition(). But as you observe the timing is different, new event vs direct transition. Direct transition is not possible in the cases where setParallelism() results in initializing the parallelism since that method may be called from within the Vertex Managers. So the setParallelism case uses the READY_TO_INIT event approach to execute VertexInitializedTransition. All cases where initialization is waiting for setParallelism now consistently use the setParallelism method to complete the initialization. This is VertexManagers setting parallelism - custom vertex manager case, AM split generation input init case and source 1-1 split handling. There are 2 other cases where we need to execute the VertexInitializedTransition - when init is triggered because a null edge is initialized and the AM split distributor case. In these cases setParallelism is not called. I could make them consistent by having them send the READY_TO_INIT event but that would put it on the event queue. I hope that clarifies the reason the mixed event and direct transitions. If you have any ideas on making setParallelism execute the transition directly then we can try to unify them. Or we could have the direct transitions change to the event route. bq. V_PARALLELISM_INITIALIZED was a better name I renamed it because I thought I would use the event approach consistently in which case it would be sent for null edge init also and so the parallelism name would be misleading. Also before the event is sent not only parallelism but also other preconditions are checked like all edges defined etc. Thats why associating the event with only parallelism would be misleading. How about changing the name to V_INITIALIZATION_COMPLETE? bq..addTransition(VertexState.NEW, bq. This is a little confusing - since NullEdgeInitializedTransition I should have removed that transition. I put it there since I had a hunch that we might have a race in our init where the edge may be initialized before the vertex receives an init event. But it does not look like that should occur as long as all vertex related events happen in the same dispatcher (which may be a vertex only dispatcher). A predecessor vertex's init event must come before that vertex's post init events. I will remove that transition. bq..addTransition(VertexState.TERMINATING, bq, shut down the threads which are initializing, a shutdown is already being invoked when in th Yes. This is to ensure that shutdown is called on the initializers if the vertex is say killed by the client after parallelism is set but some initializers are not yet done. This might be a dead transition. If you agree then we can remove it. bq. ONE_TO_ONE_SOURCE_SPLIT expected in RUNNING state Yes. e.g. lets say all the parallelism are set (ie no need for initializing transition) and all vertices enter running stage. They dont actually start tasks because that is determined by the vertex manager (eg. slow start) Lets say the previous vertex has its parallelism changed due to auto-reduce. Now this vertex needs to change its parallelism to match that. If it has not yet started its tasks then its fine to do so. If it has already started its tasks then changing parallelism will error out. bq. assuming this was left in place so that tasks don't start, while they may not have events ready for the Inputs, or may not have generated data Case 1 - When input init is determining parallelism then the vertex moves out of initialization state once parallelism is set. This assumes that parallelism will be set only after all initializers are done. If that assumption is incorrect we will have to allow RootInputInitializedTransition() in the INITED and RUNNING states too. I did not add those transitions because thats not the way it works today. Case 2 - When input init is not determining parallelism, only then the RootInputInitializationTransition itself waits for all initializers to complete before moving out of the Initializing state. This is because we dont know what the initializers are doing eg. if they are updating the payload/specs then is it done or not etc. So to be safe we wait for all intializers to finish when the initializers are not determining parallelism. > Simplify Vertex Initializing transition > --------------------------------------- > > Key: TEZ-1170 > URL: https://issues.apache.org/jira/browse/TEZ-1170 > Project: Apache Tez > Issue Type: Improvement > Reporter: Siddharth Seth > Assignee: Bikas Saha > Fix For: 0.5.0 > > Attachments: TEZ-1170.1.patch, TEZ-1170.2.patch > > > After TEZ-1145 and 1151, a vertex should only need to stay in INITIALZING > state when it has an uninitialized edge, or when the parallelism is at -1 > (not set yet). Waiting for all RootInputInitializers to complete should not > be required - as long as one of them sets parallelism (via a VertexManager). -- This message was sent by Atlassian JIRA (v6.2#6252)