[ 
https://issues.apache.org/jira/browse/TEZ-1547?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14191181#comment-14191181
 ] 

Bikas Saha commented on TEZ-1547:
---------------------------------

Fixed the race condition. Added test cases for that.

bq. vertexReconfigurationPlanned - In the JavaDoc, mentioning that 
scheduleTasks / addInputInitilizerEvents does not qualify as re-configuration 
Done.

bq. public void doneReconfiguringVertex() - The same could be achieved by 
assuming reconfiguration
Its actually already done that way to make the common case easier. An 
incompletely configured vertex will implicitly send the notification when its 
completely configured. Calling these API's is optional for the common case. The 
new API's are needed only when a fully configured vertex is re-configured. E.g. 
Shuffle vertex parallelism is set and auto-parallel can choose to not change 
the parallelism and setParallelism() will not be invoked. However, 
doneReconfiguringVertex() must be invoked to let the vertex know that there is 
not going to be any further reconfiguring and it sends the notification at that 
time. This is also going to enable us to quickly add multiple invocations of 
setParallelism() without encountering the current issues because the 
doneReconfiguringVertex() can be used to notify the vertex that reconfiguration 
is finally complete.

bq. public void vertexManagerDone(); - Do we need to add this right now.
This has been added for multiple reasons, not just for event unregistering. 
Recovery needs to know when a vertex manager is done doing all things like 
sending events etc. However, given that without the locking fix this is not 
going to work. And given that its an optional API and things work without it, I 
am going to back it out for this patch.

bq. onVertexStateUpdated should throw Exception
Done. Those patches came in while this was in progress. So now its my problem :)

bq. COMPLETELY_CONFIGURED - This is only sent out after startVertex has been 
invoked in VertexImpl
Done this way because there are many path out of INITING. But all of those 
immediately go trigger startVertex(). So it was easier to put it there without 
any drawbacks. This would result in this and RUNNING coming one after another 
except for cases like auto-reduce/multiple changes in parallelism etc.

bq. ImmediateStartVertexManager. SourceInfo no longer required
Removed. I think it was that way because only in those cases the 
ImmediateStartManager is effective. If there is an SG edge then the shuffle 
vertex manager gets used. Doing it this way removes need for special casing in 
both vertex managers and also re-enabled the min/max=0 case for perf in the 
shuffle vertex manager.

bq. VertexImpl "if (fromVertexManager && canInitVertex()) {" - this check 
should ideally be af
Done. This is my last comment where I mentioned it needs to go inside the try 
block.

bq. Typo: defune
Will do.

bq. MRInputSplitDistributor and the correct number of tasks up front
>From what I see only MRInputAMSplitGenerator results in setParallelism to be 
>called. In which case numTasks will be -1. The distributor does not result in 
>setParallelism to be called. If needed, we could change the 
>RootInputVertexManager to use the new APIs and make this check pass. But dont 
>think thats needed now.

bq.  s/unregisterForVertexStatusUpdates/unregisterForVertexStateUpdates
Will do

bq. Does everything on the VertexManager need to be synchronized ?
IMO its fine to go with heavy handed synchronization for now since this is not 
a high throughput API and it allows VMs to safely call methods. We should have 
done this much earlier.

bq.  on InputReadyVertexManager. That is intentional ? Nothing required,
Right. Nothing required. This is the fully conservative scheduler.

Will post a revised patch soon.


> Make use of state change notifier in VertexManagerPlugins
> ---------------------------------------------------------
>
>                 Key: TEZ-1547
>                 URL: https://issues.apache.org/jira/browse/TEZ-1547
>             Project: Apache Tez
>          Issue Type: Improvement
>            Reporter: Siddharth Seth
>            Assignee: Bikas Saha
>         Attachments: TEZ-1547.1.patch, TEZ-1547.3.patch, TEZ-1547.4.patch, 
> TEZ-1547.5.patch, TEZ-1547.6.patch, TEZ-1547.7.patch
>
>
> Instead of the various APIs like onVertexStarted, simple notifications could 
> be sent.
> Some existing APIs could end up being deprecated.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

Reply via email to