[jira] [Commented] (TEZ-1914) VertexManager logic should not run on the central dispatcher

Siddharth Seth (JIRA) Mon, 09 Feb 2015 13:23:06 -0800

    [ 
https://issues.apache.org/jira/browse/TEZ-1914?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14312904#comment-14312904
 ]


Siddharth Seth commented on TEZ-1914:
-------------------------------------

bq. That cannot be because the vertex manager plugin has until the initialize() 
invocation to notify the framework about intent to reconfigure the vertex. If 
this happens async then the vertex will move ahead without the notification.
This can be handled via a VERTEX_MANAGER_INITIALIZING (or equivalent) state in 
the Vertex state machine right ? Moving all functionality to a separate thread 
except for initialize doesn't seem correct.

Completely missed this in the last review. Invocations into the 
VertexManagerPlugin can happen on a random thread in the threadpool. That 
essentially means that VMPlugins need to be aware of this, and likely 
synchronize to make sure things work correctly - this is primarily from a 
visibility POV. Most of the VMs within the Tez codebase itself have the 
potential to break right now. I'm guessing external ones in Pig/Hive/other 
haven't been written keeping multiple threads in mind. We could either 
synchronize the invocations into the VMPlugins or use thread affinity to ensure 
the same VMPlugin always uses the same thread.
Before the patch, everything would happen on the same thread - so there wasn't 
really a problem.

bq. For now yes, though I would like to explore the option of using these more 
generally where a dispatcher could run callable events on a multi-threaded 
executor.
I still don't understand how these will be used eventually, but sure - I guess 
you have something in mind for these in the future. Depending on how this is 
used - CallableEvent will likely need to be parameterized in the future, 
instead of handling Void as the only return type.

Accidental changes to TaskImpl ?

Nit: It'd be better to move pluginContext.isComplete into 
enqueueAndScheduleNextEvent like pluginFailed.get - instead of checking it in 
each method.

Nit: A single instance of VertexManagerCallback can be used for all 
invocations, instead of creating a new one for each event. There's no per-call 
state information in there (not right now anyway)

{code}
-      dispatcher.stop();
+      //dispatcher.stop();
{code}
Accidental change in the test ?

Rest looks good to me.



> VertexManager logic should not run on the central dispatcher
> ------------------------------------------------------------
>
>                 Key: TEZ-1914
>                 URL: https://issues.apache.org/jira/browse/TEZ-1914
>             Project: Apache Tez
>          Issue Type: Task
>            Reporter: Bikas Saha
>            Assignee: Bikas Saha
>         Attachments: TEZ-1914.1.patch, TEZ-1914.2.patch
>
>




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

[jira] [Commented] (TEZ-1914) VertexManager logic should not run on the central dispatcher

Reply via email to