[ 
https://issues.apache.org/jira/browse/TEZ-4000?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yingda Chen reassigned TEZ-4000:
--------------------------------

    Assignee: Yingda Chen

> Enable downstream vertex connecting to an EPHEMERAL data source, to reason 
> about network connections of upstream tasks.
> -----------------------------------------------------------------------------------------------------------------------
>
>                 Key: TEZ-4000
>                 URL: https://issues.apache.org/jira/browse/TEZ-4000
>             Project: Apache Tez
>          Issue Type: Task
>            Reporter: Yingda Chen
>            Assignee: Yingda Chen
>            Priority: Major
>
> This is an umbrella task for TEZ-3997
> Another property that is usually shared with CONCURRENT on the same edge is 
> EPHEMERAL data source. When two vertices are running concurrently, direct 
> communications between tasks in those vertices become possible, and 
> oftentimes necessary, throughout the lifetime of the running task. This can 
> be articulated by an EPHEMERAL data sources, and this change aims to support 
> such scenarios, which are readily found in real-time applications (such as 
> interactive query) and/or customized applications that would like to control 
> their own data communications (such as parameter-server).
>  
> This change will allow Tez to be the central orchestrator that gathers 
> necessary network information from all upstream tasks, compiles them together 
> and send it to downstream tasks. Particularly, the following changes are 
> planned:
>  # For two vertices connected via an edge with both CONCURRENT scheduling 
> type and EPHEMERAL data source type, the task in upstream vertex will open 
> network port, and send an VertexManagerEvent(VME) immediately upon running. 
> The payload of VME includes necessary information to communicate to this task 
> through direct network communication (such as ip and port). The vertex 
> manager of the downstream vertex, typed VertexManagerWithConcurrentInputs, 
> will receive these VMEs, and are responsible for aggregate (including de-dup 
> if necessary) all information together in onVertexManagerEventReceived(). 
>  # Once all VMEs have been received, a CustomProcessorEvent will be 
> constructed with a payload that includes the aggregated information, and be 
> routed to downstream tasks.
> The change will introduce additional optional entries in 
> VertexManagerEventPayload and a new custom payload that will be embedded into 
> CustomProcessorEvent. 
>  
> Upon completion of functional feature in this change, additional feature such 
> as handling of failover in CONCURRENT/EPHEMERAL edge will be addressed in 
> future umbrea JIRAs.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

Reply via email to