[ https://issues.apache.org/jira/browse/TEZ-2411?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Bikas Saha updated TEZ-2411:
----------------------------
    Description: Today the AM creates a new DataMovement event from the 
original event sent by the producer task and supplements the new event with 
the source/target indices for the consumer task. Creating this new event can be 
offloaded to the task runtime, which saves the AM the CPU cycles spent on 
object creation. Secondly, the original event can be kept in serialized form 
inside the AM and sent as is to the task over RPC, potentially saving 
serialization/deserialization (serde) CPU for these events in addition to the 
object creation CPU. This helps when a job runs many tasks concurrently, for 
example 10000 tasks running in parallel and sending events to the AM.  (was: 
Today the AM creates a new 
DataMovement event from the original event sent by the producer task and 
supplements the new event with source/target indices for the consumer task. 
This new event creation can be offloaded to the task runtime and thus save CPU 
cycles on the AM for the object creation. Secondly, the original event can be 
kept in serialized form inside the AM and sent as is to the task over the RPC, 
thus potentially saving serde CPU for these events in addition to the object 
creation CPU.)

> Offload DataMovement event creation from the AM to the tasks
> ------------------------------------------------------------
>
>                 Key: TEZ-2411
>                 URL: https://issues.apache.org/jira/browse/TEZ-2411
>             Project: Apache Tez
>          Issue Type: Task
>            Reporter: Bikas Saha
>            Assignee: Bikas Saha
>
> Today the AM creates a new DataMovement event from the original event sent by 
> the producer task and supplements the new event with the source/target 
> indices for the consumer task. Creating this new event can be offloaded to 
> the task runtime, which saves the AM the CPU cycles spent on object creation. 
> Secondly, the original event can be kept in serialized form inside the AM and 
> sent as is to the task over RPC, potentially saving 
> serialization/deserialization (serde) CPU for these events in addition to the 
> object creation CPU. This helps when a job runs many tasks concurrently, for 
> example 10000 tasks running in parallel and sending events to the AM.
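
The idea described above can be illustrated with a minimal sketch. This is not Tez code: the class and method names (DataMovementOffloadSketch, RoutedPayload, serialize, route, materialize) and the wire layout (a 4-byte version followed by a UTF-8 payload) are all hypothetical, chosen only to show the AM forwarding the producer's bytes untouched while the consumer task builds the final event object.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

// Hypothetical sketch, not the Tez API: every name and the wire format are assumptions.
final class DataMovementOffloadSketch {

    /** What the AM sends to a consumer: the producer's bytes plus routing indices. */
    static final class RoutedPayload {
        final byte[] serializedEvent; // opaque to the AM, never deserialized there
        final int sourceIndex;
        final int targetIndex;

        RoutedPayload(byte[] serializedEvent, int sourceIndex, int targetIndex) {
            this.serializedEvent = serializedEvent;
            this.sourceIndex = sourceIndex;
            this.targetIndex = targetIndex;
        }
    }

    /** The fully built event, created only inside the consumer task. */
    static final class DataMovementEvent {
        final int sourceIndex;
        final int targetIndex;
        final int version;
        final String userPayload;

        DataMovementEvent(int sourceIndex, int targetIndex, int version, String userPayload) {
            this.sourceIndex = sourceIndex;
            this.targetIndex = targetIndex;
            this.version = version;
            this.userPayload = userPayload;
        }
    }

    /** Producer task: serialize the event once (assumed layout: int version, then UTF-8 bytes). */
    static byte[] serialize(int version, String userPayload) {
        byte[] body = userPayload.getBytes(StandardCharsets.UTF_8);
        return ByteBuffer.allocate(Integer.BYTES + body.length).putInt(version).put(body).array();
    }

    /** AM: attach routing indices and forward the same byte array, with no serde and no new event. */
    static RoutedPayload route(byte[] serializedEvent, int sourceIndex, int targetIndex) {
        return new RoutedPayload(serializedEvent, sourceIndex, targetIndex);
    }

    /** Consumer task: deserialize the bytes and create the final event object locally. */
    static DataMovementEvent materialize(RoutedPayload routed) {
        ByteBuffer buf = ByteBuffer.wrap(routed.serializedEvent);
        int version = buf.getInt();
        byte[] body = new byte[buf.remaining()];
        buf.get(body);
        return new DataMovementEvent(routed.sourceIndex, routed.targetIndex, version,
                new String(body, StandardCharsets.UTF_8));
    }

    public static void main(String[] args) {
        byte[] wire = serialize(1, "example-payload");
        DataMovementEvent event = materialize(route(wire, 3, 7));
        System.out.println(event.sourceIndex + " -> " + event.targetIndex + ": " + event.userPayload);
    }
}
```

In this sketch the only per-consumer work left on the AM is wrapping the existing byte array with two integers; deserialization and event construction happen once per consumer task, which is where the proposal expects the CPU cost to move.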



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
