[ 
https://issues.apache.org/jira/browse/TEZ-3230?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15419567#comment-15419567
 ] 

Zhiyuan Yang edited comment on TEZ-3230 at 8/31/16 11:20 PM:
-------------------------------------------------------------

bq. CartesianProductEdgeManagerUnpartitioned#getNumDestinationConsumerTasks 
doesn't depend on sourceTaskIndex. So it could cache the value in 
initialization. Granted this isn't important given it is called only in the case 
of INPUT_READ_ERROR_EVENT.
Thanks for pointing this out! I've already added this to the new patch.
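
For illustration only, a minimal sketch of the caching idea, not the code in the patch. The class name, the constructor parameters, and the assumption that the consumer count equals the product of the other sources' task counts are all hypothetical:

```java
// Hypothetical stand-in, NOT the Tez class: shows caching a value that
// does not depend on the per-call argument.
class CachedConsumerCountSketch {
    private final int numDestinationConsumerTasks;

    // numTasksPerSource (assumed input): task count of each source vertex.
    // position (assumed input): index of this edge's source vertex.
    CachedConsumerCountSketch(int[] numTasksPerSource, int position) {
        int product = 1;
        for (int i = 0; i < numTasksPerSource.length; i++) {
            if (i != position) {
                product *= numTasksPerSource[i];
            }
        }
        // Computed once at initialization instead of on every call.
        this.numDestinationConsumerTasks = product;
    }

    // sourceTaskIndex is deliberately unused: the result is the same
    // for every source task, which is what makes caching safe.
    int getNumDestinationConsumerTasks(int sourceTaskIndex) {
        return numDestinationConsumerTasks;
    }
}
```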

bq. 
CartesianProductEdgeManagerPartitioned#routeCompositeDataMovementEventToDestination
 optimization. Instead of computing the partition from taskTaskId, we can store 
the destinationTaskIndex -> partition mapping. Then taskIdMapping becomes 
unnecessary.

This is a trade-off between CPU and memory. IMO, memory is a scarcer resource than 
CPU. Given that profiling didn’t show significant CPU overhead, I’ll keep the 
current implementation.
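
To make the trade-off concrete, a rough sketch under stated assumptions (the class and method names are hypothetical, and destination task indices are assumed to enumerate partition tuples in row-major order, which may not match the patch). Computing on demand costs a division and a modulo per call; the stored mapping costs one int per destination task:

```java
// Hypothetical sketch, NOT the Tez implementation.
class PartitionLookupSketch {
    private final int[] numPartitions; // partitions per source vertex (assumed)
    private final int position;        // this edge's source vertex (assumed)
    private final int stride;          // product of partition counts after position

    PartitionLookupSketch(int[] numPartitions, int position) {
        this.numPartitions = numPartitions.clone();
        this.position = position;
        int s = 1;
        for (int i = position + 1; i < numPartitions.length; i++) {
            s *= numPartitions[i];
        }
        this.stride = s;
    }

    // CPU-side option: derive the partition on every call, O(1) extra memory.
    int computePartition(int destinationTaskIndex) {
        return (destinationTaskIndex / stride) % numPartitions[position];
    }

    // Memory-side option: precompute destinationTaskIndex -> partition,
    // one array entry per destination task.
    int[] buildPartitionTable() {
        int total = 1;
        for (int n : numPartitions) {
            total *= n;
        }
        int[] table = new int[total];
        for (int d = 0; d < total; d++) {
            table[d] = computePartition(d);
        }
        return table;
    }
}
```

The table grows with the product of all partition counts, so it is the memory cost rather than the per-call arithmetic that scales with the size of the cartesian product.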


was (Author: aplusplus):
bq. CartesianProductEdgeManagerUnpartitioned#getNumDestinationConsumerTasks 
doesn't depend on sourceTaskIndex. So it could cache the value in 
initialization. Granted this isn't important given it is called only in the case 
of INPUT_READ_ERROR_EVENT.
Thanks for pointing this out! I've already added this to the new patch.

bq. 
CartesianProductEdgeManagerPartitioned#routeCompositeDataMovementEventToDestination
 optimization. Instead of computing the partition from taskTaskId, we can store 
the destinationTaskIndex -> partition mapping. Then taskIdMapping becomes 
unnecessary.

This is a trade-off between CPU and memory. IMO, memory is a scarcer resource than 
CPU. Given that profiling didn’t show significant CPU overhead, I’ll keep the 
current implementation.

bq. 
CartesianProductEdgeManagerPartitioned#routeInputSourceTaskFailedEventToDestination
 computes the partition and uses it to create EventRouteMetadata. It appears it 
isn't necessary to specify the sourceTaskOutputIndex; Edge doesn't use that.
I would say let’s stick to what’s specified in the API. Although this could improve 
performance, it relies on a detail of the system implementation that changes from 
time to time, so it’s not a good idea to depend on it.

> Implement vertex manager and edge manager of cartesian product edge
> -------------------------------------------------------------------
>
>                 Key: TEZ-3230
>                 URL: https://issues.apache.org/jira/browse/TEZ-3230
>             Project: Apache Tez
>          Issue Type: Sub-task
>            Reporter: Zhiyuan Yang
>            Assignee: Zhiyuan Yang
>         Attachments: TEZ-3230.1.patch, TEZ-3230.2.patch, TEZ-3230.3.patch, 
> TEZ-3230.4.patch, TEZ-3230.5.patch, TEZ-3230.6.patch, TEZ-3230.7.patch, 
> TEZ-3230.8.patch, TEZ-3230.9.patch, TEZ-3230.WIP.1.patch, 
> TEZ-3230.WIP.2.patch, TEZ-3230.WIP.3.patch
>
>




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
