[ 
https://issues.apache.org/jira/browse/TEZ-3274?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15375682#comment-15375682
 ] 

Siddharth Seth commented on TEZ-3274:
-------------------------------------

Haven't looked at this in some time. Is this being used with 
MRInputSplitDistributor, with the initial parallelism set on the specific 
vertex? I don't think using a Root Input along with a ShuffleInput on the same 
vertex will work with MRInputAMSplitGenerator, since parallelism is set up at 
runtime. Shuffle tasks will see a parallelism of -1 if the initialization 
takes time.

I believe we never really focused on this case, and if it showed up, it would 
need to be handled via a custom VertexManager. If such a manager were to 
exist, how would the data distribution be handled? There are different splits 
for the MRInput and partitions on the Shuffle side; how are they mapped?


> Vertex with MRInput and shuffle input does not respect slow start
> -----------------------------------------------------------------
>
>                 Key: TEZ-3274
>                 URL: https://issues.apache.org/jira/browse/TEZ-3274
>             Project: Apache Tez
>          Issue Type: Bug
>            Reporter: Jonathan Eagles
>
> Vertices with shuffle input and MRInput choose RootInputVertexManager (and 
> not ShuffleVertexManager) and start containers and tasks immediately. In this 
> scenario, resources can be wasted since they do not respect 
> tez.shuffle-vertex-manager.min-src-fraction and 
> tez.shuffle-vertex-manager.max-src-fraction. 
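
For readers unfamiliar with slow start, here is a minimal sketch of the idea 
behind the two fractions named in the issue: hold back downstream tasks until 
a minimum fraction of source tasks has completed, then release them gradually 
until the maximum fraction is reached. This is not the ShuffleVertexManager 
code. The class name, method name, linear ramp, rounding, and the example 
values 0.25 and 0.75 are all assumptions made for illustration, and the real 
implementation may differ in its details.

```java
// Hypothetical illustration only; not taken from the Tez code base.
public final class SlowStartSketch {
    private SlowStartSketch() {}

    /**
     * Returns how many of totalTasks may be scheduled, given how many source
     * tasks have completed. Assumes a simple linear ramp between minFraction
     * and maxFraction, rounded down.
     */
    static int tasksToSchedule(int completedSources, int totalSources,
                               int totalTasks,
                               double minFraction, double maxFraction) {
        if (totalSources <= 0) {
            // Nothing to wait on, so nothing is held back.
            return totalTasks;
        }
        double done = (double) completedSources / totalSources;
        if (done < minFraction) {
            // Below the minimum fraction: keep resources free.
            return 0;
        }
        if (done >= maxFraction || maxFraction <= minFraction) {
            // At or past the maximum fraction: release everything.
            return totalTasks;
        }
        // In between: release tasks in proportion to progress.
        double ramp = (done - minFraction) / (maxFraction - minFraction);
        return (int) Math.floor(ramp * totalTasks);
    }

    public static void main(String[] args) {
        // Example values only: 10 source tasks, 8 downstream tasks.
        System.out.println(tasksToSchedule(1, 10, 8, 0.25, 0.75));  // 0
        System.out.println(tasksToSchedule(5, 10, 8, 0.25, 0.75));  // 4
        System.out.println(tasksToSchedule(8, 10, 8, 0.25, 0.75));  // 8
    }
}
```

A vertex that starts all of its tasks immediately, as described in the issue, 
behaves as if this function always returned totalTasks.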



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)