[ 
https://issues.apache.org/jira/browse/TEZ-1396?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14091851#comment-14091851
 ] 

Bikas Saha commented on TEZ-1396:
---------------------------------

This is not always desirable. In a busy cluster, when a data set is hot, 
there are equally good reasons to spread different consumers around to 
avoid hot spots. This jira mainly helps cases where some active service is 
trying to cache data.

> Grouping should generate consistent groups when given the same set of Splits
> ----------------------------------------------------------------------------
>
>                 Key: TEZ-1396
>                 URL: https://issues.apache.org/jira/browse/TEZ-1396
>             Project: Apache Tez
>          Issue Type: Improvement
>            Reporter: Siddharth Seth
>            Assignee: Siddharth Seth
>
> Currently, it seems that Grouping can generate a different set of groups 
> on different invocations with the same set of splits and target tasks.
> The order is likely affected by the randomization in the block location 
> report from HDFS.
> Grouping should be consistent for better cache utilization.
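
To illustrate the kind of consistency asked for above, here is a minimal 
sketch in plain Java. It is not the Tez grouping code: the Split class, the 
group() method and the host names are hypothetical stand-ins. The idea is to 
sort the splits and each split's host list before grouping, so that the 
order in which HDFS reports block locations no longer affects the result. 
The host is chosen from a hash of the split identity rather than always 
taking the first host, so the choice is repeatable without sending every 
split to the same node.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

public class ConsistentGrouping {

    /** Hypothetical stand-in for an input split: path, start offset, replica hosts. */
    static final class Split {
        final String path;
        final long start;
        final String[] hosts;

        Split(String path, long start, String... hosts) {
            this.path = path;
            this.start = start;
            this.hosts = hosts;
        }
    }

    /**
     * Groups splits by one of their replica hosts. The result depends only on
     * the set of splits, not on the order of the list or of each host array.
     */
    static Map<String, List<String>> group(List<Split> splits) {
        // Sort a copy by (path, start) so the caller's list order is irrelevant.
        List<Split> ordered = new ArrayList<>(splits);
        ordered.sort(Comparator.comparing((Split s) -> s.path)
                .thenComparingLong(s -> s.start));

        // TreeMap keeps the groups themselves in a stable (sorted) order.
        Map<String, List<String>> groups = new TreeMap<>();
        for (Split s : ordered) {
            String id = s.path + ":" + s.start;

            // Sort a copy of the hosts to cancel out randomized location order.
            String[] hosts = s.hosts.clone();
            Arrays.sort(hosts);

            // String.hashCode() is specified by the JDK, so this choice is
            // repeatable across runs and JVMs. Splits with no known location
            // fall into a group keyed by the empty string.
            String host = hosts.length == 0
                    ? ""
                    : hosts[Math.floorMod(id.hashCode(), hosts.length)];

            groups.computeIfAbsent(host, h -> new ArrayList<>()).add(id);
        }
        return groups;
    }
}
```

Calling group() twice with the same splits, in any list order and with the 
hosts of each split in any order, returns equal maps. A real grouper would 
also have to respect the target number of tasks and group size limits, which 
this sketch ignores.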



