[ https://issues.apache.org/jira/browse/TEZ-1397?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14091855#comment-14091855 ]
Bikas Saha commented on TEZ-1397:
---------------------------------
In my experience, in a busy cluster with many jobs running concurrently, the
buffer cache is a big cause of performance variation. So again, as with TEZ-1396,
the desired behavior may actually be undesirable in some cases because it creates
hot spots in a busy cluster. This approach makes sense when a service is
actively managing a cache.
> Node affinity for tasks processing the same splits
> --------------------------------------------------
>
> Key: TEZ-1397
> URL: https://issues.apache.org/jira/browse/TEZ-1397
> Project: Apache Tez
> Issue Type: Improvement
> Reporter: Siddharth Seth
> Assignee: Siddharth Seth
>
> Within a session, if the same set of HDFS blocks is accessed by different
> tasks, those tasks should ideally be launched on the same node for better
> utilization of the buffer cache, etc.
> This will likely end up being another level of requests, higher up than
> NODE_LOCAL, for the scheduler.