[ https://issues.apache.org/jira/browse/HADOOP-12878?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15312643#comment-15312643 ]

Daniel Weeks commented on HADOOP-12878:
---------------------------------------

There are a couple of problems with this approach:

1) Impersonating hosts can actually delay task scheduling: the scheduler will 
try to place each task on the impersonated host for locality and hold off on 
placing it on other hosts.  Because there is no real locality, that delay is 
unnecessary, and it can have a significant impact if you have lots of tasks 
reading from S3 on a busy cluster.  Even if you pick a random node (or set of 
nodes), there's no guarantee they will have available capacity, so tasks can 
still incur an artificial delay.
2) This also skews locality metrics: some tasks will show up as node/rack 
local when they really aren't.

It might be better to return the address of the S3 endpoint so that the data 
appears to be off-cluster, at which point the scheduler will not be able to 
find any locality and will simply schedule on the first available node.  I'm 
not sure if you actually need to have a node in the cluster for the block 
location, though.
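
To make the suggestion concrete, here is a rough, dependency-free sketch of 
reporting the S3 endpoint as the host for every block.  It is not the S3A 
code: the class EndpointBlockLocations, its nested Location type (a minimal 
stand-in for Hadoop's BlockLocation) and the locate method are all 
hypothetical names, and it does not settle whether a block location has to 
name a node that is in the cluster.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical sketch only; not the actual S3A implementation. */
final class EndpointBlockLocations {

    /** Minimal stand-in for org.apache.hadoop.fs.BlockLocation (assumption). */
    static final class Location {
        final String host;
        final long offset;
        final long length;

        Location(String host, long offset, long length) {
            this.host = host;
            this.offset = offset;
            this.length = length;
        }
    }

    /**
     * Returns one location per block overlapping [start, start + len),
     * each naming the off-cluster endpoint host instead of a cluster node.
     */
    static List<Location> locate(String endpointHost, long start, long len,
                                 long fileLen, long blockSize) {
        if (blockSize <= 0) {
            throw new IllegalArgumentException("blockSize must be positive");
        }
        List<Location> out = new ArrayList<>();
        if (start < 0 || len <= 0 || start >= fileLen) {
            return out; // nothing to report for an empty or out-of-range request
        }
        // Clamp the end of the range to the file length without overflowing.
        long end = (len > fileLen - start) ? fileLen : start + len;
        // Align the first block to a block boundary.
        long offset = (start / blockSize) * blockSize;
        while (offset < end) {
            long length = Math.min(blockSize, fileLen - offset);
            out.add(new Location(endpointHost, offset, length));
            offset += blockSize;
        }
        return out;
    }
}
```

For example, EndpointBlockLocations.locate("s3.amazonaws.com", 0, 100, 100, 40) 
yields three locations covering the file, all naming the endpoint, so a 
scheduler that matches hosts against cluster nodes would find no match.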

> Impersonate hosts in s3a for better data locality handling
> ----------------------------------------------------------
>
>                 Key: HADOOP-12878
>                 URL: https://issues.apache.org/jira/browse/HADOOP-12878
>             Project: Hadoop Common
>          Issue Type: Sub-task
>          Components: fs/s3
>    Affects Versions: 2.8.0
>            Reporter: Thomas Demoor
>            Assignee: Thomas Demoor
>
> Currently, {{localhost}} is passed as locality for each block, causing all 
> blocks involved in a job to initially target the same node (RM), before being 
> moved by the scheduler (to a rack-local node). This reduces parallelism for 
> jobs (with short-lived mappers). 
> We should mimic Azure's implementation: a config setting 
> {{fs.s3a.block.location.impersonatedhost}} where the user can enter the list 
> of hostnames in the cluster to return to {{getFileBlockLocations}}. 
> Possible optimization: for larger systems, it might be better to return N 
> (5?) random hostnames to prevent passing a huge array (the downstream code 
> assumes size = O(3)).
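
For reference, the optimization in the quoted description (return N random 
hostnames from the configured list) might look roughly like the sketch 
below.  The class ImpersonatedHosts and its parse and pick methods are 
hypothetical, and the comma-separated format of the config value is an 
assumption; neither is taken from an actual patch.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Random;

/** Hypothetical sketch of the quoted proposal; not an actual patch. */
final class ImpersonatedHosts {

    /** Parses a comma-separated host list (assumed format), dropping blanks. */
    static List<String> parse(String configValue) {
        List<String> hosts = new ArrayList<>();
        if (configValue == null) {
            return hosts;
        }
        for (String part : configValue.split(",")) {
            String host = part.trim();
            if (!host.isEmpty()) {
                hosts.add(host);
            }
        }
        return hosts;
    }

    /** Returns up to n distinct hosts chosen at random from the list. */
    static List<String> pick(List<String> hosts, int n, Random rng) {
        List<String> copy = new ArrayList<>(hosts);
        Collections.shuffle(copy, rng); // shuffle a copy, leave the input alone
        int count = Math.min(Math.max(n, 0), copy.size());
        return new ArrayList<>(copy.subList(0, count));
    }
}
```

Capping the result at N keeps the returned array small, in line with the 
description's note that downstream code assumes a size of about 3.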



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
