[ https://issues.apache.org/jira/browse/HIVE-14060?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15340815#comment-15340815 ]
Sergey Shelukhin commented on HIVE-14060:
-----------------------------------------

Does it happen on filesystems other than Azure? The culprit there seems to be AZURE_BLOCK_LOCATION_HOST_DEFAULT in the FS. It may be Azure-specific...

> Hive: Remove bogus "localhost" from Hive splits
> -----------------------------------------------
>
>                 Key: HIVE-14060
>                 URL: https://issues.apache.org/jira/browse/HIVE-14060
>             Project: Hive
>          Issue Type: Bug
>          Components: Tez
>    Affects Versions: 2.1.0, 2.2.0
>            Reporter: Gopal V
>            Assignee: Gopal V
>         Attachments: HIVE-14060.1.patch
>
>
> On remote filesystems like Azure, GCP and S3, the splits contain a filler
> location of "localhost".
> This is worse than having no location information at all: on large clusters
> YARN waits up to 200 seconds [1] for a heartbeat from "localhost" before
> allocating a container.
> To speed up this process, the split affinity provider should scrub the bogus
> "localhost" from the locations and allow for the allocation of "*" containers
> instead on each heartbeat.
> [1] - yarn.scheduler.capacity.node-locality-delay=40 x heartbeat of 5s

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
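
For illustration only, the scrubbing proposed in the issue description could look roughly like the sketch below. It is not the code in HIVE-14060.1.patch: the class `LocationScrubber`, its methods, and the case-insensitive match on the literal "localhost" are all assumptions made for this example. An empty result is meant to stand for "no locality preference", so that the scheduler can hand out a "*" container. The second method just reproduces the arithmetic from footnote [1].

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of Hive or Tez.
final class LocationScrubber {
    // Filler host named in the issue description (assumed to be this literal).
    private static final String BOGUS_HOST = "localhost";

    // Returns the split locations with the filler host removed.
    // An empty array means the split carries no locality preference.
    static String[] scrub(String[] locations) {
        if (locations == null) {
            return new String[0];
        }
        List<String> kept = new ArrayList<>();
        for (String host : locations) {
            if (host != null && !BOGUS_HOST.equalsIgnoreCase(host)) {
                kept.add(host);
            }
        }
        return kept.toArray(new String[0]);
    }

    // Worst-case wait from footnote [1]: node-locality-delay
    // multiplied by the heartbeat interval in seconds.
    static int worstCaseDelaySeconds(int nodeLocalityDelay, int heartbeatSeconds) {
        return nodeLocalityDelay * heartbeatSeconds;
    }

    public static void main(String[] args) {
        String[] scrubbed = scrub(new String[] {"localhost", "worker-1"});
        System.out.println(String.join(",", scrubbed));   // prints "worker-1"
        System.out.println(worstCaseDelaySeconds(40, 5));  // prints "200"
    }
}
```

A split whose only location is "localhost" comes back with no locations at all, which is the case the description argues is better than a bogus host.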