[ https://issues.apache.org/jira/browse/HADOOP-9150?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Todd Lipcon updated HADOOP-9150:
--------------------------------
Attachment: tracing-resolver.tgz
            log.txt
To diagnose this, I wrote a wrapper implementation of the NameService SPI that
logs every resolution. Attached are the source for the tracing implementation
and a log I captured on a test cluster. The log shows a DNS lookup coming from
the path canonicalization code:
{code}
java.lang.Exception: looking up ha-nn-uri
at MyNameservice.lookupAllHostAddr(MyNameservice.java:11)
...
at org.apache.hadoop.security.SecurityUtil$StandardHostResolver.getByName(SecurityUtil.java:538)
at org.apache.hadoop.security.SecurityUtil.getByName(SecurityUtil.java:526)
at org.apache.hadoop.net.NetUtils.canonicalizeHost(NetUtils.java:283)
at org.apache.hadoop.net.NetUtils.getCanonicalUri(NetUtils.java:255)
at org.apache.hadoop.fs.FileSystem.getCanonicalUri(FileSystem.java:214)
at org.apache.hadoop.fs.FileSystem.checkPath(FileSystem.java:524)
at org.apache.hadoop.hdfs.DistributedFileSystem.getPathName(DistributedFileSystem.java:170)
at org.apache.hadoop.hdfs.DistributedFileSystem.listStatus(DistributedFileSystem.java:401)
...
{code}
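For readers without the attachment, the tracing idea can be sketched roughly as follows. This is not the attached source: HostResolver is a hypothetical stand-in for the NameService SPI, and TracingResolver and traces() are illustrative names. The sketch only shows the technique of building an exception that is never thrown in order to record which caller asked for each lookup.

```java
import java.net.InetAddress;
import java.net.UnknownHostException;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

/** Hypothetical stand-in for the resolver SPI; not the JDK's NameService. */
interface HostResolver {
    InetAddress[] lookupAllHostAddr(String host) throws UnknownHostException;
}

/** Wraps another resolver and records the call stack of every lookup. */
final class TracingResolver implements HostResolver {
    private final HostResolver delegate;
    private final List<Exception> traces =
            Collections.synchronizedList(new ArrayList<>());

    TracingResolver(HostResolver delegate) {
        this.delegate = delegate;
    }

    @Override
    public InetAddress[] lookupAllHostAddr(String host) throws UnknownHostException {
        // The exception is created but never thrown: it only captures the
        // stack of whoever triggered this resolution.
        Exception trace = new Exception("looking up " + host);
        traces.add(trace);
        trace.printStackTrace();
        // Forward to the real resolver so behavior is otherwise unchanged.
        return delegate.lookupAllHostAddr(host);
    }

    /** Stacks captured so far, in lookup order. */
    List<Exception> traces() {
        return traces;
    }
}
```

Wrapping the resolver this way leaves lookups unchanged and only adds the logged stack, which is how a trace like the one above points back at the canonicalization code.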
> Unnecessary DNS resolution attempts for logical URIs
> ----------------------------------------------------
>
> Key: HADOOP-9150
> URL: https://issues.apache.org/jira/browse/HADOOP-9150
> Project: Hadoop Common
> Issue Type: Bug
> Components: ha
> Affects Versions: 3.0.0, 2.0.2-alpha
> Reporter: Todd Lipcon
> Priority: Critical
> Attachments: log.txt, tracing-resolver.tgz
>
>
> In the FileSystem code, we accidentally try to DNS-resolve the logical name
> before it is converted to an actual domain name. In some DNS setups, this can
> cause a big slowdown: e.g., in one misconfigured cluster we saw a 2-3x drop in
> terasort throughput, because every task wasted a lot of time waiting for slow
> "not found" responses from DNS.