GitHub user wulei-bj-cn opened a pull request:
https://github.com/apache/spark/pull/8533
localHostName() is meant to return the hostname of each host. However, when
"SPARK_LOCAL_HOSTNAME" is not set, i.e. customHostname is null, it returns the
host's IP address instead, because it falls back to
localIpAddress.getHostAddress. The returned IP addresses (e.g. 1.2.3.4) do not
match the hostnames (e.g. host1) reported by DFS file systems, so the locality
level is always 'ANY' and reading input files from DFS file systems incurs a
lot of unnecessary network I/O. This change replaces
localIpAddress.getHostAddress with localIpAddress.getHostName, so that the
function returns a real hostname when "SPARK_LOCAL_HOSTNAME" is not set.
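
The mismatch described above can be reproduced with a small standalone
sketch. This is not the Spark source: the class and method names
(LocalHostNameSketch, hostNameBefore, hostNameAfter, exampleAddress) are
hypothetical, and the address is built with InetAddress.getByAddress so that
no DNS lookup is involved. Only getHostAddress and getHostName correspond to
the calls named in this pull request.

```java
import java.net.InetAddress;
import java.net.UnknownHostException;

// Hypothetical stand-in for the fallback logic described in this pull request.
class LocalHostNameSketch {

    // Behaviour before the change: fall back to the textual IP address.
    static String hostNameBefore(String customHostname, InetAddress localIpAddress) {
        return customHostname != null ? customHostname : localIpAddress.getHostAddress();
    }

    // Behaviour after the change: fall back to the hostname.
    static String hostNameAfter(String customHostname, InetAddress localIpAddress) {
        return customHostname != null ? customHostname : localIpAddress.getHostName();
    }

    // Example address pairing "host1" with 1.2.3.4; no name service is consulted.
    static InetAddress exampleAddress() {
        try {
            return InetAddress.getByAddress("host1", new byte[] {1, 2, 3, 4});
        } catch (UnknownHostException e) {
            // Only thrown for an address of illegal length, which 4 bytes is not.
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        InetAddress addr = exampleAddress();
        String dfsHost = "host1"; // hostname as a DFS would report it (assumed)

        // customHostname is null, i.e. SPARK_LOCAL_HOSTNAME is not set.
        System.out.println(hostNameBefore(null, addr).equals(dfsHost)); // prints false
        System.out.println(hostNameAfter(null, addr).equals(dfsHost));  // prints true
    }
}
```

One caveat for review: on a real host, InetAddress.getHostName may perform a
reverse lookup and returns the textual IP address if that lookup fails, so the
result still depends on the name service configuration of the cluster.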
You can merge this pull request into a Git repository by running:
$ git pull https://github.com/wulei-bj-cn/spark lei-branch
Alternatively you can review and apply these changes as the patch at:
https://github.com/apache/spark/pull/8533.patch
To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:
This closes #8533
----
commit 807524fdb89ac37bb84a73c09891cc1298f0ba84
Author: Lei Wu <[email protected]>
Date: 2015-08-31T07:02:34Z
localHostName() is meant to return the hostname of each host. However, when
"SPARK_LOCAL_HOSTNAME" is not set, i.e. customHostname is null, it returns the
host's IP address instead, because it falls back to
localIpAddress.getHostAddress. The returned IP addresses (e.g. 1.2.3.4) do not
match the hostnames (e.g. host1) reported by DFS file systems, so the locality
level is always 'ANY' and reading input files from DFS file systems incurs a
lot of unnecessary network I/O. This change replaces
localIpAddress.getHostAddress with localIpAddress.getHostName, so that the
function returns a real hostname when "SPARK_LOCAL_HOSTNAME" is not set.
----