Github user steveloughran commented on a diff in the pull request:
https://github.com/apache/spark/pull/8533#discussion_r48953313
--- Diff: core/src/main/scala/org/apache/spark/scheduler/TaskLocation.scala ---
@@ -35,16 +37,28 @@ case class ExecutorCacheTaskLocation(override val host: String, executorId: Stri
 /**
  * A location on a host.
+ * The host names will be resolved to IP addresses in case SPARK_LOCAL_HOSTNAME is unset
+ * to fix the task locality levels always being 'ANY' issue mentioned in SPARK-10149.
  */
 private [spark] case class HostTaskLocation(override val host: String) extends TaskLocation {
-  override def toString: String = host
+  override def toString: String = if ( TaskLocation.sparkLocalHostname == None ) {
+    InetAddress.getByName(host).getHostAddress
+  } else {
+    host
+  }
 }

 /**
  * A location on a host that is cached by HDFS.
+ * The host names will be resolved to IP addresses in case SPARK_LOCAL_HOSTNAME is unset
+ * to fix the task locality levels always being 'ANY' issue mentioned in SPARK-10149.
  */
 private [spark] case class HDFSCacheTaskLocation(override val host: String) extends TaskLocation {
-  override def toString: String = TaskLocation.inMemoryLocationTag + host
+  override def toString: String = TaskLocation.inMemoryLocationTag + InetAddress.getByName(host).getHostAddress
--- End diff --
better handle `InetAddress.getByName()` failure conditions here. It is
entirely possible for a host to not know its own name if configured badly
enough.
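
A minimal sketch of what that failure handling might look like, assuming a fallback to the raw host string is acceptable when resolution fails. `HostResolver` and `resolveOrFallback` are hypothetical names for illustration, not part of the PR:

    import java.net.{InetAddress, UnknownHostException}

    // Hypothetical helper: resolve a hostname to its IP address, but fall back
    // to the original host string if the lookup fails, e.g. on a badly
    // configured host that does not know its own name.
    object HostResolver {
      def resolveOrFallback(host: String): String = {
        try {
          InetAddress.getByName(host).getHostAddress
        } catch {
          case _: UnknownHostException =>
            // Resolution failed; keep the unresolved host name so toString
            // still produces a usable (if less precise) locality string.
            host
        }
      }
    }

With something along these lines, both `toString` overrides in the diff could call `resolveOrFallback(host)` instead of invoking `InetAddress.getByName(host).getHostAddress` directly.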