This is an automated email from the ASF dual-hosted git repository.
kunalkapoor pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/carbondata.git
The following commit(s) were added to refs/heads/master by this push:
new e85254d [CARBONDATA-3426] Fix load performance by fixing task distribution issue
e85254d is described below
commit e85254d84bc94cf371950698306672db538d970e
Author: ajantha-bhat <[email protected]>
AuthorDate: Mon Jun 10 15:52:31 2019 +0530
[CARBONDATA-3426] Fix load performance by fixing task distribution issue
Problem: Consider a 3-node cluster (host names a, b, c with IP addresses
IP1, IP2, IP3).
To launch a load task, NewCarbonDataLoadRDD needs host names in
getPreferredLocations().
But if the driver runs on 'a' (IP1), the result is IP1, b, c instead of
a, b, c. Hence no task was launched on the executor that has the same IP
as the driver.
Solution: Revert the change in getLocalhostIPs, as it is not used in any
other flow.
This closes #3276
---
.../src/main/scala/org/apache/spark/sql/hive/DistributionUtil.scala | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/integration/spark-common/src/main/scala/org/apache/spark/sql/hive/DistributionUtil.scala b/integration/spark-common/src/main/scala/org/apache/spark/sql/hive/DistributionUtil.scala
index 4256777..ca35adc 100644
--- a/integration/spark-common/src/main/scala/org/apache/spark/sql/hive/DistributionUtil.scala
+++ b/integration/spark-common/src/main/scala/org/apache/spark/sql/hive/DistributionUtil.scala
@@ -103,7 +103,7 @@ object DistributionUtil {
while (iface.hasMoreElements) {
addresses = iface.nextElement().getInterfaceAddresses.asScala.toList ++ addresses
}
- val inets = addresses.map(_.getAddress.getHostName)
+ val inets = addresses.map(_.getAddress.getHostAddress)
inets
}
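
The task distribution problem described in the commit message can be illustrated with a minimal sketch, assuming that preferred locations are compared with executor hosts by exact string equality. The names `LocalityMatchSketch` and `matchingExecutors` are hypothetical and exist only for this illustration; they are not part of CarbonData or Spark. The hosts "a", "b", "c" and the placeholder "IP1" are taken from the commit message.

```scala
import java.net.InetAddress

object LocalityMatchSketch {
  // Hypothetical helper: returns the executor hosts that appear in the
  // preferred-location list, using exact string comparison (an assumption
  // about how locality matching behaves).
  def matchingExecutors(preferred: Seq[String], executorHosts: Seq[String]): Seq[String] =
    executorHosts.filter(host => preferred.contains(host))

  def main(args: Array[String]): Unit = {
    val executorHosts = Seq("a", "b", "c")

    // Mixed list from the problem statement: the driver node shows up as an IP.
    val mixed = Seq("IP1", "b", "c")
    // Consistent list: every node is identified by host name.
    val consistent = Seq("a", "b", "c")

    println(matchingExecutors(mixed, executorHosts))      // List(b, c)
    println(matchingExecutors(consistent, executorHosts)) // List(a, b, c)

    // The two java.net.InetAddress accessors touched by the diff return
    // different kinds of strings for the same address; the exact values
    // depend on the machine, so none are asserted here.
    val loopback = InetAddress.getLoopbackAddress
    println(s"getHostName:    ${loopback.getHostName}")
    println(s"getHostAddress: ${loopback.getHostAddress}")
  }
}
```

With the mixed list, host "a" never matches, which corresponds to the executor on the driver's node receiving no load task.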