This is an automated email from the ASF dual-hosted git repository.
gurwls223 pushed a commit to branch branch-3.0
in repository https://gitbox.apache.org/repos/asf/spark.git
The following commit(s) were added to refs/heads/branch-3.0 by this push:
new cc78282 [SPARK-34154][YARN][FOLLOWUP] Fix flaky LocalityPlacementStrategySuite test
cc78282 is described below
commit cc782829b7e054a8750912d3a96cf034a7ba081a
Author: “attilapiros” <[email protected]>
AuthorDate: Fri Jan 29 23:54:40 2021 +0900
[SPARK-34154][YARN][FOLLOWUP] Fix flaky LocalityPlacementStrategySuite test
### What changes were proposed in this pull request?
Fixing the flaky test `handle large number of containers and tasks
(SPARK-18750)` by avoiding the use of `DNSToSwitchMapping`, as in some
situations DNS lookups can be extremely slow.
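For context, the suite uses the test-only `MockResolver`, whose `Seq`-based `resolve` previously fell through to the real `SparkRackResolver` and its `CachedDNSToSwitchMapping`, i.e. to real DNS. A minimal sketch of the idea, mirroring the diff further below (the class name here is hypothetical; it is kept in the same test package as the suite):
```scala
package org.apache.spark.deploy.yarn

import org.apache.hadoop.net.{Node, NodeBase}

import org.apache.spark.deploy.SparkHadoopUtil

// Test-only resolver: every rack query is answered from a fixed mapping, so
// neither overload ever reaches DNSToSwitchMapping (and therefore real DNS).
class NoDnsMockResolver extends SparkRackResolver(SparkHadoopUtil.get.conf) {
  override def resolve(hostName: String): String =
    if (hostName == "host3") "/rack2" else "/rack1"

  // Without this override the Seq variant falls through to the real
  // implementation: coreResolve -> CachedDNSToSwitchMapping -> DNS lookup.
  override def resolve(hostNames: Seq[String]): Seq[Node] =
    hostNames.map(n => new NodeBase(n, resolve(n)))
}
```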
### Why are the changes needed?
After https://github.com/apache/spark/pull/31363 was merged, the flaky
`handle large number of containers and tasks (SPARK-18750)` test failed again
in some other PRs, but now we have the exact place where the test gets stuck.
It is in the DNS lookup:
```
[info] - handle large number of containers and tasks (SPARK-18750) *** FAILED *** (30 seconds, 4 milliseconds)
[info]   Failed with an exception or a timeout at thread join:
[info]
[info]   java.lang.RuntimeException: Timeout at waiting for thread to stop (its stack trace is added to the exception)
[info]   at java.net.Inet6AddressImpl.lookupAllHostAddr(Native Method)
[info]   at java.net.InetAddress$2.lookupAllHostAddr(InetAddress.java:929)
[info]   at java.net.InetAddress.getAddressesFromNameService(InetAddress.java:1324)
[info]   at java.net.InetAddress.getAllByName0(InetAddress.java:1277)
[info]   at java.net.InetAddress.getAllByName(InetAddress.java:1193)
[info]   at java.net.InetAddress.getAllByName(InetAddress.java:1127)
[info]   at java.net.InetAddress.getByName(InetAddress.java:1077)
[info]   at org.apache.hadoop.net.NetUtils.normalizeHostName(NetUtils.java:568)
[info]   at org.apache.hadoop.net.NetUtils.normalizeHostNames(NetUtils.java:585)
[info]   at org.apache.hadoop.net.CachedDNSToSwitchMapping.resolve(CachedDNSToSwitchMapping.java:109)
[info]   at org.apache.spark.deploy.yarn.SparkRackResolver.coreResolve(SparkRackResolver.scala:75)
[info]   at org.apache.spark.deploy.yarn.SparkRackResolver.resolve(SparkRackResolver.scala:66)
[info]   at org.apache.spark.deploy.yarn.LocalityPreferredContainerPlacementStrategy.$anonfun$localityOfRequestedContainers$3(LocalityPreferredContainerPlacementStrategy.scala:142)
[info]   at org.apache.spark.deploy.yarn.LocalityPreferredContainerPlacementStrategy$$Lambda$658/1080992036.apply$mcVI$sp(Unknown Source)
[info]   at scala.collection.immutable.Range.foreach$mVc$sp(Range.scala:158)
[info]   at org.apache.spark.deploy.yarn.LocalityPreferredContainerPlacementStrategy.localityOfRequestedContainers(LocalityPreferredContainerPlacementStrategy.scala:138)
[info]   at org.apache.spark.deploy.yarn.LocalityPlacementStrategySuite.org$apache$spark$deploy$yarn$LocalityPlacementStrategySuite$$runTest(LocalityPlacementStrategySuite.scala:94)
[info]   at org.apache.spark.deploy.yarn.LocalityPlacementStrategySuite$$anon$1.run(LocalityPlacementStrategySuite.scala:40)
[info]   at java.lang.Thread.run(Thread.java:748) (LocalityPlacementStrategySuite.scala:61)
...
```
This could be because the DNS servers used by those build machines are not
configured to handle IPv6 queries, so the client has to wait for the IPv6
query to time out before falling back to IPv4.
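A rough way to check whether name resolution itself is the bottleneck on a given machine (purely illustrative, not part of this patch; the hostname is hypothetical) is to time the same `InetAddress.getByName` call that appears in the stack trace:
```scala
import java.net.{InetAddress, UnknownHostException}

object DnsLookupTiming {
  def main(args: Array[String]): Unit = {
    val host = args.headOption.getOrElse("host1")  // any hostname not in the local cache
    val start = System.nanoTime()
    try {
      InetAddress.getByName(host)  // the call the test thread was stuck in
    } catch {
      case _: UnknownHostException =>  // a failed lookup still shows the delay
    }
    println(s"Lookup of '$host' took ${(System.nanoTime() - start) / 1e6} ms")
  }
}
```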
This even makes the tests more consistent: previously, looking up a single
host via `resolve(hostName: String)` gave a different answer from calling
`resolve(hostNames: Seq[String])` with a `Seq` containing that single host.
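For illustration, with the updated `MockResolver` (shown in the diff below) both overloads now answer from the same fixed mapping. A hypothetical check, not a test added by this patch:
```scala
package org.apache.spark.deploy.yarn

import org.apache.hadoop.net.Node

object ResolverConsistencyCheck {
  def main(args: Array[String]): Unit = {
    val resolver = new MockResolver()  // the test resolver from YarnAllocatorSuite.scala
    val viaSingle: String = resolver.resolve("host3")        // "/rack2"
    val viaSeq: Seq[Node] = resolver.resolve(Seq("host3"))
    // Both code paths now agree on the rack for the same host.
    assert(viaSingle == viaSeq.head.getNetworkLocation)
    println(s"both overloads resolve host3 to $viaSingle")
  }
}
```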
### Does this PR introduce _any_ user-facing change?
No.
### How was this patch tested?
Unit tests.
Closes #31397 from attilapiros/SPARK-34154-2nd.
Authored-by: “attilapiros” <[email protected]>
Signed-off-by: HyukjinKwon <[email protected]>
(cherry picked from commit d3f049cbc274ee64bb9b56d6addba4f2cb8f1f0a)
Signed-off-by: HyukjinKwon <[email protected]>
---
.../test/scala/org/apache/spark/deploy/yarn/YarnAllocatorSuite.scala | 4 ++++
1 file changed, 4 insertions(+)
diff --git a/resource-managers/yarn/src/test/scala/org/apache/spark/deploy/yarn/YarnAllocatorSuite.scala b/resource-managers/yarn/src/test/scala/org/apache/spark/deploy/yarn/YarnAllocatorSuite.scala
index 6216d47..0c40c98 100644
--- a/resource-managers/yarn/src/test/scala/org/apache/spark/deploy/yarn/YarnAllocatorSuite.scala
+++ b/resource-managers/yarn/src/test/scala/org/apache/spark/deploy/yarn/YarnAllocatorSuite.scala
@@ -21,6 +21,7 @@ import java.util.Collections
import scala.collection.JavaConverters._
+import org.apache.hadoop.net.{Node, NodeBase}
import org.apache.hadoop.yarn.api.records._
import org.apache.hadoop.yarn.client.api.AMRMClient
import org.apache.hadoop.yarn.client.api.AMRMClient.ContainerRequest
@@ -47,6 +48,9 @@ class MockResolver extends SparkRackResolver(SparkHadoopUtil.get.conf) {
if (hostName == "host3") "/rack2" else "/rack1"
}
+  override def resolve(hostNames: Seq[String]): Seq[Node] =
+    hostNames.map(n => new NodeBase(n, resolve(n)))
+
}
class YarnAllocatorSuite extends SparkFunSuite with Matchers with BeforeAndAfterEach {
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]