This is an automated email from the ASF dual-hosted git repository.
gurwls223 pushed a commit to branch branch-3.1
in repository https://gitbox.apache.org/repos/asf/spark.git
The following commit(s) were added to refs/heads/branch-3.1 by this push:
new d4ca766 [SPARK-34154][YARN][FOLLOWUP] Fix flaky LocalityPlacementStrategySuite test
d4ca766 is described below
commit d4ca76627ea3c72d240b20cd771f2a55cf318ce4
Author: “attilapiros” <[email protected]>
AuthorDate: Fri Jan 29 23:54:40 2021 +0900
[SPARK-34154][YARN][FOLLOWUP] Fix flaky LocalityPlacementStrategySuite test
### What changes were proposed in this pull request?
Fixing the flaky `handle large number of containers and tasks (SPARK-18750)` test by avoiding the use of `DNSToSwitchMapping`, as in some situations DNS lookups can be extremely slow.
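As an illustration of the approach (the actual change is the small override shown in the diff below), here is a minimal, self-contained sketch; the class name is hypothetical and not part of the patch:
```scala
import org.apache.hadoop.net.{Node, NodeBase}

// Hypothetical sketch (not the actual patch): resolve racks from an in-memory
// rule so that no DNSToSwitchMapping, and therefore no DNS lookup, is involved.
class InMemoryRackResolver {
  def resolve(hostName: String): String =
    if (hostName == "host3") "/rack2" else "/rack1"

  // The batch variant reuses the single-host rule instead of going to DNS.
  def resolve(hostNames: Seq[String]): Seq[Node] =
    hostNames.map(n => new NodeBase(n, resolve(n)))
}
```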
### Why are the changes needed?
After https://github.com/apache/spark/pull/31363 was merged, the flaky `handle large number of containers and tasks (SPARK-18750)` test failed again in some other PRs, but now we have the exact place where the test gets stuck. It is in the DNS lookup:
```
[info] - handle large number of containers and tasks (SPARK-18750) *** FAILED *** (30 seconds, 4 milliseconds)
[info] Failed with an exception or a timeout at thread join:
[info]
[info] java.lang.RuntimeException: Timeout at waiting for thread to stop (its stack trace is added to the exception)
[info] at java.net.Inet6AddressImpl.lookupAllHostAddr(Native Method)
[info] at java.net.InetAddress$2.lookupAllHostAddr(InetAddress.java:929)
[info] at java.net.InetAddress.getAddressesFromNameService(InetAddress.java:1324)
[info] at java.net.InetAddress.getAllByName0(InetAddress.java:1277)
[info] at java.net.InetAddress.getAllByName(InetAddress.java:1193)
[info] at java.net.InetAddress.getAllByName(InetAddress.java:1127)
[info] at java.net.InetAddress.getByName(InetAddress.java:1077)
[info] at org.apache.hadoop.net.NetUtils.normalizeHostName(NetUtils.java:568)
[info] at org.apache.hadoop.net.NetUtils.normalizeHostNames(NetUtils.java:585)
[info] at org.apache.hadoop.net.CachedDNSToSwitchMapping.resolve(CachedDNSToSwitchMapping.java:109)
[info] at org.apache.spark.deploy.yarn.SparkRackResolver.coreResolve(SparkRackResolver.scala:75)
[info] at org.apache.spark.deploy.yarn.SparkRackResolver.resolve(SparkRackResolver.scala:66)
[info] at org.apache.spark.deploy.yarn.LocalityPreferredContainerPlacementStrategy.$anonfun$localityOfRequestedContainers$3(LocalityPreferredContainerPlacementStrategy.scala:142)
[info] at org.apache.spark.deploy.yarn.LocalityPreferredContainerPlacementStrategy$$Lambda$658/1080992036.apply$mcVI$sp(Unknown Source)
[info] at scala.collection.immutable.Range.foreach$mVc$sp(Range.scala:158)
[info] at org.apache.spark.deploy.yarn.LocalityPreferredContainerPlacementStrategy.localityOfRequestedContainers(LocalityPreferredContainerPlacementStrategy.scala:138)
[info] at org.apache.spark.deploy.yarn.LocalityPlacementStrategySuite.org$apache$spark$deploy$yarn$LocalityPlacementStrategySuite$$runTest(LocalityPlacementStrategySuite.scala:94)
[info] at org.apache.spark.deploy.yarn.LocalityPlacementStrategySuite$$anon$1.run(LocalityPlacementStrategySuite.scala:40)
[info] at java.lang.Thread.run(Thread.java:748) (LocalityPlacementStrategySuite.scala:61)
...
```
This could be because the DNS servers used by those build machines are not configured to handle IPv6 queries, and the client has to wait for the IPv6 query to time out before falling back to IPv4.
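One way to sanity-check this hypothesis on a build machine is to time the call at the top of the stack trace, `InetAddress.getAllByName`. This diagnostic sketch is not part of the patch, and the probed hostname is arbitrary:
```scala
import java.net.{InetAddress, UnknownHostException}

object DnsLookupTiming {
  def main(args: Array[String]): Unit = {
    val host = if (args.nonEmpty) args(0) else "host1" // any hostname to probe
    val start = System.nanoTime()
    try InetAddress.getAllByName(host)
    catch { case _: UnknownHostException => () } // only the elapsed time matters here
    val elapsedMs = (System.nanoTime() - start) / 1000000
    println(s"getAllByName($host) took $elapsedMs ms")
  }
}
```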
The change even makes the tests more consistent: looking up a single host via `resolve(hostName: String)` used to give a different answer from calling `resolve(hostNames: Seq[String])` with a `Seq` containing that single host.
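With the override added here, both paths agree; a hedged usage sketch, assuming the `MockResolver` from `YarnAllocatorSuite` (see the diff below) is in scope:
```scala
// Assumes the MockResolver defined in YarnAllocatorSuite (see diff below).
val resolver = new MockResolver()
// The single-host path and the batch path now resolve to the same rack.
assert(resolver.resolve("host3") == "/rack2")
assert(resolver.resolve(Seq("host3")).head.getNetworkLocation == "/rack2")
```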
### Does this PR introduce _any_ user-facing change?
No.
### How was this patch tested?
Unit tests.
Closes #31397 from attilapiros/SPARK-34154-2nd.
Authored-by: “attilapiros” <[email protected]>
Signed-off-by: HyukjinKwon <[email protected]>
(cherry picked from commit d3f049cbc274ee64bb9b56d6addba4f2cb8f1f0a)
Signed-off-by: HyukjinKwon <[email protected]>
---
.../test/scala/org/apache/spark/deploy/yarn/YarnAllocatorSuite.scala | 4 ++++
1 file changed, 4 insertions(+)
diff --git a/resource-managers/yarn/src/test/scala/org/apache/spark/deploy/yarn/YarnAllocatorSuite.scala b/resource-managers/yarn/src/test/scala/org/apache/spark/deploy/yarn/YarnAllocatorSuite.scala
index 825bdd9..9a7eed6 100644
--- a/resource-managers/yarn/src/test/scala/org/apache/spark/deploy/yarn/YarnAllocatorSuite.scala
+++ b/resource-managers/yarn/src/test/scala/org/apache/spark/deploy/yarn/YarnAllocatorSuite.scala
@@ -22,6 +22,7 @@ import java.util.Collections
import scala.collection.JavaConverters._
import scala.collection.mutable
+import org.apache.hadoop.net.{Node, NodeBase}
import org.apache.hadoop.yarn.api.records._
import org.apache.hadoop.yarn.client.api.AMRMClient
import org.apache.hadoop.yarn.client.api.AMRMClient.ContainerRequest
@@ -50,6 +51,9 @@ class MockResolver extends SparkRackResolver(SparkHadoopUtil.get.conf) {
if (hostName == "host3") "/rack2" else "/rack1"
}
+ override def resolve(hostNames: Seq[String]): Seq[Node] =
+ hostNames.map(n => new NodeBase(n, resolve(n)))
+
}
class YarnAllocatorSuite extends SparkFunSuite with Matchers with BeforeAndAfterEach {
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]