[GitHub] [spark] HyukjinKwon commented on a change in pull request #28442: [SPARK-31631][TESTS] Fix test flakiness caused by MiniKdc which throws 'address in use' BindException with retry

2020-05-06 Thread GitBox


HyukjinKwon commented on a change in pull request #28442:
URL: https://github.com/apache/spark/pull/28442#discussion_r421174937



##
File path: external/kafka-0-10-sql/src/test/scala/org/apache/spark/sql/kafka010/KafkaTestUtils.scala
##
@@ -131,11 +130,7 @@ class KafkaTestUtils(
   }
 
   private def setUpMiniKdc(): Unit = {
-    val kdcDir = Utils.createTempDir()

Review comment:
   Let's add a helper once we need to duplicate this one more time. Having it 
in two places should be fine with or without a helper :-).





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] [spark] HyukjinKwon commented on a change in pull request #28442: [SPARK-31631][TESTS] Fix test flakiness caused by MiniKdc which throws 'address in use' BindException with retry

2020-05-05 Thread GitBox


HyukjinKwon commented on a change in pull request #28442:
URL: https://github.com/apache/spark/pull/28442#discussion_r420536516



##
File path: external/kafka-0-10-sql/src/test/scala/org/apache/spark/sql/kafka010/KafkaTestUtils.scala
##
@@ -131,11 +130,7 @@ class KafkaTestUtils(
   }
 
   private def setUpMiniKdc(): Unit = {
-    val kdcDir = Utils.createTempDir()

Review comment:
   Yeah, can we just use `eventually` instead of adding another class? This 
is covered in the style guide; see also 
https://github.com/databricks/scala-style-guide#misc_well_tested_method








[GitHub] [spark] HyukjinKwon commented on a change in pull request #28442: [SPARK-31631][TESTS] Fix test flakiness caused by MiniKdc which throws 'address in use' BindException with retry

2020-05-04 Thread GitBox


HyukjinKwon commented on a change in pull request #28442:
URL: https://github.com/apache/spark/pull/28442#discussion_r419350862



##
File path: external/kafka-0-10-sql/src/test/scala/org/apache/spark/sql/kafka010/KafkaTestUtils.scala
##
@@ -131,11 +130,7 @@ class KafkaTestUtils(
   }
 
   private def setUpMiniKdc(): Unit = {
-    val kdcDir = Utils.createTempDir()

Review comment:
   Can we simply use `eventually`? e.g.:
   
   ```scala
   eventually(timeout(10.seconds), interval(1.seconds)) {
     ...
   }
   ```
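
To make the suggestion concrete, here is a minimal sketch of retrying a flaky start-up step with ScalaTest's `eventually`. It assumes ScalaTest is on the classpath; `EventuallyRetrySketch` and `flakyStart` are hypothetical names, and the simulated `BindException` stands in for the MiniKdc start-up in the PR rather than reproducing it.

```scala
import java.net.{BindException, ServerSocket}

import org.scalatest.concurrent.Eventually._
import org.scalatest.time.SpanSugar._

object EventuallyRetrySketch {
  private var attempts = 0

  // Hypothetical stand-in for "create the service and start it":
  // the first two calls fail as if the port were already taken.
  private def flakyStart(): ServerSocket = {
    attempts += 1
    if (attempts < 3) throw new BindException("Address already in use (simulated)")
    new ServerSocket(0)
  }

  // Retries flakyStart() until it stops throwing or the timeout expires,
  // then returns how many attempts were needed.
  def startWithRetry(): Int = {
    val socket = eventually(timeout(10.seconds), interval(100.milliseconds)) {
      flakyStart()
    }
    socket.close()
    attempts
  }

  def main(args: Array[String]): Unit =
    println(s"started after ${startWithRetry()} attempts")
}
```

`eventually` re-runs the block whenever it throws and fails with a timeout exception once the `timeout` budget is spent, so the retry policy lives in one call instead of a hand-written loop.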
   








[GitHub] [spark] HyukjinKwon commented on a change in pull request #28442: [SPARK-31631][TESTS] Fix test flakiness caused by MiniKdc which throws 'address in use' BindException with retry

2020-05-04 Thread GitBox


HyukjinKwon commented on a change in pull request #28442:
URL: https://github.com/apache/spark/pull/28442#discussion_r419331470



##
File path: core/src/test/scala/org/apache/spark/MiniKDCHelper.scala
##
@@ -0,0 +1,74 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *    http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.spark
+
+import java.io.File
+import java.net.BindException
+
+import scala.util.control.NonFatal
+
+import org.apache.hadoop.minikdc.MiniKdc
+
+import org.apache.spark.internal.Logging
+import org.apache.spark.util.Utils
+
+trait MiniKDCHelper extends Logging {
+
+  def startMiniKdc(): MiniKdc = {
+    var kdc: MiniKdc = null
+    val kdcConf = MiniKdc.createConf()
+    kdcConf.setProperty(MiniKdc.DEBUG, "true")
+    var bindException = false
+    var kdcDir: File = null
+    // The port for MiniKdc service gets selected in the constructor, but will be bound
+    // to it later in MiniKdc.start() -> MiniKdc.initKDCServer() -> KdcServer.start().
+    // In meantime, when some other service might capture the port during this progress, and
+    // cause BindException.
+    // This makes our tests which have dedicated JVMs and rely on MiniKDC being flaky
+    //
+    // https://issues.apache.org/jira/browse/HADOOP-12656 get fixed in Hadoop 2.8.0.
+    // The workaround here is to capture the exception and retry, since we are using Hadoop 2.7.4
+    // as default.
+    // https://issues.apache.org/jira/browse/SPARK-31631
+    var numRetries = 1
+    do {
+      try {
+        bindException = false
+        kdcDir = Utils.createTempDir()
+        kdc = new MiniKdc(kdcConf, kdcDir)
+        kdc.start()

Review comment:
   Okay, `KafkaDelegationTokenSuite` runs in a separate JVM, so the static 
lock approach won't work.








[GitHub] [spark] HyukjinKwon commented on a change in pull request #28442: [SPARK-31631][TESTS] Fix test flakiness caused by MiniKdc which throws 'address in use' BindException with retry

2020-05-04 Thread GitBox


HyukjinKwon commented on a change in pull request #28442:
URL: https://github.com/apache/spark/pull/28442#discussion_r419327465



##
File path: core/src/test/scala/org/apache/spark/MiniKDCHelper.scala
##
@@ -0,0 +1,74 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *    http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.spark
+
+import java.io.File
+import java.net.BindException
+
+import scala.util.control.NonFatal
+
+import org.apache.hadoop.minikdc.MiniKdc
+
+import org.apache.spark.internal.Logging
+import org.apache.spark.util.Utils
+
+trait MiniKDCHelper extends Logging {
+
+  def startMiniKdc(): MiniKdc = {
+    var kdc: MiniKdc = null
+    val kdcConf = MiniKdc.createConf()
+    kdcConf.setProperty(MiniKdc.DEBUG, "true")
+    var bindException = false
+    var kdcDir: File = null
+    // The port for MiniKdc service gets selected in the constructor, but will be bound
+    // to it later in MiniKdc.start() -> MiniKdc.initKDCServer() -> KdcServer.start().
+    // In meantime, when some other service might capture the port during this progress, and
+    // cause BindException.
+    // This makes our tests which have dedicated JVMs and rely on MiniKDC being flaky
+    //
+    // https://issues.apache.org/jira/browse/HADOOP-12656 get fixed in Hadoop 2.8.0.
+    // The workaround here is to capture the exception and retry, since we are using Hadoop 2.7.4
+    // as default.
+    // https://issues.apache.org/jira/browse/SPARK-31631
+    var numRetries = 1
+    do {
+      try {
+        bindException = false
+        kdcDir = Utils.createTempDir()
+        kdc = new MiniKdc(kdcConf, kdcDir)
+        kdc.start()

Review comment:
   Can't we just use a global static lock around these two lines? The problem 
seems to be that the constructor allocates the new port, but the port is only 
actually bound later, in `start()`. Another instance can start in between, 
which ends up with "address in use".
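
The gap between choosing a port and binding it can be sketched with plain sockets. This is an illustration of the race only, not MiniKdc's code; `PortRaceSketch`, `pickFreePort` and `lateBindFails` are hypothetical names.

```scala
import java.net.{BindException, ServerSocket}

object PortRaceSketch {
  // Ask the OS for a free port and release it immediately, as a stand-in for
  // "a port is selected now but only bound later".
  def pickFreePort(): Int = {
    val probe = new ServerSocket(0)
    try probe.getLocalPort finally probe.close()
  }

  // Returns true if the late bind fails because another listener took the port
  // in the window between selection and binding.
  def lateBindFails(): Boolean = {
    val port = pickFreePort()
    // Another service grabs the selected port during the gap.
    val intruder = new ServerSocket(port)
    try {
      // The original owner now tries to bind the port it picked earlier.
      val late = new ServerSocket(port)
      late.close()
      false
    } catch {
      case _: BindException => true
    } finally {
      intruder.close()
    }
  }

  def main(args: Array[String]): Unit =
    println(s"late bind failed: ${lateBindFails()}")
}
```

Any fix has to either close that window or tolerate the failure, which is why the thread weighs a lock against a retry.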
   
   




