yaooqinn opened a new pull request #28442:
URL: https://github.com/apache/spark/pull/28442


   
   ### What changes were proposed in this pull request?
   The `Kafka*Suite`s are flaky because of a known issue in Hadoop MiniKdc: 
https://issues.apache.org/jira/browse/HADOOP-12656
   > Looking at MiniKdc implementation, if port is 0, the constructor use 
ServerSocket to find an unused port, assign the port number to the member 
variable port and close the ServerSocket object; later, in initKDCServer(), 
instantiate a TcpTransport object and bind at that port.
   
   > It appears that the port may be used in between, and then throw the 
exception.
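
To make that window concrete, here is a small Java sketch of the pick-then-bind pattern the quote describes. It is not MiniKdc's code: `PortRaceDemo`, `pickFreePort`, and `bindFailsAfterPortIsStolen` are hypothetical names, and the "intruder" socket only stands in for whatever unrelated service happens to grab the port in between.

```java
import java.io.IOException;
import java.net.BindException;
import java.net.ServerSocket;

// Illustrative only: mimics the pick-then-bind pattern, not MiniKdc's actual implementation.
class PortRaceDemo {

    // Step 1: ask the OS for a free port, remember the number, then release the socket.
    static int pickFreePort() throws IOException {
        try (ServerSocket probe = new ServerSocket(0)) {
            return probe.getLocalPort();
        }
    }

    // Returns true if the delayed bind fails because the port was taken
    // between pickFreePort() and the second bind.
    static boolean bindFailsAfterPortIsStolen() throws IOException {
        int port = pickFreePort();
        // Simulate an unrelated service taking the port inside the window.
        try (ServerSocket intruder = new ServerSocket(port)) {
            // Step 2: the delayed bind on the remembered port number.
            try (ServerSocket late = new ServerSocket(port)) {
                return false;
            } catch (BindException e) {
                return true; // typically "Address already in use"
            }
        }
    }

    public static void main(String[] args) throws IOException {
        System.out.println("late bind failed: " + bindFailsAfterPortIsStolen());
    }
}
```

The point of the sketch is that the remembered port number carries no reservation: once the probing socket is closed, nothing stops another process from binding the same port first.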
   
   Several test failures are suspected to be related, for example 
https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/122225/testReport/org.apache.spark.sql.kafka010/KafkaDelegationTokenSuite/_It_is_not_a_test_it_is_a_sbt_testing_SuiteSelector_/
   
   ```scala
   [info] org.apache.spark.sql.kafka010.KafkaDelegationTokenSuite *** ABORTED *** (15 seconds, 426 milliseconds)
   [info]   java.net.BindException: Address already in use
   [info]   at sun.nio.ch.Net.bind0(Native Method)
   [info]   at sun.nio.ch.Net.bind(Net.java:433)
   [info]   at sun.nio.ch.Net.bind(Net.java:425)
   [info]   at sun.nio.ch.ServerSocketChannelImpl.bind(ServerSocketChannelImpl.java:223)
   [info]   at sun.nio.ch.ServerSocketAdaptor.bind(ServerSocketAdaptor.java:74)
   [info]   at org.apache.mina.transport.socket.nio.NioSocketAcceptor.open(NioSocketAcceptor.java:198)
   [info]   at org.apache.mina.transport.socket.nio.NioSocketAcceptor.open(NioSocketAcceptor.java:51)
   [info]   at org.apache.mina.core.polling.AbstractPollingIoAcceptor.registerHandles(AbstractPollingIoAcceptor.java:547)
   [info]   at org.apache.mina.core.polling.AbstractPollingIoAcceptor.access$400(AbstractPollingIoAcceptor.java:68)
   [info]   at org.apache.mina.core.polling.AbstractPollingIoAcceptor$Acceptor.run(AbstractPollingIoAcceptor.java:422)
   [info]   at org.apache.mina.util.NamePreservingRunnable.run(NamePreservingRunnable.java:64)
   [info]   at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
   [info]   at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
   [info]   at java.lang.Thread.run(Thread.java:748)
   ```
   After comparing the error stack trace with similar issues reported in 
other projects, such as 
   https://issues.apache.org/jira/browse/KAFKA-3453
   https://issues.apache.org/jira/browse/HBASE-14734
   
   we can be sure that these failures are caused by the same problem described in 
HADOOP-12656.
   
   In this PR, we apply the approach from HBASE-14734 as a stopgap until we finally drop 
Hadoop 2.7.x.
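
The description above does not spell out what the HBASE approach is. Assuming it amounts to retrying the MiniKdc start-up when the bind fails, a minimal Java sketch of such a retry could look like the following. `BindRetry` and `retryOnBindException` are hypothetical names for illustration, not code from Spark, HBase, or Hadoop.

```java
import java.net.BindException;
import java.util.concurrent.Callable;

// Hypothetical helper: retries a start-up action when it fails with BindException.
class BindRetry {

    static <T> T retryOnBindException(int maxAttempts, Callable<T> start) throws Exception {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be >= 1");
        }
        BindException last = null;
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                return start.call();
            } catch (BindException e) {
                // The chosen port was taken in the meantime; try again from scratch.
                last = e;
            }
        }
        // Every attempt failed to bind: surface the most recent failure.
        throw last;
    }

    public static void main(String[] args) throws Exception {
        int[] calls = {0};
        // Simulated start-up that loses the port race twice before succeeding.
        String result = retryOnBindException(5, () -> {
            if (++calls[0] < 3) {
                throw new BindException("Address already in use");
            }
            return "started";
        });
        System.out.println(result + " after " + calls[0] + " attempts");
    }
}
```

For this to help, each attempt would need to construct a fresh MiniKdc rather than restart the old one, since the port is chosen in the constructor. Other exceptions are deliberately not retried in this sketch.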
   
   ### Why are the changes needed?
   
   To fix test flakiness.
   
   
   ### Does this PR introduce _any_ user-facing change?
   No.
   
   
   ### How was this patch tested?
   
   The affected tests themselves pass on Jenkins.
   

