[
https://issues.apache.org/jira/browse/SPARK-30541?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17056820#comment-17056820
]
Gabor Somogyi commented on SPARK-30541:
---------------------------------------
The first problem is clear: Kafka does not come up consistently.
I don't think there is much to do there until the Kafka community fixes the
issue. As a temporary solution, a retry can be used in the test.
The second problem is also coming from the Kafka side:
{code:java}
[info] Cause: org.apache.kafka.common.KafkaException:
javax.security.auth.login.LoginException: Client not found in Kerberos database
(6) - Client not found in Kerberos database
{code}
When I reproduced the issue locally I found that:
* The KDC didn't throw any exception when the mentioned user was created
* The keytab file is readable and kinit works with it
Maybe it's another flaky behaviour on the Kafka side?!
All in all, since the broker is flaky and KafkaAdminClient has also shown some
flakiness, my suggestion is to use testRetry until the mentioned problems are
solved in Kafka.
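To make the retry suggestion concrete, here is a minimal, self-contained Java
sketch of the general idea: run a flaky body a bounded number of times and only
fail if every attempt fails. It is not the testRetry helper itself; the names
RetrySketch and retry are hypothetical and exist only for illustration.

```java
import java.util.concurrent.Callable;

// Hypothetical helper for illustration only; not part of Spark or Kafka.
final class RetrySketch {
    private RetrySketch() {}

    /**
     * Runs body up to (1 + maxRetries) times and returns the first successful
     * result. If every attempt throws, the last exception is rethrown.
     * Only Exception is caught, so a java.lang.Error is never retried.
     */
    static <T> T retry(int maxRetries, Callable<T> body) throws Exception {
        if (maxRetries < 0) {
            throw new IllegalArgumentException("maxRetries must be >= 0");
        }
        Exception last = null;
        for (int attempt = 0; attempt <= maxRetries; attempt++) {
            try {
                return body.call();
            } catch (Exception e) {
                // Remember the failure and try again if attempts remain.
                last = e;
            }
        }
        throw last;
    }

    public static void main(String[] args) throws Exception {
        int[] calls = {0};
        // Simulated flaky setup: fails on the first two calls, then succeeds.
        String result = retry(2, () -> {
            if (++calls[0] < 3) {
                throw new IllegalStateException("broker not up yet");
            }
            return "ok";
        });
        System.out.println(result + " after " + calls[0] + " attempts");
    }
}
```

A retry like this only hides the flakiness; the broker startup and Kerberos
login problems described above would still need to be fixed on the Kafka side.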
> Flaky test: org.apache.spark.sql.kafka010.KafkaDelegationTokenSuite
> -------------------------------------------------------------------
>
> Key: SPARK-30541
> URL: https://issues.apache.org/jira/browse/SPARK-30541
> Project: Spark
> Issue Type: Bug
> Components: SQL, Structured Streaming
> Affects Versions: 3.0.0
> Reporter: Jungtaek Lim
> Priority: Blocker
> Attachments: consoleText_NOK.txt, consoleText_OK.txt,
> unit-tests_NOK.log, unit-tests_OK.log
>
>
> The test suite has been failing intermittently as of now:
> [https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/116862/testReport/]
>
> org.apache.spark.sql.kafka010.KafkaDelegationTokenSuite.(It is not a test it
> is a sbt.testing.SuiteSelector)
>
> {noformat}
> Error Details
> org.scalatest.exceptions.TestFailedDueToTimeoutException: The code passed to
> eventually never returned normally. Attempted 3939 times over
> 1.0001223535333332 minutes. Last failure message: KeeperErrorCode =
> AuthFailed for /brokers/ids.
> Stack Trace
> sbt.ForkMain$ForkError:
> org.scalatest.exceptions.TestFailedDueToTimeoutException: The code passed to
> eventually never returned normally. Attempted 3939 times over
> 1.0001223535333332 minutes. Last failure message: KeeperErrorCode =
> AuthFailed for /brokers/ids.
> at
> org.scalatest.concurrent.Eventually.tryTryAgain$1(Eventually.scala:432)
> at org.scalatest.concurrent.Eventually.eventually(Eventually.scala:439)
> at org.scalatest.concurrent.Eventually.eventually$(Eventually.scala:391)
> at org.scalatest.concurrent.Eventually$.eventually(Eventually.scala:479)
> at org.scalatest.concurrent.Eventually.eventually(Eventually.scala:337)
> at org.scalatest.concurrent.Eventually.eventually$(Eventually.scala:336)
> at org.scalatest.concurrent.Eventually$.eventually(Eventually.scala:479)
> at
> org.apache.spark.sql.kafka010.KafkaTestUtils.setup(KafkaTestUtils.scala:292)
> at
> org.apache.spark.sql.kafka010.KafkaDelegationTokenSuite.beforeAll(KafkaDelegationTokenSuite.scala:49)
> at
> org.scalatest.BeforeAndAfterAll.liftedTree1$1(BeforeAndAfterAll.scala:212)
> at org.scalatest.BeforeAndAfterAll.run(BeforeAndAfterAll.scala:210)
> at org.scalatest.BeforeAndAfterAll.run$(BeforeAndAfterAll.scala:208)
> at org.apache.spark.SparkFunSuite.run(SparkFunSuite.scala:58)
> at
> org.scalatest.tools.Framework.org$scalatest$tools$Framework$$runSuite(Framework.scala:317)
> at
> org.scalatest.tools.Framework$ScalaTestTask.execute(Framework.scala:510)
> at sbt.ForkMain$Run$2.call(ForkMain.java:296)
> at sbt.ForkMain$Run$2.call(ForkMain.java:286)
> at java.util.concurrent.FutureTask.run(FutureTask.java:266)
> at
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
> at
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
> at java.lang.Thread.run(Thread.java:748)
> Caused by: sbt.ForkMain$ForkError:
> org.apache.zookeeper.KeeperException$AuthFailedException: KeeperErrorCode =
> AuthFailed for /brokers/ids
> at org.apache.zookeeper.KeeperException.create(KeeperException.java:130)
> at org.apache.zookeeper.KeeperException.create(KeeperException.java:54)
> at
> kafka.zookeeper.AsyncResponse.resultException(ZooKeeperClient.scala:554)
> at kafka.zk.KafkaZkClient.getChildren(KafkaZkClient.scala:719)
> at kafka.zk.KafkaZkClient.getSortedBrokerList(KafkaZkClient.scala:455)
> at
> kafka.zk.KafkaZkClient.getAllBrokersInCluster(KafkaZkClient.scala:404)
> at
> org.apache.spark.sql.kafka010.KafkaTestUtils.$anonfun$setup$3(KafkaTestUtils.scala:293)
> at
> org.scalatest.concurrent.Eventually.makeAValiantAttempt$1(Eventually.scala:395)
> at
> org.scalatest.concurrent.Eventually.tryTryAgain$1(Eventually.scala:409)
> ... 20 more
> {noformat}