Github user jose-torres commented on a diff in the pull request:
https://github.com/apache/spark/pull/20398#discussion_r163950993
--- Diff:
external/kafka-0-10-sql/src/test/scala/org/apache/spark/sql/kafka010/KafkaSourceSuite.scala
---
@@ -1116,6 +1116,7 @@ class KafkaSourceStressForDontFailOnDataLossSuite
extends StreamTest with Shared
}
query.stop()
+ query.awaitTermination()
--- End diff --
This made me realize I misdiagnosed the problem. awaitTermination() can't
actually help here, because the query itself is generating an exception.
The real problem is that we're using a "local[2,3]" test cluster, which
only has 2 cores. Continuous processing requires 1 core per topic partition, so
when the test randomly gets above 2 topics, it will fail to make progress. If
it stays above 2 topics when the test ends, we'll get the observed flakiness.
I'm going to bump the priority of reporting some kind of error in this scenario.
We can keep the awaitTermination and add processAllAvailable, since they
won't hurt.
---
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]