HeartSaVioR commented on a change in pull request #28391:
URL: https://github.com/apache/spark/pull/28391#discussion_r432993788



##########
File path: sql/core/src/test/scala/org/apache/spark/sql/streaming/StreamingDeduplicationSuite.scala
##########
@@ -270,7 +271,12 @@ class StreamingDeduplicationSuite extends StateStoreMetricsTest {
         .select($"eventTime".cast("long").as[Long])
 
       testStream(result, Append)(
-        StartStream(additionalConfs = Map(flagKey -> flag.toString)),
+        StartStream(additionalConfs = Map(
+          noDataBatchEnableKey -> flag.toString,
+          // set `STREAMING_NO_DATA_PROGRESS_EVENT_INTERVAL` a small value to
+          // report an `empty` progress when no data come.
+          noDataProgressIntervalKey -> "1")

Review comment:
       Let's either use the manual clock, or fix the UT so that it doesn't depend on STREAMING_NO_DATA_PROGRESS_EVENT_INTERVAL.
   
   For the latter, we can set noDataProgressIntervalKey to 1000000 and remove the last assertion, which isn't actually testing the behavior of deduplication. Even better, we can keep the last assertion only when `flag == true`, where it verifies that state cleanup in the empty input batch doesn't produce new outputs.



