HeartSaVioR commented on a change in pull request #25987: [SPARK-29314][SS] 
Don't overwrite the metric "updated" of state operator to 0 if empty batch is 
run
URL: https://github.com/apache/spark/pull/25987#discussion_r405222067
 
 

 ##########
 File path: 
sql/core/src/test/scala/org/apache/spark/sql/streaming/FlatMapGroupsWithStateSuite.scala
 ##########
 @@ -795,7 +795,7 @@ class FlatMapGroupsWithStateSuite extends 
StateStoreMetricsTest {
         }
       },
       CheckNewAnswer(("c", "-1")),
-      assertNumStateRows(total = 0, updated = 0)
+      assertNumStateRows(total = 0, updated = 1)
 
 Review comment:
   Actually, the change where we encountered the issue is small, as the bug 
affects the metric only when a no-data batch updates a state row. This does not 
occur in non-arbitrary stateful operations, as the only possible state change 
there is eviction, and state row eviction is not counted as an update.
   
   This can occur in arbitrary stateful operations, as a timer can fire in a 
no-data batch and the query can then update or delete a state row, which 
increments the update count. This test verifies one of these cases (deleting 
state), hence I didn't add a new test. If we would like another test in which 
the state function updates the state row on timeout, I can add one, but most of 
it would be redundant.
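
   To make the accounting above concrete, here is a minimal toy sketch in plain 
Scala. It is not Spark's implementation: `ToyStateStore`, `removeByStateFunc`, 
`evict` and the other names are hypothetical, and the only rules it encodes are 
the ones stated above (writes and deletes made by the state function count as 
updates, engine-driven eviction does not).

```scala
import scala.collection.mutable

// Toy model only: hypothetical names, not Spark's StateStore or its metrics code.
final class ToyStateStore {
  private val rows = mutable.Map.empty[String, Long]
  private var updatedInBatch = 0L

  // The "updated" metric is tracked per batch, so reset it when a batch starts.
  def startBatch(): Unit = { updatedInBatch = 0L }

  // A write made by the user's state function counts as an update.
  def put(key: String, value: Long): Unit = {
    rows(key) = value
    updatedInBatch += 1
  }

  // A delete made by the user's state function (for example on timeout)
  // also counts as an update.
  def removeByStateFunc(key: String): Unit = {
    if (rows.remove(key).isDefined) updatedInBatch += 1
  }

  // Engine-driven eviction removes the row but is not counted as an update.
  def evict(key: String): Unit = {
    rows.remove(key)
    ()
  }

  def total: Long = rows.size.toLong
  def updated: Long = updatedInBatch
}

object ToyMetricsDemo {
  def main(args: Array[String]): Unit = {
    val store = new ToyStateStore

    // Data batch: the state function stores a row for key "c".
    store.startBatch()
    store.put("c", 1L)

    // No-data batch: the timer fires and the state function deletes the row.
    store.startBatch()
    store.removeByStateFunc("c")

    println(s"total=${store.total}, updated=${store.updated}") // total=0, updated=1
  }
}
```

   Under these assumptions the no-data batch ends with no rows left but one 
update recorded, which is the shape of the expectation in the diff above 
(`total = 0, updated = 1`). Calling `evict` instead of `removeByStateFunc` in 
the second batch would leave `updated` at 0.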

