HeartSaVioR commented on a change in pull request #25987: [SPARK-29314][SS]
Don't overwrite the metric "updated" of state operator to 0 if empty batch is
run
URL: https://github.com/apache/spark/pull/25987#discussion_r405222067
##########
File path:
sql/core/src/test/scala/org/apache/spark/sql/streaming/FlatMapGroupsWithStateSuite.scala
##########
@@ -795,7 +795,7 @@ class FlatMapGroupsWithStateSuite extends
StateStoreMetricsTest {
}
},
CheckNewAnswer(("c", "-1")),
- assertNumStateRows(total = 0, updated = 0)
+ assertNumStateRows(total = 0, updated = 1)
Review comment:
Actually, the change that surfaced the issue is small, because the bug affects
the metric only when a no-data batch updates a state row. This does not occur
in non-arbitrary stateful operations: the only possible state change in a
no-data batch there is eviction, and state row eviction is not counted as an
update.
It can occur in arbitrary stateful operations, because a timer can fire in a
no-data batch and the state function can then update or delete a state row,
which increments the update count. This test already verifies one of these
cases (deleting state), hence I didn't add a new test. If we would like another
test in which the state function updates the state row on timeout, I can add
one, but most of it would be redundant.
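To make the counting rule concrete, here is a minimal toy model of it. It is
not Spark code: ToyStateOperator, Metrics, removeByUser and evict are
hypothetical names invented for this sketch, and it assumes only what is
described above (writes and deletes made by the state function count as
updates, engine-side eviction does not).

```scala
import scala.collection.mutable

// Hypothetical toy model, not Spark code: it only mirrors the counting rules
// described in the review comment.
final case class Metrics(total: Long, updated: Long)

final class ToyStateOperator {
  private val rows = mutable.Map.empty[String, Int]
  private var updatedInBatch = 0L

  /** Starts a new micro-batch; the per-batch "updated" counter resets. */
  def startBatch(): Unit = updatedInBatch = 0L

  /** Write made by the user state function: counted as an update. */
  def update(key: String, value: Int): Unit = {
    rows(key) = value
    updatedInBatch += 1
  }

  /** Delete made by the user state function (e.g. on timeout): also counted. */
  def removeByUser(key: String): Unit =
    if (rows.remove(key).isDefined) updatedInBatch += 1

  /** Eviction by the engine (e.g. by watermark): not counted as an update. */
  def evict(key: String): Unit = {
    rows.remove(key)
    ()
  }

  /** Snapshot of the metrics as they stand in the current batch. */
  def metrics: Metrics = Metrics(rows.size.toLong, updatedInBatch)
}

object StateMetricsSketch {
  def main(args: Array[String]): Unit = {
    val op = new ToyStateOperator
    op.startBatch(); op.update("c", 1)    // data batch creates the state row
    op.startBatch(); op.removeByUser("c") // no-data batch: timer fires, row deleted
    println(op.metrics)                   // total and updated after the no-data batch
  }
}
```

In this model the no-data batch ends with total = 0 and updated = 1, which is
the shape of the assertion changed in the diff above; replacing removeByUser
with evict would leave updated at 0.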