Github user HeartSaVioR commented on a diff in the pull request:
https://github.com/apache/spark/pull/21617#discussion_r197986093
--- Diff:
sql/core/src/main/scala/org/apache/spark/sql/streaming/progress.scala ---
@@ -48,12 +49,13 @@ class StateOperatorProgress private[sql](
def prettyJson: String = pretty(render(jsonValue))
private[sql] def copy(newNumRowsUpdated: Long): StateOperatorProgress =
- new StateOperatorProgress(numRowsTotal, newNumRowsUpdated,
memoryUsedBytes)
+ new StateOperatorProgress(numRowsTotal, newNumRowsUpdated,
memoryUsedBytes, numLateInputRows)
private[sql] def jsonValue: JValue = {
("numRowsTotal" -> JInt(numRowsTotal)) ~
("numRowsUpdated" -> JInt(numRowsUpdated)) ~
- ("memoryUsedBytes" -> JInt(memoryUsedBytes))
+ ("memoryUsedBytes" -> JInt(memoryUsedBytes)) ~
+ ("numLateInputRows" -> JInt(numLateInputRows))
--- End diff --
@arunmahadevan Ah yes got it. If we would want to have accurate number we
need to filter out late events from the first time anyway. I guess we may need
to defer addressing this until we change the behavior.
---
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]