tdas commented on a change in pull request #28040: [SPARK-31278][SS] Fix StreamingQuery output rows metric
URL: https://github.com/apache/spark/pull/28040#discussion_r398888067
 
 

 ##########
 File path: sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/ProgressReporter.scala
 ##########
 @@ -170,9 +170,8 @@ trait ProgressReporter extends Logging {
       )
     }
 
-    val sinkProgress = SinkProgress(
-      sink.toString,
-      sinkCommitProgress.map(_.numOutputRows))
+    val sinkOutput = if (hasExecuted) sinkCommitProgress.map(_.numOutputRows) else Some(0L)
 
 Review comment:
   It's not clear to me why this needs to be done. `MicroBatchExecution.runBatch`
sets the `sinkCommitProgress` variable after every batch, even if the batch is
empty. So I would expect this progress to have updated metrics, which would
naturally be 0 for an empty batch that did not produce anything.
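
   To make the question concrete, here is a minimal, self-contained sketch of
the two expressions from the diff. `CommitProgress` and `SinkOutputSketch` are
hypothetical stand-ins, not the real Spark classes, and the "stale value" case
is only an assumption about when the two expressions could differ; it is not
confirmed by the code under review.

```scala
// Hypothetical stand-in for the commit progress object; not the Spark class.
final case class CommitProgress(numOutputRows: Long)

object SinkOutputSketch {
  // Expression removed by the diff: report whatever the variable holds.
  def original(sinkCommitProgress: Option[CommitProgress]): Option[Long] =
    sinkCommitProgress.map(_.numOutputRows)

  // Expression added by the diff: report 0 unless a batch actually executed.
  def proposed(
      hasExecuted: Boolean,
      sinkCommitProgress: Option[CommitProgress]): Option[Long] =
    if (hasExecuted) sinkCommitProgress.map(_.numOutputRows) else Some(0L)

  def main(args: Array[String]): Unit = {
    // Case 1: an empty batch ran and the variable was refreshed to 0 rows.
    val refreshed = Some(CommitProgress(0L))
    println(original(refreshed))                    // Some(0)
    println(proposed(hasExecuted = true, refreshed)) // Some(0)

    // Case 2 (assumption): no batch ran and the variable still holds the
    // previous batch's value. Only here do the two expressions disagree.
    val stale = Some(CommitProgress(5L))
    println(original(stale))                     // Some(5)
    println(proposed(hasExecuted = false, stale)) // Some(0)
  }
}
```

   If the variable is always refreshed after each batch, as described above,
both expressions agree and the extra `hasExecuted` check would be redundant.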
