juliuszsompolski opened a new pull request, #42889:
URL: https://github.com/apache/spark/pull/42889

   ### What changes were proposed in this pull request?
   
   Previously, a query would only be posted as FINISHED once all results had been
pushed into the output buffers (not necessarily received by the client, but
pushed out of the server).
   
   For LocalTableScanExec, post FINISHED before sending the result batches, because
nothing is executed; only cached local results are returned. For regular
execution, post FINISHED after all task results have been returned from Spark,
not after they have been processed and sent out.
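   The ordering change can be illustrated with a simplified sketch. All names
here (`post_event`, `send_batch`, `run_query_*`) are hypothetical helpers for
illustration only, not actual Spark Connect server APIs:

   ```python
# Illustrative sketch of the event-ordering change described above.
# None of these names correspond to real Spark Connect server code.

log = []  # records the interleaving of status events and batch sends

def post_event(status):
    log.append(("event", status))

def send_batch(batch):
    log.append(("send", batch))

def run_query_before(batches):
    # Old behavior: FINISHED is only posted after every result batch
    # has been pushed out of the server.
    for b in batches:
        send_batch(b)
    post_event("FINISHED")

def run_query_after(batches, is_local_table_scan=False):
    # New behavior: for a local table scan nothing is executed, so
    # FINISHED can be posted before any batches are sent; for regular
    # execution it is posted once task results are back from Spark,
    # before they are processed and sent out.
    if not is_local_table_scan:
        batches = list(batches)  # stand-in for collecting task results
    post_event("FINISHED")
    for b in batches:
        send_batch(b)

run_query_after([1, 2], is_local_table_scan=True)
# log now starts with ("event", "FINISHED"), before any sends
   ```

   With the old ordering, the same call would append FINISHED last; the only
observable difference is when the status event appears relative to the result
stream, which is exactly what the change moves.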
   
   ### Why are the changes needed?
   
   Currently, even after a query has finished running in Spark, it remains
RUNNING until all results are sent, leaving very little difference between
FINISHED and CLOSED. This change makes the behavior more similar to, e.g., the
Thriftserver.
   
   ### Does this PR introduce _any_ user-facing change?
   
   Yes. Queries will be posted as FINISHED when they finish executing, not when 
they finish sending results.
   
   ### How was this patch tested?
   
   
   ### Was this patch authored or co-authored using generative AI tooling?
   
   No.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

