HeartSaVioR opened a new pull request, #37213:
URL: https://github.com/apache/spark/pull/37213

   ### What changes were proposed in this pull request?
   
   This PR proposes to deprecate Trigger.Once and suggest Trigger.AvailableNow 
as a replacement.
   
   This PR also tries to replace Trigger.Once with Trigger.AvailableNow in the 
test code, except in cases where Trigger.Once is used intentionally.
   
   ### Why are the changes needed?
   
   Trigger.Once() exposes various issues, including:
   
   1) weak guarantee of the contract
   
   This is the javadoc content of `Trigger.Once`:
   
   > A trigger that processes all available data in a single batch then 
terminates the query.
   
   Spark does not honor this contract when the previous run left an 
"uncommitted" batch. It behaves exactly as the name suggests, "just run a 
single batch": if there is an "uncommitted" batch, Spark executes that batch 
and terminates without processing new data.
   
   2) scalability issue with a single batch
   
   Because all available data is processed in one batch, the batch grows with 
the backlog. This is the main rationale for introducing Trigger.AvailableNow.
   
   3) high output latency for stateful operators due to the lack of a no-data batch
   
   Since Trigger.Once executes a single batch and terminates, the processing 
triggered by watermark advancement is deferred to the next execution of the 
query, which tends to be multiple hours or even days later.
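
To make issues 1) and 2) concrete, here is a toy model in plain Python. It is 
a sketch, not Spark code: `ToyQuery`, `planned_end`, `run_once`, 
`run_available_now` and `max_per_batch` are all hypothetical names, and 
`max_per_batch` only stands in for whatever per-batch limit a source might 
apply. The model assumes a restart finds one batch that was planned but never 
committed.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple


@dataclass
class ToyQuery:
    """Hypothetical stand-in for a query's checkpointed progress (not a Spark API)."""
    available: int                     # records present in the source at restart
    committed: int = 0                 # records covered by committed batches
    planned_end: Optional[int] = None  # end offset of a planned but uncommitted batch
    batches: List[Tuple[int, int]] = field(default_factory=list)

    def _run_batch(self, end: int) -> None:
        # Record the (start, end) range processed and mark it committed.
        self.batches.append((self.committed, end))
        self.committed = end
        self.planned_end = None

    def run_once(self) -> None:
        """Run exactly one batch, then stop (models the Trigger.Once behavior)."""
        if self.planned_end is not None:
            # The uncommitted batch is re-run; newer data is left untouched.
            self._run_batch(self.planned_end)
        else:
            self._run_batch(self.available)

    def run_available_now(self, max_per_batch: int) -> None:
        """Run batches until all data available at start is processed."""
        target = self.available
        if self.planned_end is not None:
            self._run_batch(self.planned_end)
        while self.committed < target:
            self._run_batch(min(self.committed + max_per_batch, target))


once = ToyQuery(available=100, committed=40, planned_end=60)
once.run_once()
print(once.batches)  # [(40, 60)]

now = ToyQuery(available=100, committed=40, planned_end=60)
now.run_available_now(max_per_batch=25)
print(now.batches)   # [(40, 60), (60, 85), (85, 100)]
```

In this model the single-batch run stops at offset 60 even though 100 records 
are available, while the multi-batch run drains the backlog in bounded steps.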
   
   ### Does this PR introduce _any_ user-facing change?
   
   Yes, end users will start to see a deprecation message when they use 
Trigger.Once. The deprecation message guides end users to migrate to 
Trigger.AvailableNow and gives the rationale for the migration.
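
For illustration, a migration could look like the sketch below. It uses the 
PySpark keyword forms `trigger(once=True)` and `trigger(availableNow=True)` 
rather than the Scala/Java `Trigger` factory methods this PR touches. The 
helper name `start_available_now`, the parquet sink, and the path arguments 
are assumptions made for the example and are not part of this PR.

```python
def start_available_now(df, sink_path, checkpoint_path):
    """Start a streaming write that processes the available data, then stops.

    `df` is expected to be a streaming DataFrame; the sink format and the
    paths are placeholders chosen for this sketch.
    """
    return (
        df.writeStream
        .format("parquet")
        .option("path", sink_path)
        .option("checkpointLocation", checkpoint_path)
        # Before the migration this line would read: .trigger(once=True)
        .trigger(availableNow=True)
        .start()
    )
```

Only the trigger line changes; the checkpoint location stays the same.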
   
   ### How was this patch tested?
   
   Existing unit tests.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

