GrigorievNick commented on pull request #27380:
URL: https://github.com/apache/spark/pull/27380#issuecomment-805592781


   Hi,
   I know these changes are already in Spark 3, but I have a question: how can I configure backpressure for my job when I want to use `Trigger.Once`?
   In Spark 2.4 I have a use case where I backfill some data and then start the stream. I use trigger once, but the backfill can be very large, and it sometimes puts too much load on my disks (because of shuffles) and on driver memory (because the FileIndex is cached there).
   So I use `maxOffsetsPerTrigger` and `maxFilesPerTrigger` to control how much data my Spark job processes per run; that is how I configure backpressure.
   
   Now this change removes that ability, so I assume you can suggest a better way to do this?
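
   For context, here is a minimal sketch of the Spark 2.4 pattern described above (paths, the rate-limit value, and the app name are hypothetical; only `maxFilesPerTrigger` and `Trigger.Once` come from the comment):

   ```scala
   import org.apache.spark.sql.SparkSession
   import org.apache.spark.sql.streaming.Trigger

   val spark = SparkSession.builder()
     .appName("backfill-then-stream")   // hypothetical name
     .getOrCreate()

   // Rate-limit the file source: each run reads at most 1000 new files,
   // which bounds shuffle size and the FileIndex cached in driver memory.
   val input = spark.readStream
     .format("parquet")
     .option("maxFilesPerTrigger", "1000")              // hypothetical limit
     .schema(spark.read.parquet("/data/events").schema) // hypothetical path
     .load("/data/events")

   // In Spark 2.4, Trigger.Once honors the rate limit above: each run
   // processes one bounded batch and stops, so re-running the job
   // repeatedly works through the backfill in controlled increments.
   val query = input.writeStream
     .format("parquet")
     .option("checkpointLocation", "/chk/backfill")     // hypothetical path
     .option("path", "/data/out")                       // hypothetical path
     .trigger(Trigger.Once())
     .start()

   query.awaitTermination()
   ```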


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
[email protected]


