uncleGen commented on issue #24283: [SPARK-27355][SS] Make query execution more sensitive to epoch message late or lost
URL: https://github.com/apache/spark/pull/24283#issuecomment-480701061
 
 
   @gaborgsomogyi Thanks for your reply.
[#23156](https://github.com/apache/spark/pull/20936) introduced a maximum queue size threshold, beyond which the stream is stopped with an error. In [#23156](https://github.com/apache/spark/pull/20936), we used the same threshold for different queues, i.e. `partitionCommits`, `partitionOffsets`, and `epochsWaitingToBeCommitted`. Generally, `partitionCommits` and `partitionOffsets` grow much faster than `epochsWaitingToBeCommitted`, because they gain entries per partition per epoch. The stream may fail after 10 epochs if the partition number is 100, but we may wait for 10000 epochs before failure if the partition number is 1 (if I understand correctly). That is a harsh boundary condition. The main point of this PR is to split these into two separate thresholds, so that query execution becomes more sensitive to epoch messages being late or lost. If you feel that `10 epochs` is too sensitive for an intermittent problem, we can relax this condition to 100 or another value.
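
To make the arithmetic concrete, here is a minimal sketch of the argument, not the PR's actual code: the queue names above are real, but `maxQueueEntries`, `maxLateEpochs`, and the helper are hypothetical illustrative values. It shows why a single entry-count cap tolerates very different numbers of epochs depending on the partition count, and why a separate epoch-level cap would not.

```scala
// Illustration only; the cap values below are hypothetical, not Spark's
// actual configuration defaults.
object EpochThresholdSketch {
  val maxQueueEntries = 10000 // shared per-queue entry cap (hypothetical)
  val maxLateEpochs   = 10    // separate epoch-level cap this PR argues for

  // partitionCommits / partitionOffsets gain roughly one entry per
  // partition per epoch, so the shared cap is reached after far fewer
  // epochs when the partition count is large.
  def epochsToleratedBySharedCap(numPartitions: Int): Int =
    maxQueueEntries / numPartitions

  def main(args: Array[String]): Unit = {
    println(epochsToleratedBySharedCap(100)) // 100 epochs with 100 partitions
    println(epochsToleratedBySharedCap(1))   // 10000 epochs with 1 partition
    // An epoch-count threshold would instead fail after `maxLateEpochs`
    // late epochs regardless of how many partitions the query has.
  }
}
```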
