Re: [PR] Do not kill a task if offsets are inconsistent but publish from another group is pending (druid)

via GitHub Sun, 08 Mar 2026 01:08:55 -0800


kfaraz commented on PR #19091:
URL: https://github.com/apache/druid/pull/19091#issuecomment-4018690224


   #### [1]
   
   > are they broadly applicable when tasks roll over with task replicas 
enabled?
   
   Yes, @abhishekrb19 , the `"Inconsistency between stored metadata"` error can 
sometimes occur when two replicas try to publish at around the same time. The 
first replica is able to publish segments successfully, while the other one 
would fail to publish. However, this is not a retryable failure and the slow 
replica just ends up marking the publish as success.
   
   
https://github.com/apache/druid/blob/8307d8a26c6b7286a8a57b85940a7f29d3d43248/server/src/main/java/org/apache/druid/segment/realtime/appenderator/BaseAppenderatorDriver.java#L670-L673
   
   I assume you would be seeing more errors of the format below:
   ```
   "The new start metadata state[%s] is ahead of the last committed end 
state[%s]. Try resetting the supervisor."
   ```
   These are retryable errors which typically occur when publish is slow.
   Please let me know if that is not the case.
   
   #### [2]
   
   > Is it worth considering making this property configurable rather than 
hardcoding it to 13 in this patch? Alternatively, maybe it could be derived 
dynamically as a function of completionTimeout?
   
   I am not really a fan of the current retry mechanism , since it requires the 
task to blindly retry without knowing how long it should wait. Exponential 
backoff retries make more sense when the issue is transient (and likely 
self-resolving). Slow publish, on the other hand, is somewhat 
predictable/repeatable behaviour and we should probably be able to incorporate 
that in the retry timeout.
   
   I don't want to make the number of retries configurable just yet. Instead, 
tying it to `completionTimeout` probably makes more sense. On top of that, I 
think the task could submit an "async" task action which kind of works like a 
long poll. The supervisor could keep a queue of pending publish actions, and as 
soon as this one finishes, we send back a success response. The 
`completionTimeout` (or some function of it) would then simply be used as the 
request timeout for this async action. Thus, no retries needed at all.
   
   I want to merge the current patch and verify if the 
`isAnotherTaskGroupPublishing()` method is effective in identifying if there 
are prior pending publish actions. If it works as expected, I think we can 
build upon it further to get the queue semantics on the supervisor.
   
   @gianm , @jtuglu1 , what are your thoughts?


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Re: [PR] Do not kill a task if offsets are inconsistent but publish from another group is pending (druid)

Reply via email to