gianm commented on issue #13936:
URL: https://github.com/apache/druid/issues/13936#issuecomment-1777699171

   From reading the code, as far as I can tell, the idle state is not persisted, 
so after a supervisor restart it is not restored until `inactiveAfterMillis` has 
elapsed.
   
   I can think of a couple of options for addressing this. One is that we could 
treat `idle` like `suspended`, by putting it in the supervisor spec and 
persisting it. However, this seems weird to me, since it would involve the 
supervisor needing to edit its own spec once it enters idle state, and 
typically supervisors don't edit their own specs in the metadata store. So this 
wouldn't be my preferred choice.
   
   Another, IMO better, approach is to adjust the logic for supervisor startup 
to get into the idle state immediately if that makes sense. The logic could be 
something like: if committed offsets match the current stream offsets, and no 
tasks are running, go into idle immediately. I think we check the current 
stream offsets once a minute, so that means the main risk is that we'll launch 
tasks up to a minute late in the case where the offsets did match, but then a 
new message comes in right after the supervisor starts up. But that should be 
rare, and if it does happen, it means messages are coming in quite 
infrequently. So it's probably fine to err on the side of staying idle for up 
to a minute.
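   
   A rough sketch of that startup check, to make the condition concrete. All 
names here (`StartupIdleCheck`, `shouldStartIdle`, the map-of-partition-to-offset 
representation) are hypothetical and not taken from the Druid codebase. Treating 
unknown stream offsets as "not idle" is also an assumption on my part:

```java
import java.util.Map;

// Hypothetical sketch only; none of these names exist in Druid.
final class StartupIdleCheck {
    private StartupIdleCheck() {}

    /**
     * Decides whether a supervisor could enter the idle state right at startup.
     *
     * @param committedOffsets    partition id -> last committed offset
     * @param latestStreamOffsets partition id -> current offset reported by the stream
     * @param runningTaskCount    number of ingestion tasks currently running
     */
    static boolean shouldStartIdle(
            Map<String, Long> committedOffsets,
            Map<String, Long> latestStreamOffsets,
            int runningTaskCount) {
        // Running tasks mean there is (or may be) work in flight.
        if (runningTaskCount > 0) {
            return false;
        }
        // Assumption: if stream offsets have not been fetched yet, do not guess.
        if (latestStreamOffsets.isEmpty()) {
            return false;
        }
        // Idle only when every partition is fully caught up.
        return committedOffsets.equals(latestStreamOffsets);
    }
}
```

With this shape, a caught-up supervisor with no tasks would start idle, while a 
lagging partition or a running task would send it down the normal startup path.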
   
   Thoughts?


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

