FrankChen021 commented on PR #18591: URL: https://github.com/apache/druid/pull/18591#issuecomment-3374918699
> > I mean risky because overlord needs to restore all tasks; previously we had some problems (maybe a bug) where, after switching leaders, the overlord failed to elect a new leader.
>
> Yes, there might be some bugs around that. Also, the K8s task runner makes certain list-pod calls, which are pretty heavy and need to be addressed. I think @capistrant is doing some work to improve that code flow.
>
> > We try our best not to restart coordinator/overlord in production.
>
> Oh, how frequently do you upgrade your cluster? Is changing the task capacity going to be much more frequent than that?

> We don't upgrade clusters very frequently, maybe once a year or even less often. We do adjust the capacity (upsize or downsize) regularly based on load and requirements.
>
> I agree that the K8s task runner is buggy and that we should improve it. But making the task capacity dynamic doesn't seem like the best solution. It would open a whole other can of worms and make this piece only more complicated.
>
> Instead, we should try to fix the actual problems in the task runner that make Overlord leader switches erroneous.
>
> What are your thoughts, @FrankChen021?

The main idea of dynamic configuration is not to circumvent problems at the restart phase; it is about reducing operational complexity and saving time. Even if restarting the overlord is smooth, changing such a configuration should not require a restart from the users' and operators' point of view. With static configuration, operators have to change configuration files, sync the files to Kubernetes, and restart components, which is a heavy workflow.

--
This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment.
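For concreteness, the static-configuration workflow described above looks roughly like this on a Kubernetes deployment. This is a sketch under assumptions, not taken from this PR: the Overlord is assumed to run as a StatefulSet with its runtime properties mounted from a ConfigMap, and all resource names and the property name are illustrative.

```shell
# 1. Edit the capacity setting (property name illustrative) in the mounted config
kubectl edit configmap druid-overlord-config -n druid

# 2. Roll the pods so they re-read the file; this is the restart that
#    triggers the leader re-election and task-restore path discussed above
kubectl rollout restart statefulset/druid-overlord -n druid

# 3. Wait for the rollout to finish before relying on the new capacity
kubectl rollout status statefulset/druid-overlord -n druid
```

A dynamic-configuration API would collapse these three steps, and the leader-switch risk they carry, into a single runtime update with no pod restart.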
