capistrant opened a new issue #10606: URL: https://github.com/apache/druid/issues/10606
### Motivation

Loading primary replicants for Druid segments is one of the most important things the Coordinator does. Without a primary replicant available on the cluster, a segment is not available for querying. The Coordinator performs primary replicant loading within a set of Coordinator duties that relate to Historical Management. Because of this grouping, the Coordinator can spend a lot of time on other work, such as loading non-primary replicants and balancing segments, before it gets back to loading primary replicants. As a side effect, data stays unavailable for longer than it otherwise would, which can be a negative end-user experience. Breaking primary replicant loading out into its own scheduled runnable group can guarantee that primary replicants are loaded more regularly.

### Proposed changes

I am proposing an optional new `DutiesRunnable` in the `DruidCoordinator`. Operators can choose whether or not to break primary replicant loading out into its own `DutiesRunnable`. If they do not enable dedicated primary replicant loading, their Coordinator will function just as it always has. If they do enable it, their Coordinator will add a scheduled `DutiesRunnable` dedicated to executing the matching `LoadRule` for each segment and, when run, performing only the primary replicant load for that `LoadRule`. The `HistoricalManagement` `DutiesRunnable` will continue to perform all other `HistoricalManagement` duties, including non-primary replicant loading and replicant dropping, while executing a matched `LoadRule` for a segment.

My POC implementation for the proposal exposes two new Coordinator runtime configurations for operators: `druid.coordinator.loadPrimaryReplicantSeparately` and `druid.coordinator.period.primaryReplicantLoaderPeriod`.
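
As a rough illustration of how the two properties named above might be read, here is a minimal Java sketch. The class `PrimaryLoaderConfigSketch`, the ISO-8601 value format for the period, and the `PT60S` fallback are all assumptions made for this example; they are not taken from the POC.

```java
import java.time.Duration;
import java.util.Properties;

/** Hypothetical holder for the two proposed settings; not part of Druid. */
final class PrimaryLoaderConfigSketch {
    static final String ENABLED_KEY = "druid.coordinator.loadPrimaryReplicantSeparately";
    static final String PERIOD_KEY = "druid.coordinator.period.primaryReplicantLoaderPeriod";

    final boolean enabled;
    final Duration period;

    PrimaryLoaderConfigSketch(Properties props) {
        // Assumption: a missing flag means the feature is off, so the
        // Coordinator behaves as it does today.
        this.enabled = Boolean.parseBoolean(props.getProperty(ENABLED_KEY, "false"));
        // Assumption: the period is an ISO-8601 duration; the PT60S
        // fallback is a placeholder, not a value from the proposal.
        this.period = Duration.parse(props.getProperty(PERIOD_KEY, "PT60S"));
    }
}
```

With neither property set, `enabled` is `false`, which matches the intent that clusters not opting in see no change.
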
If operators enable `druid.coordinator.loadPrimaryReplicantSeparately`, a scheduled executor with a configurable backoff period is configured for loading primary replicants. The new `DutiesRunnable` would consist of two duties, `UpdateCoordinatorStateAndPrepareCluster` and `RunRules`.

* There is an open TODO on analyzing the negative effects of having two `DutiesRunnable` instances that both run `UpdateCoordinatorStateAndPrepareCluster`. It is possible that only one of the two should execute the full duty and the other should run a scaled-down version.

`RunRules` and `LoadRule` will each need a mode associated with them. `RunRules` will now execute in one of two modes: one executes only the `LoadRule` rules that match, and the other runs every matched `Rule`. `LoadRule` is similar. For the primary replicant load, it should run in a mode where it only loads a primary replicant. There also needs to be a mode for skipping the primary replicant load. Lastly, there needs to be a mode for running all of `LoadRule` without regard to replicant type.

### Rationale

I think the biggest benefit here is more control for the operator to ensure that primary replicant loading runs as often as needed. In large clusters that do a lot of balancing and non-primary replicant loading because servers come in and out of the cluster, primary replicant loading can get blocked often enough that users ask why their new segments aren't becoming available in a timely manner after batch indexing finishes.

As for alternative approaches, I have not thought of any similar ways to achieve this elevated priority for loading primary replicants at this time. I am definitely open to suggestions though.

### Operational impact

This section should describe how the proposed changes will impact the operation of existing clusters. It should answer questions such as:

- Is anything going to be deprecated or removed by this change? How will we phase out old behavior?
  * N/A
- Is there a migration path that cluster operators need to be aware of?
  * Enabling this requires Coordinator config changes and a restart.
- Will there be any effect on the ability to do a rolling upgrade, or to do a rolling _downgrade_ if an operator wants to switch back to a previous version?
  * A rolling upgrade to the first version that includes this would not require any changes, because leaving the configs out keeps the Coordinator as is. An operator can enable the feature after the upgrade if they so choose.
  * Downgrading should not have any impact. The configs, even if specified by the operator, would be ignored, and the Coordinator would go back to how it operated before there was a dedicated primary replicant loader.

### Test plan (optional)

TBD

### Future work (optional)

TBD
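
The `RunRules` and `LoadRule` modes described under Proposed changes could be modeled roughly as in the Java sketch below. Every name here (`LoadModeSketch`, `LoadRuleMode`, `RunRulesMode`, `modeFor`, `replicantsToLoad`) is hypothetical and does not come from Druid or the POC. The sketch also assumes that the skip-primary mode queues nothing for a segment that has no replicant loaded yet, leaving that segment to the dedicated loader; the proposal does not specify this.

```java
/** Hypothetical sketch of the proposed modes; none of these names exist in Druid. */
final class LoadModeSketch {

    /** Which rules a RunRules pass executes. */
    enum RunRulesMode {
        LOAD_RULES_ONLY, // dedicated primary loader: only matched load rules
        ALL_RULES        // historical management: every matched rule
    }

    /** Which part of a matched load rule a pass acts on. */
    enum LoadRuleMode {
        PRIMARY_ONLY, // load only the primary replicant
        SKIP_PRIMARY, // load and drop non-primary replicants only
        ALL           // current behavior: no distinction by replicant type
    }

    /** Picks the load-rule mode for a runnable, given the opt-in flag. */
    static LoadRuleMode modeFor(boolean separateLoaderEnabled, boolean dedicatedRunnable) {
        if (!separateLoaderEnabled) {
            return LoadRuleMode.ALL; // feature off: behave as today
        }
        return dedicatedRunnable ? LoadRuleMode.PRIMARY_ONLY : LoadRuleMode.SKIP_PRIMARY;
    }

    /** Number of replicant loads one pass would queue for a single segment. */
    static int replicantsToLoad(LoadRuleMode mode, int target, int loaded) {
        int missing = Math.max(0, target - loaded);
        boolean primaryMissing = target > 0 && loaded == 0;
        switch (mode) {
            case PRIMARY_ONLY:
                return primaryMissing ? 1 : 0;
            case SKIP_PRIMARY:
                // Assumption: wait for the dedicated loader to place the primary.
                return primaryMissing ? 0 : missing;
            default:
                return missing;
        }
    }
}
```

For a segment with a target of two replicants and none loaded, a `PRIMARY_ONLY` pass would queue one load and a `SKIP_PRIMARY` pass would queue none. Once the primary is loaded, a later `SKIP_PRIMARY` pass would queue the remaining one.
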
