kroeders opened a new issue #10237: URL: https://github.com/apache/druid/issues/10237
### Motivation

We have a deployment with a number of different "types" of data source, each with its own approach to retention rules. For example, one group of servers could be configured to load a recent queryable dataset, whereas another group of servers is configured to keep an archive based on policies for when data can be deleted. Adding a new data source of a known type involves manually copying the retention rules from an existing data source, which is time-consuming and error-prone. Similarly, changing the rules across an entire type of data source (for example, advancing a time period at the start of a new quarter) means editing each data source individually. How can retention rule configurations be reused to streamline this maintenance?

### Description

Import Rules Rule Type - The proposal is to add an import rule type that includes another ruleset dynamically at runtime. Effectively, this generalizes the default rules approach, but allows the user to import other rules anywhere in the rule chain. The imported rules could come from another data source or from an independent ruleset similar to the existing default rules. A UI could be provided to edit synthetic rulesets without changing any APIs. Similarly, the UI could restrict users to importing rules only from synthetic rulesets if desired.

Pull Request [#10129](https://github.com/apache/druid/pull/10129) has an implementation with some UI work done.

### Alternatives

Datasource Specific Defaults - Allow for multiple default rule lists, instead of just one. There would be a database change to store the alternative default rules for a datasource. Even if a convention is used (e.g. <datasource>__default), some method for referring to other rulesets is needed.

Normalized Rule Groups - Each datasource stores a copy of its rules; these could be broken out into a normalized rule table and linked to the data source. This would be a useful refactoring regardless and would allow rules to be reused.
To provide both rule reuse and per-datasource customization, some additional reference would still be needed.

Clone Rules - Rules from other data sources could be copied using existing APIs, but later changes to the original rules would not be reflected in the copies. This is only a partial solution, as the user would still need to review every data source whenever the rules change.

Evaluation Import Rules - As an alternative to modifying the rules list when imports are expanded, the import could be performed at rule evaluation time. This requires evaluating imports and breaking cycles for each segment, and it introduces more complex test cases around concurrent changes to retention rules.
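
To make the import idea concrete, here is a minimal Python sketch of expanding import rules into a flat rule list while breaking cycles. It is not the implementation from the pull request. The `import` rule type, its `ruleset` field, the `expand_rules` helper, and the ruleset names are all hypothetical and chosen only for illustration. Real Druid rules also carry fields such as tiered replicants, which are omitted here.

```python
from typing import Dict, List, Optional, Set

# Hypothetical rule shape: a plain dict with a "type" key. The "import" type
# and its "ruleset" field are assumptions made for this sketch.
Rule = Dict[str, str]


def expand_rules(
    name: str,
    rulesets: Dict[str, List[Rule]],
    _active: Optional[Set[str]] = None,
) -> List[Rule]:
    """Return the rules of `name`, with each import rule replaced in place
    by the expanded rules of the ruleset it refers to."""
    active = set() if _active is None else _active
    if name in active:
        # `name` is already being expanded further up the chain, so this
        # import would form a cycle. Break it by contributing no rules.
        return []
    active.add(name)
    expanded: List[Rule] = []
    for rule in rulesets.get(name, []):
        if rule.get("type") == "import":
            expanded.extend(expand_rules(rule["ruleset"], rulesets, active))
        else:
            expanded.append(rule)
    # Leaving this ruleset: it may legitimately be imported again elsewhere.
    active.discard(name)
    return expanded


rulesets = {
    # An independent, reusable ruleset (hypothetical name).
    "archive_rules": [
        {"type": "loadByPeriod", "period": "P1Y"},
        {"type": "dropForever"},
    ],
    # A data source with one rule of its own, followed by the shared rules.
    "events": [
        {"type": "loadByInterval", "interval": "2020-01-01/2020-02-01"},
        {"type": "import", "ruleset": "archive_rules"},
    ],
    # Two rulesets that import each other, to exercise cycle breaking.
    "a": [{"type": "import", "ruleset": "b"}],
    "b": [{"type": "import", "ruleset": "a"}, {"type": "dropForever"}],
}

events_rules = expand_rules("events", rulesets)
cyclic_rules = expand_rules("a", rulesets)
```

In this sketch, editing `archive_rules` changes the expanded list of every data source that imports it, which is the reuse the proposal is after. Running the same expansion once per segment would correspond to the evaluation-time alternative, with the cycle check repeated on each evaluation.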
