paul-rogers opened a new pull request, #12647: URL: https://github.com/apache/druid/pull/12647
The [Druid catalog](https://github.com/apache/druid/issues/12546) provides a collection of metadata "hints" about tables (datasources, input sources, views, etc.) within Druid. This PR provides the foundation: the DB and REST layers, but not yet the integration with the Calcite SQL layer.

The DB layer extends what is done for other Druid metadata tables. The semantic ("business logic") layer provides the usual CRUD operations on tables, as well as operations to sync metadata between the Coordinator and Broker. A synchronization layer handles the Coordinator/Broker sync: the Broker polls for the information it does not yet have, and the Coordinator pushes updates to known Brokers.

The entire design is pretty standard and follows Druid patterns. The key difference is the rather extreme lengths taken by the implementation to ensure each bit is easily testable without mocks. That means many interfaces, each of which can be implemented in multiple ways.

While the entire catalog mechanism is present in this PR, the Guice configuration is not yet enabled, so the catalog itself is not yet active. This project has created, or depends on, multiple in-flight PRs, and it is becoming a bit complex to combine them all in a private branch. This is one of several PRs that provide slices of the catalog work.

We'll want to create integration tests when we enable the feature, and that work is waiting for the [new IT PR](https://github.com/apache/druid/pull/12368) to be merged.

The next step in the catalog work is to integrate the catalog with the Druid planner. For that, we'll need the [planner test framework](https://github.com/apache/druid/pull/12545) to be merged.

This code will likely evolve as we work on the SQL layer. Some of that work has already been done in a private branch and suggests that the present code is pretty much on the right track: we'll just expand the table and column definitions as needed.
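To make the sync design concrete for reviewers, here is a minimal sketch of the pull/push pattern described above, and of how small interfaces allow tests to run against in-memory implementations rather than mocks. Every name in it (`CatalogSource`, `CatalogListener`, `InMemoryCatalogStore`, `CachedCatalog`, `TableSpec`) is hypothetical and chosen for illustration; none is taken from the PR's classes, and the version-based conflict rule is an assumption rather than the PR's actual behavior.

```java
import java.util.List;
import java.util.Map;
import java.util.Optional;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.CopyOnWriteArrayList;

// Illustrative sketch only: every type and method name here is hypothetical.
class CatalogSyncSketch {

    /** Immutable table metadata; the version increases on every update. */
    static final class TableSpec {
        final String name;
        final long version;
        final String properties;

        TableSpec(String name, long version, String properties) {
            this.name = name;
            this.version = version;
            this.properties = properties;
        }
    }

    /** Pull side: what a Broker needs from the Coordinator. */
    interface CatalogSource {
        Optional<TableSpec> fetch(String tableName);
    }

    /** Push side: what the Coordinator needs from each known Broker. */
    interface CatalogListener {
        void updated(TableSpec spec);
        void deleted(String tableName);
    }

    /** Coordinator stand-in: in-memory CRUD plus push to registered listeners. */
    static final class InMemoryCatalogStore implements CatalogSource {
        private final Map<String, TableSpec> tables = new ConcurrentHashMap<>();
        private final List<CatalogListener> listeners = new CopyOnWriteArrayList<>();

        void register(CatalogListener listener) {
            listeners.add(listener);
        }

        /** Creates or updates a table, bumps its version, and notifies listeners. */
        TableSpec put(String name, String properties) {
            TableSpec spec = tables.compute(name, (key, old) ->
                new TableSpec(key, old == null ? 1L : old.version + 1, properties));
            listeners.forEach(l -> l.updated(spec));
            return spec;
        }

        void delete(String name) {
            if (tables.remove(name) != null) {
                listeners.forEach(l -> l.deleted(name));
            }
        }

        @Override
        public Optional<TableSpec> fetch(String tableName) {
            return Optional.ofNullable(tables.get(tableName));
        }
    }

    /** Broker stand-in: pulls on a cache miss, then accepts pushed updates. */
    static final class CachedCatalog implements CatalogListener {
        private final CatalogSource source;
        private final Map<String, TableSpec> cache = new ConcurrentHashMap<>();
        int fetchCount; // test-only counter of pulls from the source

        CachedCatalog(CatalogSource source) {
            this.source = source;
        }

        private static TableSpec newer(TableSpec a, TableSpec b) {
            return b.version > a.version ? b : a;
        }

        Optional<TableSpec> resolve(String tableName) {
            TableSpec cached = cache.get(tableName);
            if (cached != null) {
                return Optional.of(cached);
            }
            fetchCount++;
            return source.fetch(tableName)
                .map(spec -> cache.merge(tableName, spec, CachedCatalog::newer));
        }

        @Override
        public void updated(TableSpec spec) {
            // Keep whichever copy has the higher version (assumed conflict rule).
            cache.merge(spec.name, spec, CachedCatalog::newer);
        }

        @Override
        public void deleted(String tableName) {
            cache.remove(tableName);
        }
    }

    public static void main(String[] args) {
        InMemoryCatalogStore coordinator = new InMemoryCatalogStore();
        coordinator.put("events", "rollup=hour");

        CachedCatalog broker = new CachedCatalog(coordinator);
        System.out.println(broker.resolve("events").get().version); // pulled on miss

        coordinator.register(broker);
        coordinator.put("events", "rollup=day");                    // pushed to broker
        System.out.println(broker.resolve("events").get().version);
        System.out.println("pulls: " + broker.fetchCount);
    }
}
```

Because the Broker-side cache depends only on the `CatalogSource` interface, a test can hand it the in-memory store directly. A production implementation would presumably put an HTTP client behind the same interface, but that part is not sketched here.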
This is a great opportunity for reviewers to provide guidance on the basic catalog mechanism before we start building SQL integration on top.

<hr>

This PR has:

- [X] been self-reviewed.
- [X] has a design document [here](https://github.com/apache/druid/issues/12546).
- [ ] added documentation for new or modified features or behaviors. (Not yet: the functionality is not yet user visible.)
- [X] added Javadocs for most classes and all non-trivial methods. Linked related entities via Javadoc links.
- [X] added comments explaining the "why" and the intent of the code wherever would not be obvious for an unfamiliar reader.
- [X] added unit tests or modified existing tests to cover new code paths, ensuring the threshold for [code coverage](https://github.com/apache/druid/blob/master/dev/code-review/code-coverage.md) is met.
- [ ] added integration tests. (Not yet, waiting for [this PR](https://github.com/apache/druid/pull/12368) to be merged.)
- [X] been tested in a test Druid cluster. (A simple one, on a Mac, using a Python client to verify the API.)
