paul-rogers opened a new pull request, #12647:
URL: https://github.com/apache/druid/pull/12647

   The [Druid catalog](https://github.com/apache/druid/issues/12546) provides a 
collection of metadata "hints" about tables (datasources, input sources, views, 
etc.) within Druid. This PR provides the foundation: the DB and REST layer, but 
not yet the integration with the Calcite SQL layer.
   
   The DB layer extends what is done for other Druid metadata tables.  The 
semantic ("business logic") layer provides the usual CRUD operations on tables, 
as well as operations to sync metadata between the Coordinator and Broker. A 
synchronization layer handles the Coordinator/Broker sync: the Broker polls for 
the information it does not yet have; the Coordinator pushes updates to known 
Brokers. 
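
   The poll/push split described above can be sketched roughly as follows. This is a minimal illustration, not the code in this PR: every name here (`CatalogSource`, `CatalogListener`, `BrokerCatalogCache`, `CoordinatorCatalog`) is hypothetical, table specs are reduced to plain strings, and the network hop between Coordinator and Broker is replaced by a direct method call.

```java
import java.util.List;
import java.util.Map;
import java.util.Optional;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.CopyOnWriteArrayList;

// Hypothetical names throughout; none of these are taken from the PR.

/** Source of truth for table metadata (the Coordinator side in this sketch). */
interface CatalogSource {
  Optional<String> fetch(String tableId);
}

/** Receives pushed updates (the Broker side in this sketch). */
interface CatalogListener {
  void updated(String tableId, String spec);
}

/** Broker-side cache: polls the source on a miss, accepts pushed updates. */
class BrokerCatalogCache implements CatalogListener {
  private final CatalogSource source;
  private final Map<String, String> cache = new ConcurrentHashMap<>();

  BrokerCatalogCache(CatalogSource source) {
    this.source = source;
  }

  /** Poll path: ask the source only for entries not yet cached. */
  Optional<String> resolve(String tableId) {
    String cached = cache.get(tableId);
    if (cached != null) {
      return Optional.of(cached);
    }
    Optional<String> fetched = source.fetch(tableId);
    fetched.ifPresent(spec -> cache.put(tableId, spec));
    return fetched;
  }

  /** Push path: the source tells us about a change. */
  @Override
  public void updated(String tableId, String spec) {
    cache.put(tableId, spec);
  }
}

/** Coordinator-side store: holds specs and pushes changes to known listeners. */
class CoordinatorCatalog implements CatalogSource {
  private final Map<String, String> tables = new ConcurrentHashMap<>();
  private final List<CatalogListener> listeners = new CopyOnWriteArrayList<>();

  void register(CatalogListener listener) {
    listeners.add(listener);
  }

  void put(String tableId, String spec) {
    tables.put(tableId, spec);
    for (CatalogListener listener : listeners) {
      listener.updated(tableId, spec);
    }
  }

  @Override
  public Optional<String> fetch(String tableId) {
    return Optional.ofNullable(tables.get(tableId));
  }
}
```

   In this sketch a Broker that has never seen a table fetches it on first use, and later edits on the Coordinator replace the cached copy without the Broker asking again.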
   
   The entire design is pretty standard and follows Druid patterns. The key 
difference is the rather extreme lengths taken by the implementation to ensure 
each bit is easily testable without mocks. That means many interfaces, each of 
which can be implemented in multiple ways.
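
   To illustrate the general idea of testing through interfaces rather than mocks, here is a small sketch under stated assumptions. The names (`TableSpecStore`, `InMemoryTableSpecStore`, `TableCatalogService`) are invented for this example and are not the classes in the PR; the point is only that the semantic layer depends on an interface, so a test can supply a simple in-memory implementation in place of the metadata DB.

```java
import java.util.Map;
import java.util.Optional;
import java.util.concurrent.ConcurrentHashMap;

// All names are hypothetical, for illustration only.

/** Storage abstraction; production code would back this with the metadata DB. */
interface TableSpecStore {
  boolean insertIfAbsent(String tableId, String spec);
  Optional<String> read(String tableId);
  boolean delete(String tableId);
}

/** Test-friendly implementation: a real object with real behavior, not a mock. */
class InMemoryTableSpecStore implements TableSpecStore {
  private final Map<String, String> rows = new ConcurrentHashMap<>();

  @Override
  public boolean insertIfAbsent(String tableId, String spec) {
    return rows.putIfAbsent(tableId, spec) == null;
  }

  @Override
  public Optional<String> read(String tableId) {
    return Optional.ofNullable(rows.get(tableId));
  }

  @Override
  public boolean delete(String tableId) {
    return rows.remove(tableId) != null;
  }
}

/** Semantic layer: CRUD rules expressed only in terms of the interface. */
class TableCatalogService {
  private final TableSpecStore store;

  TableCatalogService(TableSpecStore store) {
    this.store = store;
  }

  void create(String tableId, String spec) {
    if (!store.insertIfAbsent(tableId, spec)) {
      throw new IllegalStateException("Table already exists: " + tableId);
    }
  }

  Optional<String> get(String tableId) {
    return store.read(tableId);
  }

  void drop(String tableId) {
    if (!store.delete(tableId)) {
      throw new IllegalStateException("No such table: " + tableId);
    }
  }
}
```

   A unit test can then construct `new TableCatalogService(new InMemoryTableSpecStore())` and exercise the create, read, and drop rules directly, with no database and no mocking library.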
   
   While the entire catalog mechanism is present in this PR, the Guice 
configuration is not yet enabled, so the catalog itself remains inactive. 
This project has created, or depends on, multiple in-flight PRs and it is 
becoming a bit complex to combine them all in a private branch. This is one of 
several PRs that provide slices of catalog work.
   
   We'll want to create integration tests when we enable the feature, and that 
work is waiting for the [new IT PR](https://github.com/apache/druid/pull/12368) 
to be merged.
   
   The next step in the catalog work is to integrate the catalog with the Druid 
planner. For that, we'll need the [planner test 
framework](https://github.com/apache/druid/pull/12545) to be merged.
   
   This code will likely evolve as we work on the SQL layer. Some of that work 
has already been done in a private branch and suggests that the present code is 
pretty much on the right track: we'll just expand the table and column 
definitions as needed.
   
   This is a great opportunity for reviewers to provide guidance on the basic 
catalog mechanism before we start building SQL integration on top.
   
   <hr>
   
   This PR has:
   - [X] been self-reviewed.
   - [X] a design document 
[here](https://github.com/apache/druid/issues/12546).
   - [ ] added documentation for new or modified features or behaviors. (Not 
yet: the functionality is not yet user visible.)
   - [X] added Javadocs for most classes and all non-trivial methods. Linked 
related entities via Javadoc links.
   - [X] added comments explaining the "why" and the intent of the code 
wherever it would not be obvious to an unfamiliar reader.
   - [X] added unit tests or modified existing tests to cover new code paths, 
ensuring the threshold for [code 
coverage](https://github.com/apache/druid/blob/master/dev/code-review/code-coverage.md)
 is met.
   - [ ] added integration tests. (Not yet, waiting for [this 
PR](https://github.com/apache/druid/pull/12368) to be merged.)
   - [X] been tested in a test Druid cluster. (A simple one, on a Mac, using a 
Python client to verify the API.)
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

