[ https://issues.apache.org/jira/browse/DRILL-7706?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17086806#comment-17086806 ]
ASF GitHub Bot commented on DRILL-7706:
---------------------------------------
paul-rogers commented on pull request #2060: DRILL-7706: Implement Drill RDBMS Metastore for Tables component
URL: https://github.com/apache/drill/pull/2060#discussion_r410825761
##########
File path: metastore/rdbms-metastore/README.md
##########
@@ -0,0 +1,140 @@
+# RDBMS Metastore
+
+RDBMS Metastore implementation allows to store Drill Metastore metadata in configured RDBMS.
+
+## Configuration
+
+Currently, RDBMS Metastore is not default Drill Metastore implementation.
+To enable RDBMS Metastore create `drill-metastore-override.conf` and indicate RDBMS Metastore class:
+
+```
+drill.metastore: {
+  implementation.class: "org.apache.drill.metastore.rdbms.RdbmsMetastore"
+}
+```
+
+### Connection properties
+
+Data source connection properties allows to indicate how to connect to Drill Metastore database.
+
+`drill.metastore.rdbms.data_source.driver` - driver class name. Required.
+Note, driver must be included into Drill classpath prior to start up for all databases except of SQLite.
+
+`drill.metastore.rdbms.data_source.url` - connection url. Required.
+
+`drill.metastore.rdbms.data_source.username` - database user on whose behalf the connection is
+being made. Optional, if database does not require user to connect.
+
+`drill.metastore.rdbms.data_source.password` - database user's password.
+Optional, if database does not require user's password to connect.
+
+`drill.metastore.rdbms.data_source.properties` - specifies properties which will be used
+during data source creation. See list of available [Hikari properties](https://github.com/brettwooldridge/HikariCP)
+for more details.
+
+### Default configuration
+
+Out of the box, Drill RDBMS Metastore is configured to use embedded file system based SQLite database.
+It will be created locally in user's home directory under `${drill.exec.zk.root}"/metastore` location.
+
+Default setup can be used only in Drill embedded mode.
+If SQLite setup will be used in distributed mode, each drillbit will have it's own SQLite instance
+which will lead to bogus results during queries execution.
+In distributed mode, database instance must be accessible for all drillbits.
+
+### Custom configuration
+
+`drill-metastore-override.conf` is used to customize connection details to the Drill Metastore database.
+See `drill-metastore-override-example.conf` for more details.
+
+#### Example of PostgreSQL configuration
+
+```
+drill.metastore: {
+  implementation.class: "org.apache.drill.metastore.rdbms.RdbmsMetastore",
+  rdbms: {
+    data_source: {
+      driver: "org.postgresql.Driver",
+      url: "jdbc:postgresql://localhost:1234/mydb?currentSchema=drill_metastore",
+      username: "user",
+      password: "password"
+    }
+  }
+}
+```
+
+Note: PostgreSQL JDBC driver must be present in Drill classpath.
+
+#### Example of MySQL configuration
+
+```
+drill.metastore: {
+  implementation.class: "org.apache.drill.metastore.rdbms.RdbmsMetastore",
+  rdbms: {
+    data_source: {
+      driver: "com.mysql.cj.jdbc.Driver",
+      url: "jdbc:mysql://localhost:1234/drill_metastore",
+      username: "user",
+      password: "password"
+    }
+  }
+}
+```
+
+Note: MySQL JDBC driver must be present in Drill classpath.
+
+##### Driver version
+
+For MySQL connector version 6+, use `com.mysql.cj.jdbc.Driver` driver class,
+for older versions use `com.mysql.jdbc.Driver`.
+
+## Tables structure
+
+Drill Metastore consists of components. Currently, only `tables` component is implemented.
+This component provides metadata about Drill tables, including their segments, files, row groups and partitions.
+In Drill `tables` component unit is represented by `TableMetadataUnit` class which is applicable to any metadata type.
+Fields which are not applicable to particular metadata type, remain unset.
+
+In RDBMS Drill Metastore each `tables` component metadata type has it's own table.
+There are five tables: `TABLES`, `SEGMENTS`, `FILES`, `ROW_GROUPS`, `PARTITIONS`.
+These tables structure and primary keys are defined based on fields specific for each metadata type.
+See `src/main/resources/db/changelog/changes/initial_ddls.yaml` for more details.
Review comment:
In the RDBMS implementation of the Drill Metastore, the `tables` component
includes five tables, one for each metadata type. The five tables are:
`TABLES`, `SEGMENTS`, `FILES`, `ROW_GROUPS`, **and** `PARTITIONS`. See
`src/main/resources/db/changelog/changes/initial_ddls.yaml` for the schema and
indexes of each table.
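
As an illustration of the `drill.metastore.rdbms.data_source.properties` option described in the README above, a minimal sketch of a pool-tuning override might look like the following. It assumes the option accepts HikariCP property names as nested keys; `maximumPoolSize` and `connectionTimeout` are standard HikariCP properties, but the values and the exact nesting shown here are illustrative assumptions and are not taken from this pull request.

```
# Hypothetical sketch: passing HikariCP settings through the data source properties.
# The key names under "properties" are assumed to map directly to HikariCP options.
drill.metastore: {
  implementation.class: "org.apache.drill.metastore.rdbms.RdbmsMetastore",
  rdbms: {
    data_source: {
      driver: "org.postgresql.Driver",
      url: "jdbc:postgresql://localhost:1234/mydb?currentSchema=drill_metastore",
      username: "user",
      password: "password",
      properties: {
        # Illustrative values only
        maximumPoolSize: 5,
        connectionTimeout: 30000
      }
    }
  }
}
```

If the nesting differs in the final implementation, `drill-metastore-override-example.conf` would be the place to confirm the expected layout.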
> Drill RDBMS Metastore
> ---------------------
>
> Key: DRILL-7706
> URL: https://issues.apache.org/jira/browse/DRILL-7706
> Project: Apache Drill
> Issue Type: New Feature
> Affects Versions: 1.17.0
> Reporter: Arina Ielchiieva
> Assignee: Arina Ielchiieva
> Priority: Major
> Fix For: 1.18.0
>
>
> Currently, Drill has only one Metastore implementation, based on Iceberg
> tables. Iceberg tables are file-based storage that supports concurrent writes
> / reads but must be placed on a distributed file system.
> This Jira aims to implement a Drill RDBMS Metastore which will store Drill
> Metastore metadata in the database of the user's choice. Currently,
> PostgreSQL and MySQL databases are supported; others might work as well, but
> no testing was done. Also, out of the box, for demonstration / testing purposes,
> Drill will set up a SQLite file-based embedded database, but this is only
> applicable for Drill in embedded mode.