vvysotskyi commented on a change in pull request #1986: Additional changes for Drill Metastore docs
URL: https://github.com/apache/drill/pull/1986#discussion_r387065974
##########
File path: _docs/performance-tuning/drill-metastore/010-using-drill-metastore.md
##########
@@ -48,42 +83,52 @@ drill.metastore: {
 }
 ```
 
-Note, that currently out of box Iceberg Metastore is available and is the default one. Though any custom
- implementation can be added by placing the JAR into classpath which has the implementation of
- `org.apache.drill.metastore.Metastore` interface and indicating custom class in the `drill.metastore.implementation.class`.
-
 ### Metastore Components
 
-Metastore can store metadata for various components: tables, views, etc.
-Current implementation provides fully functioning support for tables component.
-Views component support is not implemented but contains stub methods to show
-how new Metastore components like UDFs, storage plugins, etc. can be added in the future.
+The Drill 1.17 version of the Metastore stores metadata about tables: the table schema and table statistics.
+The Metastore is an active subproject of Drill. See [DRILL-6552](https://issues.apache.org/jira/browse/DRILL-6552) for more information.
+
+### Table Metadata
+
+Table Metadata includes the following info:
+
+ - Table schema, column name, type, nullability, scale and precision if available, and other info. For details please
+   refer to [Schema provisioning]({{site.baseurl}}/docs/create-or-replace-schema/#usage-notes).
+ - Table statistics. This itself has two categories:
+   - Summary statistics: `MIN`, `MAX`, `NULL count`, etc.
+   - Detail statistics: histograms, `NDV`, etc.
 
-### Metastore Tables
+Schema information and summary statistics are also computed and stored for table segments, files, row groups, and partitions.
 
-Metastore Tables component contains metadata about Drill tables, including general information, as well as
-information about table segments, files, row groups, partitions.
+The detailed metadata schema is described [here](https://github.com/apache/drill/tree/master/metastore/metastore-api#metastore-tables).
+You can try out the metadata, and get a sense of what is available, by using the
+ [Inspect the Metastore using `INFORMATION_SCHEMA` tables]({{site.baseurl}}/docs/using-drill-metastore/#inspect-the-metastore-using-information_schema-tables) tutorial.
 
-Full table metadata consists of two major concepts: general information and top-level segments metadata.
-Table general information contains basic table information and corresponds to the `BaseTableMetadata` class.
+Every table described by the Metastore may be a bare file or one or more files that reside in one or more directories.
 
-A table can be non-partitioned and partitioned. Non-partitioned tables have only one top-level segment
-which is called default (`MetadataInfo#DEFAULT_SEGMENT_KEY`). Partitioned tables may have several top-level segments.
-Each top-level segment can include metadata about inner segments, files, row groups, and partitions.
+If a table consists of a single directory or file, then it is non-partitioned. The single directory can contain any number of files.
+Larger tables tend to have subdirectories. Each subdirectory is a partition, and such a table is called "partitioned".
+Please refer to [Exposing Drill Metastore metadata through `INFORMATION_SCHEMA` tables]({{site.baseurl}}/docs/using-drill-metastore/#exposing-drill-metastore-metadata-through-information_schema-tables)
+ for information on how to query partition and segment metadata.
 
-A unique table identifier in Metastore Tables is a combination of storage plugin, workspace, and table name.
-Table metadata inside is grouped by top-level segments, unique identifier of the top-level segment and its metadata
-is storage plugin, workspace, table name, and metadata key.
+A traditional database groups tables into schemas.
+Drill can connect to any number of data sources, each of which may have its own schema.
+As a result, the Metastore labels tables with a combination of (plugin configuration name, workspace name, table name).
+Note that before renaming any of these items, you must delete the table's Metadata entry and recreate it after renaming.
 
 ### Related Session/System Options
 
-The following options are set via `ALTER SYSTEM SET`, or `ALTER SESSION SET` or via the Drill Web console.
+The Metastore provides a number of options to fit your environment. The default options are fine in most cases.
+The options are set via `ALTER SYSTEM SET`, `ALTER SESSION SET` or the Drill Web console.
+
+In general, you should set the options via `ALTER SYSTEM` so that they take effect for all users.

Review comment:
Thanks for pointing this out, replaced it with admin.

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

With regards,
Apache Git Services