paul-rogers commented on a change in pull request #1953: Add docs for Drill Metastore
URL: https://github.com/apache/drill/pull/1953#discussion_r366679074
##########
File path: _docs/sql-reference/sql-commands/007-analyze-table-refresh-metadata.md
##########
@@ -0,0 +1,158 @@
+---
+title: "ANALYZE TABLE REFRESH METADATA"
+parent: "SQL Commands"
+date: 2020-01-13
+---
+
+Starting from Drill 1.17, you can store table metadata (including schema and computed statistics) into Drill Metastore.
+This metadata will be used when querying a table for more optimal plan creation.
+
+{% include startnote.html %}In Drill 1.17, this feature is supported for Parquet tables only and is disabled by default.{% include endnote.html %}
+
+To enable Drill Metastore usage, the following option `metastore.enabled` should be set to `true`, as shown:
+
+    SET `metastore.enabled` = true;
+
+Alternatively, you can enable the option in the Drill Web UI at `http://<drill-hostname-or-ip-address>:8047/options`.
+
+## Syntax

Review comment:
(We just told the user that a metastore exists and how to enable it. Now we tell them the syntax of commands. We need a transition.)

Once you enable the Metastore, the next step is to populate it with data. Drill can query a table whether that table has a Metastore entry or not. (If you are familiar with Hive, then you know that Hive requires that all tables have Hive Metastore entries before you can query them.) In Drill, only add data to the Metastore when doing so improves query performance. In general, large tables benefit from statistics more than small tables do.

Unlike Hive, Drill does not require you to declare a schema. Instead, Drill infers the schema by scanning your table. Drill not only infers the schema, it optionally computes statistics about your table.

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

With regards,
Apache Git Services