paul-rogers commented on a change in pull request #1986: Additional changes for Drill Metastore docs
URL: https://github.com/apache/drill/pull/1986#discussion_r384869742
##########
File path: _docs/performance-tuning/drill-metastore/010-using-drill-metastore.md
##########
@@ -10,6 +10,31 @@ The Metastore is a Beta feature; it is subject to change. We encourage you to tr
 Because the Metastore is in Beta, the SQL commands and Metastore formats may change in the next release.
 {% include startnote.html %}In Drill 1.17, this feature is supported for Parquet tables only and is disabled by default.{% include endnote.html %}

+## Drill Metastore introduction
+
+One of the main advantages of Drill is schema-on-read. But Drill can’t handle some cases with this approach, there are the issues related to Schema Evolution and Schema Changes.

Review comment:
No capitalization of "schema evolution" or "schema changes". Should we explain each of these? Actually, a schema change is an internal effect that results from an external cause: an ambiguous schema. See some recent mailing list discussions for more background.

Basically, Drill infers schema by sampling the first row. This sampling works well if all files have a clear, identical, unambiguous schema. However, if files contain different columns due to schema evolution, or columns are null (as for JSON), Drill can't infer the schema from the data and the user must provide a hint.
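
To make the point in the review comment concrete, here is a rough Python sketch of first-row schema inference. It is a toy model only, not Drill's reader code: the function names `infer_schema_from_first_row` and `schema_conflicts` and the sample rows are invented for illustration. It shows the three cases mentioned: uniform data, an added column (schema evolution), and a leading null.

```python
from typing import Any, Dict, List

Row = Dict[str, Any]
Schema = Dict[str, str]


def infer_schema_from_first_row(rows: List[Row]) -> Schema:
    """Toy inference: take column names and types from the first row only."""
    if not rows:
        return {}
    first = rows[0]
    # A null value carries no type information, so mark the column as unknown.
    return {
        name: ("unknown" if value is None else type(value).__name__)
        for name, value in first.items()
    }


def schema_conflicts(inferred: Schema, rows: List[Row]) -> List[str]:
    """List the ways later rows disagree with the schema inferred from row one."""
    problems = []
    for index, row in enumerate(rows[1:], start=1):
        for name, value in row.items():
            if name not in inferred:
                problems.append(f"row {index}: unexpected column '{name}'")
            elif value is not None and inferred[name] != type(value).__name__:
                problems.append(
                    f"row {index}: column '{name}' is {type(value).__name__}, "
                    f"inferred {inferred[name]}"
                )
    return problems


# Uniform data: the first row describes every row.
uniform = [{"id": 1, "name": "a"}, {"id": 2, "name": "b"}]

# Schema evolution: a later record carries a column the first one lacks.
evolved = [{"id": 1, "name": "a"}, {"id": 2, "name": "b", "city": "x"}]

# Leading null (as in JSON): the first row gives no type for 'name'.
leading_null = [{"id": 1, "name": None}, {"id": 2, "name": "b"}]

for label, rows in [
    ("uniform", uniform),
    ("evolved", evolved),
    ("leading_null", leading_null),
]:
    schema = infer_schema_from_first_row(rows)
    print(label, schema, schema_conflicts(schema, rows))
```

Only the uniform case comes back with no conflicts. In the other two, the data alone cannot settle the schema, which is the gap a user-provided hint would fill.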