paul-rogers commented on a change in pull request #1986: Additional changes for 
Drill Metastore docs
URL: https://github.com/apache/drill/pull/1986#discussion_r384869742
 
 

 ##########
 File path: 
_docs/performance-tuning/drill-metastore/010-using-drill-metastore.md
 ##########
 @@ -10,6 +10,31 @@ The Metastore is a Beta feature; it is subject to change. 
We encourage you to tr
 Because the Metastore is in Beta, the SQL commands and Metastore formats may 
change in the next release.
 {% include startnote.html %}In Drill 1.17, this feature is supported for 
Parquet tables only and is disabled by default.{% include endnote.html %}
 
+## Drill Metastore introduction
+
+One of the main advantages of Drill is schema-on-read. But Drill can’t handle 
some cases with this approach, there are the issues related to Schema Evolution 
and Schema Changes.
 
 Review comment:
   No capitalization of schema evolution or schema changes.
   
   Should we explain each of these? Actually, schema change is an internal 
effect that results from an external cause: ambiguous schema.
   
   See some recent mailing list discussions for more background. Basically, Drill 
infers schema by sampling the first row. This sampling works well if all files 
have a clear, identical, unambiguous schema. However, if files contain 
different columns due to schema evolution, or columns are null (as for JSON), 
Drill can't infer the schema from the data and the user must provide a hint.
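   
   To make the ambiguity concrete, here is a toy sketch in Python. It is not Drill code: `infer_schema`, the two sample "files", and the `"unknown"` type label are all hypothetical, made up only to show how inferring a schema from the first row of each file can produce schemas that disagree.

```python
def infer_schema(first_row):
    """Toy inference: map each column in the first row to a type name.

    A null value carries no type information, so it is labeled "unknown".
    """
    return {
        name: ("unknown" if value is None else type(value).__name__)
        for name, value in first_row.items()
    }


# Two hypothetical JSON-like files belonging to the same table.
file_a = [{"id": 1, "name": "a"}]
# The second file gained a column (schema evolution) and has a null in "name".
file_b = [{"id": 2, "name": None, "added": 3.5}]

schema_a = infer_schema(file_a[0])
schema_b = infer_schema(file_b[0])

# Columns whose inferred type differs between the files, or that are
# missing from one of them, are the ambiguous ones.
ambiguous = sorted(
    col
    for col in schema_a.keys() | schema_b.keys()
    if schema_a.get(col) != schema_b.get(col)
)

print(schema_a)
print(schema_b)
print(ambiguous)
```

   In this sketch the two files disagree on `name` (a string in one, null in the other) and on `added` (present in only one), which is the kind of case where the data alone does not settle the schema and a user-provided hint is needed.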

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services
