This is an automated email from the ASF dual-hosted git repository.
volodymyr pushed a commit to branch gh-pages
in repository https://gitbox.apache.org/repos/asf/drill.git
The following commit(s) were added to refs/heads/gh-pages by this push:
new 0e17eea Update docs for Metastore to point that all format plugins
are supported
0e17eea is described below
commit 0e17eea19aca27b88c98778fcfb7057a45501ab9
Author: Volodymyr Vysotskyi <[email protected]>
AuthorDate: Tue Mar 17 19:56:23 2020 +0200
Update docs for Metastore to point that all format plugins are supported
---
.../035-plugin-configuration-basics.md | 36 ++++
.../plugins/080-rdbms-storage-plugin.md | 72 ++++---
.../plugins/114-image-metadata-format-plugin.md | 84 ++++-----
.../drill-metastore/010-using-drill-metastore.md | 207 +++++++++++++++++++--
.../005-querying-a-file-system-introduction.md | 6 +-
.../query-a-file-system/009-querying-avro-files.md | 28 ++-
.../007-analyze-table-refresh-metadata.md | 17 +-
.../sql-commands/021-create-schema.md | 69 ++++++-
images/7671b34d6e8a4d050f75278f10f1a08.jpg | Bin 0 -> 45877 bytes
9 files changed, 419 insertions(+), 100 deletions(-)
diff --git a/_docs/connect-a-data-source/035-plugin-configuration-basics.md
b/_docs/connect-a-data-source/035-plugin-configuration-basics.md
index 3e1d866..79a8d82 100644
--- a/_docs/connect-a-data-source/035-plugin-configuration-basics.md
+++ b/_docs/connect-a-data-source/035-plugin-configuration-basics.md
@@ -133,6 +133,8 @@ The following table describes the attributes you configure
for storage plugins i
You set the formats attributes, such as skipFirstLine, in the `formats` area
of the storage plugin configuration. When setting attributes for text files,
such as CSV, you also need to set the `sys.options` property
`exec.storage.enable_new_text_reader` to true (the default). For more
information and examples of using formats for text files, see ["Text Files:
CSV, TSV, PSV"]({{site.baseurl}}{{site.baseurl}}/docs/text-files-csv-tsv-psv/).
+# Table Function Parameters
+
## Using the Formats Attributes as Table Function Parameters
In Drill version 1.4 and later, you can also set the formats attributes
defined above on a per query basis. To pass parameters to the format plugin,
use the table function syntax:
@@ -147,6 +149,40 @@ fieldDelimiter => ',', extractHeader => true))``
For more information about format plugin configuration see ["Text Files: CSV,
TSV, PSV"]({{site.baseurl}}{{site.baseurl}}/docs/text-files-csv-tsv-psv/).
+## Specifying the Schema as Table Function Parameter
+
+Table schemas normally reside in the root folder of each table. You can also
specify a schema for an individual query
+ using a table function and specifying the `SCHEMA` property. You can combine
the schema with format plugin properties.
+ The syntax is similar to the [CREATE OR REPLACE
SCHEMA]({{site.baseurl}}/docs/create-or-replace-schema/#syntax):
+
+```
+SELECT a, b FROM TABLE (table_name(
+SCHEMA => 'inline=(column_name data_type [nullability] [format] [default] [properties {prop='val', ...}])'))
+```
+
+You can specify the schema inline within the query. For example:
+
+```
+select * from table(dfs.tmp.`text_table`(
+schema => 'inline=(col1 date properties {`drill.format` = `yyyy-MM-dd`})
+properties {`drill.strict` = `false`}'))
+```
+
+Alternatively, you can specify the path to a schema file. For example:
+
+```
+select * from table(dfs.tmp.`text_table`(schema => 'path=`/tmp/my_schema`'))
+```
+
+The following example demonstrates applying a provided schema along with format plugin table function parameters.
+Suppose that you have a CSV file with headers and a custom extension, `csvh-test`. You can combine the schema with format plugin properties:
+
+```
+select * from table(dfs.tmp.`cars.csvh-test`(type => 'text',
+fieldDelimiter => ',', extractHeader => true,
+schema => 'inline=(col1 date)'))
+```
+
## Using Other Attributes
The configuration of other attributes, such as `size.calculator.enabled` in
the `hbase` plugin and `configProps` in the `hive` plugin, are
implementation-dependent and beyond the scope of this document.
diff --git a/_docs/connect-a-data-source/plugins/080-rdbms-storage-plugin.md
b/_docs/connect-a-data-source/plugins/080-rdbms-storage-plugin.md
index ea84e87..f3d4bad 100644
--- a/_docs/connect-a-data-source/plugins/080-rdbms-storage-plugin.md
+++ b/_docs/connect-a-data-source/plugins/080-rdbms-storage-plugin.md
@@ -1,6 +1,6 @@
---
title: "RDBMS Storage Plugin"
-date: 2018-12-08
+date: 2020-03-17
parent: "Connect a Data Source"
---
Apache Drill supports querying a number of RDBMS instances. This allows you to
connect your traditional databases to your Drill cluster so you can have a
single view of both your relational and NoSQL datasources in a single system.
@@ -9,14 +9,25 @@ As with any source, Drill supports joins within and between
all systems. Drill a
## Using the RDBMS Storage Plugin
-Drill is designed to work with any relational datastore that provides a JDBC
driver. Drill is actively tested with Postgres, MySQL, Oracle, MSSQL and Apache
Derby. For each system, you will follow three basic steps for setup:
+Drill is designed to work with any relational datastore that provides a JDBC
driver. Drill is actively tested with
+ Postgres, MySQL, Oracle, MSSQL, Apache Derby and H2. For each system, you
will follow three basic steps for setup:
1. [Install Drill]({{ site.baseurl
}}/docs/installing-drill-in-embedded-mode), if you do not already have it
installed.
- 2. Copy your database's JDBC driver into the jars/3rdparty directory.
(You'll need to do this on every node.)
+ 2. Copy your database's JDBC driver into the `jars/3rdparty` directory.
(You'll need to do this on every node.)
3. Restart Drill. See [Starting Drill in Distributed
Mode]({{site.baseurl}}/docs/starting-drill-in-distributed-mode/).
- 4. Add a new storage configuration to Drill through the Web UI. Example
configurations for [Oracle](#Example-Oracle-Configuration), [SQL
Server](#Example-SQL-Server-Configuration),
[MySQL](#Example-MySQL-Configuration) and
[Postgres](#Example-Postgres-Configuration) are provided below.
-
-**Example: Working with MySQL**
+ 4. Add a new storage configuration to Drill through the Web UI. Example
configurations for [Oracle](#example-oracle-configuration), [SQL
Server](#example-sql-server-configuration),
[MySQL](#example-mysql-configuration) and
[Postgres](#example-postgres-configuration) are provided below.
+
+## Setting data source parameters in the storage plugin configuration
+
+Starting from Drill 1.18.0, the JDBC storage plugin supports a new configuration property, `sourceParameters`, which
+ allows setting the data source parameters described in [HikariCP](https://github.com/brettwooldridge/HikariCP#configuration-knobs-baby).
+ Parameters with incorrect names, or parameter values of an incorrect data type or with illegal values, will cause
+ the storage plugin to fail to start.
+
+See the [Example of Postgres Configuration with `sourceParameters` configuration property](#example-of-postgres-configuration-with-sourceparameters-configuration-property)
+section for an example of its usage.
+
+### Example: Working with MySQL
Drill communicates with MySQL through the JDBC driver using the configuration
that you specify in the Web UI or through the [REST
API]({{site.baseurl}}/docs/plugin-configuration-basics/#storage-plugin-rest-api).
@@ -29,7 +40,7 @@ To configure the JDBC storage plugin:
1. On the Storage tab, enter a name in **New Storage Plugin**. For example,
enter `myplugin`.
Each configuration registered with Drill must have a distinct name. Names
are case-sensitive.
- {% include startnote.html %}The URL differs depending on your installation
and configuration. See the [example configurations](#Example-Configurations)
below for examples.{% include endnote.html %}
+ {% include startnote.html %}The URL differs depending on your installation
and configuration. See the example configurations below for examples.{% include
endnote.html %}
1. Click **Create**.
1. In Configuration, set the required properties using JSON formatting as
shown in the following example. Change the properties to match your
environment.
@@ -62,7 +73,7 @@ You can use the performance_schema database, which is
installed with MySQL to qu
## Example Configurations
-**Example Oracle Configuration**
+### Example Oracle Configuration
Download and install Oracle's Thin
[ojdbc7.12.1.0.2.jar](http://www.oracle.com/technetwork/database/features/jdbc/default-2280470.html)
driver and copy it to all nodes in your cluster.
@@ -70,10 +81,10 @@ Download and install Oracle's Thin
[ojdbc7.12.1.0.2.jar](http://www.oracle.com/t
type: "jdbc",
enabled: true,
driver: "oracle.jdbc.OracleDriver",
- url:"jdbc:oracle:thin:user/[email protected]:1521/ORCL"
+ url: "jdbc:oracle:thin:user/[email protected]:1521/ORCL"
}
-**Example SQL Server Configuration**
+### Example SQL Server Configuration
For SQL Server, Drill has been tested with Microsoft's
[sqljdbc41.4.2.6420.100.jar](https://www.microsoft.com/en-US/download/details.aspx?id=11774)
driver. Copy this jar file to all Drillbits.
@@ -83,12 +94,12 @@ For SQL Server, Drill has been tested with Microsoft's
[sqljdbc41.4.2.6420.100.
type: "jdbc",
enabled: true,
driver: "com.microsoft.sqlserver.jdbc.SQLServerDriver",
- url:"jdbc:sqlserver://1.2.3.4:1433;databaseName=mydatabase",
- username:"user",
- password:"password"
+ url: "jdbc:sqlserver://1.2.3.4:1433;databaseName=mydatabase",
+ username: "user",
+ password: "password"
}
-**Example MySQL Configuration**
+### Example MySQL Configuration
For MySQL, Drill has been tested with MySQL's
[mysql-connector-java-5.1.37-bin.jar](http://dev.mysql.com/downloads/connector/j/)
driver. Copy this to all nodes.
@@ -96,14 +107,15 @@ For MySQL, Drill has been tested with MySQL's
[mysql-connector-java-5.1.37-bin.j
type: "jdbc",
enabled: true,
driver: "com.mysql.jdbc.Driver",
- url:"jdbc:mysql://1.2.3.4",
- username:"user",
- password:"password"
+ url: "jdbc:mysql://1.2.3.4",
+ username: "user",
+ password: "password"
}
-**Example Postgres Configuration**
+### Example Postgres Configuration
-For Postgres, Drill has been tested with Postgres's
[9.1-901-1.jdbc4](http://central.maven.org/maven2/org/postgresql/postgresql/)
driver (any recent driver should work). Copy this driver file to all nodes.
+Drill is tested with the Postgres driver version
[42.2.11](https://mvnrepository.com/artifact/org.postgresql/postgresql) (any
recent driver should work).
+ Download and copy this driver jar to the `jars/3rdparty` folder on all nodes.
{% include startnote.html %}You'll need to provide a database name as part of
your JDBC connection string for Drill to correctly expose Postgres tables.{%
include endnote.html %}
@@ -111,9 +123,9 @@ For Postgres, Drill has been tested with Postgres's
[9.1-901-1.jdbc4](http://cen
type: "jdbc",
enabled: true,
driver: "org.postgresql.Driver",
- url:"jdbc:postgresql://1.2.3.4/mydatabase",
- username:"user",
- password:"password"
+ url: "jdbc:postgresql://1.2.3.4/mydatabase",
+ username: "user",
+ password: "password"
}
You may need to qualify a table name with a schema name for Drill to return
data. For example, when querying a table named ips, you must issue the query
against public.ips, as shown in the following example:
@@ -142,4 +154,20 @@ You may need to qualify a table name with a schema name
for Drill to return data
| 2 | 1.2.3.5 |
+-------+----------+
+### Example of Postgres Configuration with `sourceParameters` configuration
property
+ {
+ type: "jdbc",
+ enabled: true,
+ driver: "org.postgresql.Driver",
+ url: "jdbc:postgresql://1.2.3.4/mydatabase?defaultRowFetchSize=2",
+ username: "user",
+ password: "password",
+ sourceParameters: {
+ "minimumIdle": 5,
+ "autoCommit": false,
+ "connectionTestQuery": "select version() as postgresql_version",
+ "dataSource.cachePrepStmts": true,
+ "dataSource.prepStmtCacheSize": 250
+ }
+ }
diff --git
a/_docs/connect-a-data-source/plugins/114-image-metadata-format-plugin.md
b/_docs/connect-a-data-source/plugins/114-image-metadata-format-plugin.md
index 27c53e4..313116f 100644
--- a/_docs/connect-a-data-source/plugins/114-image-metadata-format-plugin.md
+++ b/_docs/connect-a-data-source/plugins/114-image-metadata-format-plugin.md
@@ -1,6 +1,6 @@
---
title: "Image Metadata Format Plugin"
-date: 2018-06-13 18:31:03 UTC
+date: 2020-03-17
parent: "Connect a Data Source"
---
@@ -39,7 +39,7 @@ To configure Drill to read image metadata, you must modify
the extensions sectio
**Note:** The result does not include file names, but you can use [implicit
columns]({{site.baseurl}}/docs/querying-a-file-system-introduction/#implicit-columns)
to get file names, full paths, fully qualified names, and file suffixes.
-##Attributes
+## Attributes
The following table lists configuration attributes:
@@ -49,54 +49,53 @@ fileSystemMetadata|true|Set to true to extract filesystem
metadata including the
descriptive|true|Set to true to extract metadata in a human-readable string
format. Set false to extract metadata in a machine-readable typed format.
timeZone|null|Specify the time zone to interpret the timestamp with no time
zone information. If the timestamp includes the time zone information, this
value is ignored. If null is set, the local time zone is used.
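+
+For example, a `formats` entry that sets these attributes might look like the following sketch (the extensions list is illustrative):
+
+    "image": {
+      "type": "image",
+      "extensions": ["jpg", "jpeg", "png"],
+      "fileSystemMetadata": true,
+      "descriptive": true,
+      "timeZone": null
+    }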
-##Examples
+## Examples
+
+To follow along with the examples, start by downloading the following image to your `/tmp` directory.
+
+![]({{ site.baseurl }}/images/7671b34d6e8a4d050f75278f10f1a08.jpg)
A Drill query on a JPEG file with the property descriptive: true
- 0: jdbc:drill:zk=local> select FileName, * from
dfs.`4349313028_f69ffa0257_o.jpg`;
-
+----------+----------+--------------+--------+------------+-------------+--------------+----------+-----------+------------+-----------+----------+----------+------------+-----------+------------+-----------------+-----------------+------+------+----------+------------+------------------+-----+---------------+-----------+------+---------+----------+
- | FileName | FileSize | FileDateTime | Format | PixelWidth |
PixelHeight | BitsPerPixel | DPIWidth | DPIHeight | Orientaion | ColorMode |
HasAlpha | Duration | VideoCodec | FrameRate | AudioCodec | AudioSampleSize |
AudioSampleRate | JPEG | JFIF | ExifIFD0 | ExifSubIFD | Interoperability | GPS
| ExifThumbnail | Photoshop | IPTC | Huffman | FileType |
-
+----------+----------+--------------+--------+------------+-------------+--------------+----------+-----------+------------+-----------+----------+----------+------------+-----------+------------+-----------------+-----------------+------+------+----------+------------+------------------+-----+---------------+-----------+------+---------+----------+
- | 4349313028_f69ffa0257_o.jpg | 257213 bytes | Fri Mar 09 12:09:34
+08:00 2018 | JPEG | 1199 | 800 | 24 | 96 | 96 | Unknown (0) | RGB | false |
00:00:00 | Unknown | 0 | Unknown | 0 | 0 |
{"CompressionType":"Baseline","DataPrecision":"8 bits","ImageHeight":"800
pixels","ImageWidth":"1199 pixels","NumberOfComponents":"3","Component1":"Y
component: Quantization table 0, Sampling factors 2 horiz/2
vert","Component2":"Cb component: Quantization table 1, Sampling factors 1
horiz/1 vert" [...]
-
+----------+----------+--------------+--------+------------+-------------+--------------+----------+-----------+------------+-----------+----------+----------+------------+-----------+------------+-----------------+-----------------+------+------+----------+------------+------------------+-----+---------------+-----------+------+---------+----------+
-
+ select FileName, * from dfs.tmp.`7671b34d6e8a4d050f75278f10f1a08.jpg`;
+
+-------------------------------------+-------------+---------------------------------+--------+------------+-------------+--------------+-------------+----------+-----------+-----------+----------+----------+------------+-----------+------------+-----------------+-----------------+----------------------------------------------------------------------------------+------------------------------------------------------------------------------+---------------------------------------
[...]
+ | FileName | FileSize |
FileDateTime | Format | PixelWidth | PixelHeight | BitsPerPixel |
Orientaion | DPIWidth | DPIHeight | ColorMode | HasAlpha | Duration |
VideoCodec | FrameRate | AudioCodec | AudioSampleSize | AudioSampleRate |
JPEG |
JpegComment |
[...]
+
+-------------------------------------+-------------+---------------------------------+--------+------------+-------------+--------------+-------------+----------+-----------+-----------+----------+----------+------------+-----------+------------+-----------------+-----------------+----------------------------------------------------------------------------------+------------------------------------------------------------------------------+---------------------------------------
[...]
+ | 7671b34d6e8a4d050f75278f10f1a08.jpg | 45877 bytes | Tue Mar 17
21:37:09 +02:00 2020 | JPEG | 604 | 453 | 24 |
Unknown (0) | 0 | 0 | RGB | false | 00:00:00 | Unknown
| 0 | Unknown | 0 | 0 |
{"CompressionType":"Baseline","DataPrecision":"8 bits","ImageHeight":"453
pixels","ImageWidth":"604 pixels","NumberOfComponents":"3","Component1":"Y
component: Quantization table 0, Sampling factors 2 [...]
+
+-------------------------------------+-------------+---------------------------------+--------+------------+-------------+--------------+-------------+----------+-----------+-----------+----------+----------+------------+-----------+------------+-----------------+-----------------+----------------------------------------------------------------------------------+------------------------------------------------------------------------------+---------------------------------------
[...]
A Drill query on a JPEG file with the property descriptive: false
- 0: jdbc:drill:zk=local> select FileName, * from
dfs.`4349313028_f69ffa0257_o.jpg`;
-
+----------+----------+--------------+--------+------------+-------------+--------------+----------+-----------+------------+-----------+----------+----------+------------+-----------+------------+-----------------+-----------------+------+------+----------+------------+------------------+-----+---------------+-----------+------+---------+----------+
- | FileName | FileSize | FileDateTime | Format | PixelWidth |
PixelHeight | BitsPerPixel | DPIWidth | DPIHeight | Orientaion | ColorMode |
HasAlpha | Duration | VideoCodec | FrameRate | AudioCodec | AudioSampleSize |
AudioSampleRate | JPEG | JFIF | ExifIFD0 | ExifSubIFD | Interoperability | GPS
| ExifThumbnail | Photoshop | IPTC | Huffman | FileType |
-
+----------+----------+--------------+--------+------------+-------------+--------------+----------+-----------+------------+-----------+----------+----------+------------+-----------+------------+-----------------+-----------------+------+------+----------+------------+------------------+-----+---------------+-----------+------+---------+----------+
- | 4349313028_f69ffa0257_o.jpg | 257213 | 2018-03-09 04:09:34.0 | JPEG |
1199 | 800 | 24 | 96.0 | 96.0 | 0 | RGB | false | 0 | Unknown | 0.0 | Unknown |
0 | 0.0 |
{"CompressionType":0,"DataPrecision":8,"ImageHeight":800,"ImageWidth":1199,"NumberOfComponents":3,"Component1":{"ComponentId":1,"HorizontalSamplingFactor":2,"VerticalSamplingFactor":2,"QuantizationTableNumber":0},"Component2":{"ComponentId":2,"HorizontalSamplingFactor":1,"VerticalSamplingFactor":1,"QuantizationTableNumber
[...]
-
+----------+----------+--------------+--------+------------+-------------+--------------+----------+-----------+------------+-----------+----------+----------+------------+-----------+------------+-----
+ select FileName, * from dfs.tmp.`7671b34d6e8a4d050f75278f10f1a08.jpg`;
+
+-------------------------------------+----------+-----------------------+--------+------------+-------------+--------------+------------+----------+-----------+-----------+----------+----------+------------+-----------+------------+-----------------+-----------------+----------------------------------------------------------------------------------+------------------------------------------------------------------------------+-----------------------------------------------------
[...]
+ | FileName | FileSize | FileDateTime
| Format | PixelWidth | PixelHeight | BitsPerPixel | Orientaion | DPIWidth |
DPIHeight | ColorMode | HasAlpha | Duration | VideoCodec | FrameRate |
AudioCodec | AudioSampleSize | AudioSampleRate |
JPEG |
JpegComment |
JFIF [...]
+
+-------------------------------------+----------+-----------------------+--------+------------+-------------+--------------+------------+----------+-----------+-----------+----------+----------+------------+-----------+------------+-----------------+-----------------+----------------------------------------------------------------------------------+------------------------------------------------------------------------------+-----------------------------------------------------
[...]
+ | 7671b34d6e8a4d050f75278f10f1a08.jpg | 45877 | 2020-03-17
19:37:09.0 | JPEG | 604 | 453 | 24 | 0 |
0.0 | 0.0 | RGB | false | 0 | Unknown | 0.0
| Unknown | 0 | 0.0 |
{"CompressionType":0,"DataPrecision":8,"ImageHeight":453,"ImageWidth":604,"NumberOfComponents":3,"Component1":{"ComponentId":1,"HorizontalSamplingFactor":2,"VerticalSamplingFactor":2,"QuantizationTableNumber":0},"C
[...]
+
+-------------------------------------+----------+-----------------------+--------+------------+-------------+--------------+------------+----------+-----------+-----------+----------+----------+------------+-----------+------------+-----------------+-----------------+----------------------------------------------------------------------------------+------------------------------------------------------------------------------+-----------------------------------------------------
[...]
Retrieving GPS location data from the Exif metadata for the use of GIS
functions.
- 0: jdbc:drill:zk=local> select t.GPS.GPSLatitude as lat,
t.GPS.GPSLongitude as lon from dfs.`4349313028_f69ffa0257_o.jpg` t;
- +--------------------+----------------------+
- | lat | lon |
- +--------------------+----------------------+
- | 47.53777313232332 | -122.03510284423795 |
- +--------------------+----------------------+
-
-Retrieving the images that are larger than 640 x 480 pixels.
-
- 0: jdbc:drill:zk=local> select FileName, PixelWidth, PixelHeight from
dfs.`/images/*.png` where PixelWidth >= 640 and PixelHeight >= 480;
- +--------------------------+-------------+--------------+
- | FileName | PixelWidth | PixelHeight |
- +--------------------------+-------------+--------------+
- | 1.png | 2788 | 1758 |
- | 1500x500.png | 1500 | 500 |
- | 2.png | 2788 | 1758 |
- | 9784873116914_1.png | 874 | 1240 |
- | Driven-Example-Load.png | 1208 | 970 |
- | features-diagram.png | 1170 | 644 |
- | hal1.png | 1223 | 772 |
- | hal2.png | 1184 | 768 |
- | image-3.png | 1200 | 771 |
- | image-4.png | 1200 | 771 |
- | image002.png | 1689 | 695 |
- +--------------------------+-------------+--------------+
+ select t.GPS.GPSLatitude as lat, t.GPS.GPSLongitude as lon from
dfs.tmp.`7671b34d6e8a4d050f75278f10f1a08.jpg` t;
+ +-------------------+--------------------+
+ | lat | lon |
+ +-------------------+--------------------+
+ | 50.46355547157135 | 30.508668422733077 |
+ +-------------------+--------------------+
+
+Download all `png` images from the [Logos]({{ site.baseurl }}/images/logos/) page and place them in the `/tmp/logos`
+ directory to follow along with the next example.
+
+The following example query retrieves the images that are smaller than 640 x 480 pixels:
+
+ select FileName, PixelWidth, PixelHeight from dfs.tmp.`logos` where
PixelWidth < 640 and PixelHeight < 480;
+ +---------------------+------------+-------------+
+ | FileName | PixelWidth | PixelHeight |
+ +---------------------+------------+-------------+
+ | redbusLogo.png | 500 | 325 |
+ | option3-io-logo.png | 194 | 76 |
+ | sanchezLogo.png | 235 | 85 |
+ | IORA_NUS.png | 375 | 253 |
+ +---------------------+------------+-------------+
## Supported File Formats
@@ -798,6 +797,3 @@ EPS.ImageWidth|INTEGER|Width of the image in pixels
EPS.ImageHeight|INTEGER|Height of the image in pixels
EPS.ColorType|INTEGER|Color type: `1` (Bitmap/grayscale), `2` (Lab), `3` (RGB) or `4` (CMYK)
EPS.RamSize|INTEGER|Ram Size to keep the image in bytes
-
-
-
diff --git
a/_docs/performance-tuning/drill-metastore/010-using-drill-metastore.md
b/_docs/performance-tuning/drill-metastore/010-using-drill-metastore.md
index e12d31b..2d4a28c 100644
--- a/_docs/performance-tuning/drill-metastore/010-using-drill-metastore.md
+++ b/_docs/performance-tuning/drill-metastore/010-using-drill-metastore.md
@@ -1,14 +1,17 @@
---
title: "Using Drill Metastore"
parent: "Drill Metastore"
-date: 2020-03-03
+date: 2020-03-17
---
Drill 1.17 introduces the Drill Metastore which stores the table schema and
table statistics. Statistics allow Drill to better create optimal query plans.
-The Metastore is a Beta feature; it is subject to change. We encourage you to
try it and provide feedback.
-Because the Metastore is in Beta, the SQL commands and Metastore formats may
change in the next release.
-{% include startnote.html %}In Drill 1.17, this feature is supported for
Parquet tables only and is disabled by default.{% include endnote.html %}
+The Metastore is a beta feature and is subject to change.
+In particular, the SQL commands and Metastore format may change in response to user experience and feedback.
+{% include startnote.html %}
+In Drill 1.17, the Metastore supports only tables in Parquet format. The feature is disabled by default.
+In Drill 1.18, the Metastore supports all format plugins of the file system storage plugin (except MaprDB). The feature is still disabled by default.
+{% include endnote.html %}
## Drill Metastore introduction
@@ -103,13 +106,13 @@ Schema information and summary statistics also computed
and stored for table seg
The detailed metadata schema is described
[here](https://github.com/apache/drill/tree/master/metastore/metastore-api#metastore-tables).
You can try out the metadata to get a sense of what is available, by using the
- [Inspect the Metastore using `INFORMATION_SCHEMA`
tables]({{site.baseurl}}/docs/using-drill-metastore/#inspect-the-metastore-using-information_schema-tables)
tutorial.
+ [Inspect the Metastore using `INFORMATION_SCHEMA`
tables](#inspect-the-metastore-using-information_schema-tables) tutorial.
Every table described by the Metastore may be a bare file or one or more files
that reside in one or more directories.
If a table consists of a single directory or file, then it is non-partitioned.
The single directory can contain any number of files.
Larger tables tend to have subdirectories. Each subdirectory is a partition, and such a table is called "partitioned".
-Please refer to [Exposing Drill Metastore metadata through
`INFORMATION_SCHEMA`
tables]({{site.baseurl}}/docs/using-drill-metastore/#exposing-drill-metastore-metadata-through-information_schema-tables)
+Please refer to [Exposing Drill Metastore metadata through `INFORMATION_SCHEMA` tables](#exposing-drill-metastore-metadata-through-information_schema-tables)
+for information on how to query partition and segment metadata.
A traditional database divides tables into schemas and tables.
@@ -117,6 +120,45 @@ Drill can connect to any number of data sources, each of
which may have its own
As a result, the Metastore labels tables with a combination of (plugin
configuration name, workspace name, table name).
Note that before renaming any of these items, you must delete the table's metadata entry and recreate it after renaming.
+### Using schema provisioning feature with Drill Metastore
+
+The Drill Metastore holds both schema and statistics information for a table.
The `ANALYZE` command can infer the table
+ schema for well-defined tables (such as many Parquet tables). Some tables are
too complex or variable for Drill's
+ schema inference to work well. For example, JSON tables often omit fields or
have long runs of nulls so that Drill
+ cannot determine column types. In these cases, you can specify the correct
schema based on your knowledge of the
+ table's structure. You specify a schema in the `ANALYZE` command using the
+ [Schema
provisioning]({{site.baseurl}}/docs/plugin-configuration-basics/#specifying-the-schema-as-table-function-parameter)
syntax.
+
+Please refer to [Provisioning schema for Drill
Metastore](#provisioning-schema-for-drill-metastore) for examples of usage.
+
+### Schema priority
+
+Drill uses metadata during both query planning and execution. Drill gives you
multiple ways to provide a schema.
+
+When you run the `ANALYZE TABLE` command, Drill determines the table schema to be stored in the Metastore using the following rules, in priority order:
+
+* A schema provided in the table function.
+* A schema file, created with `CREATE OR REPLACE SCHEMA`, in the table root
directory.
+* Schema inferred from file data.
+
+To plan a query, Drill requires information about your file partitions (if
any) and about row and column cardinality.
+Drill does not use the provided schema for planning because the schema does not include this metadata. Instead, at plan time Drill
+obtains metadata from one of the following, again in priority order:
+
+* The Drill Metastore, if available.
+* Inferred from file data. Drill scans the table's directory structure to
identify partitions.
+ Drill estimates row counts based on the file size. Drill uses default
estimates for column cardinality.
+
+At query execution time, a schema tells Drill the shape of your data and how
that data should be converted to Drill's SQL types.
+Your choices for execution-time schema, in priority order, are:
+
+* With a table function:
+ - specify an inline schema
+ - specify the path to the schema file.
+* With a schema file, created with `CREATE OR REPLACE SCHEMA`, in the table
root directory.
+* Using the schema from the Drill Metastore, if available.
+* Infer the schema directly from file data.
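+
+For example, this sketch (the table and column names are illustrative) supplies the highest-priority execution-time schema inline through a table function:
+
+```
+SELECT * FROM table(dfs.tmp.`text_table`(schema => 'inline=(col1 DATE)'));
+```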
+
### Related Session/System Options
The Metastore provides a number of options to fit your environment. The
default options are fine in most cases.
@@ -179,7 +221,7 @@ Incremental analysis will compute metadata only for files
and partitions changed
The command will return the following message if table statistics are
up-to-date:
```
-apache drill (dfs.tmp)> analyze table lineitem refresh metadata;
+ANALYZE TABLE `lineitem` REFRESH METADATA;
+-------+---------------------------------------------------------+
| ok | summary |
+-------+---------------------------------------------------------+
@@ -250,6 +292,9 @@ A table can be divided into directories, called
"partitions". The `PARTITIONS` t
- Applies to tables stored as Parquet files and only when stored in the `DFS`
storage plugin.
- Disabled by default. You must enable this feature through the
`metastore.enabled` system/session option.
+### Limitations of the 1.18 release
+ - Applies to all file system storage plugin formats except for MaprDB.
+
### Cheat sheet of `ANALYZE TABLE` commands
- Add a new table with `ANALYZE TABLE dfs.tmp.lineitem REFRESH METADATA`
command.
@@ -315,7 +360,7 @@ Run the [ANALYZE
TABLE]({{site.baseurl}}/docs/analyze-table-refresh-metadata) co
be computed and stored into the Drill Metastore:
```
-apache drill> ANALYZE TABLE dfs.tmp.lineitem REFRESH METADATA;
+ANALYZE TABLE dfs.tmp.lineitem REFRESH METADATA;
+------+-------------------------------------------------------------+
| ok | summary |
+------+-------------------------------------------------------------+
@@ -339,7 +384,7 @@ Now that we've collected table metadata, we can use it when
we query the table,
Rerun [ANALYZE TABLE]({{site.baseurl}}/docs/analyze-table-refresh-metadata)
command on the `lineitem` table:
```
-apache drill> ANALYZE TABLE dfs.tmp.lineitem REFRESH METADATA;
+ANALYZE TABLE dfs.tmp.lineitem REFRESH METADATA;
+-------+---------------------------------------------------------+
| ok | summary |
+-------+---------------------------------------------------------+
@@ -353,7 +398,7 @@ apache drill> ANALYZE TABLE dfs.tmp.lineitem REFRESH
METADATA;
Run the following query to inspect `lineitem` table metadata from `TABLES`
table stored in the Metastore:
```
-apache drill> SELECT * FROM INFORMATION_SCHEMA.`TABLES` WHERE
TABLE_NAME='lineitem';
+SELECT * FROM INFORMATION_SCHEMA.`TABLES` WHERE TABLE_NAME='lineitem';
+---------------+--------------+------------+------------+--------------+---------------+----------+-----------------------+
| TABLE_CATALOG | TABLE_SCHEMA | TABLE_NAME | TABLE_TYPE | TABLE_SOURCE |
LOCATION | NUM_ROWS | LAST_MODIFIED_TIME |
+---------------+--------------+------------+------------+--------------+---------------+----------+-----------------------+
@@ -365,7 +410,7 @@ apache drill> SELECT * FROM INFORMATION_SCHEMA.`TABLES`
WHERE TABLE_NAME='lineit
To obtain columns with their types and descriptions within the `lineitem`
table, run the following query:
```
-apache drill> SELECT * FROM INFORMATION_SCHEMA.`COLUMNS` WHERE
TABLE_NAME='lineitem';
+SELECT * FROM INFORMATION_SCHEMA.`COLUMNS` WHERE TABLE_NAME='lineitem';
+---------------+--------------+------------+-----------------+------------------+----------------+-------------+-------------------+--------------------------+------------------------+-------------------+-------------------------+---------------+--------------------+---------------+--------------------+-------------+---------------+-----------+--------------+---------------------------------------------+-----------+-------------------+-----------+
| TABLE_CATALOG | TABLE_SCHEMA | TABLE_NAME | COLUMN_NAME |
ORDINAL_POSITION | COLUMN_DEFAULT | IS_NULLABLE | DATA_TYPE |
CHARACTER_MAXIMUM_LENGTH | CHARACTER_OCTET_LENGTH | NUMERIC_PRECISION |
NUMERIC_PRECISION_RADIX | NUMERIC_SCALE | DATETIME_PRECISION | INTERVAL_TYPE |
INTERVAL_PRECISION | COLUMN_SIZE | COLUMN_FORMAT | NUM_NULLS | MIN_VAL |
MAX_VAL | NDV | EST_NUM_NON_NULLS |
IS_NESTED |
+---------------+--------------+------------+-----------------+------------------+----------------+-------------+-------------------+--------------------------+------------------------+-------------------+-------------------------+---------------+--------------------+---------------+--------------------+-------------+---------------+-----------+--------------+---------------------------------------------+-----------+-------------------+-----------+
@@ -381,7 +426,7 @@ apache drill> SELECT * FROM INFORMATION_SCHEMA.`COLUMNS`
WHERE TABLE_NAME='linei
The sample `lineitem` table has two partitions. The `PARTITIONS` table
contains an entry for each directory:
```
-apache drill (information_schema)> SELECT * FROM
INFORMATION_SCHEMA.`PARTITIONS` WHERE TABLE_NAME='lineitem';
+SELECT * FROM INFORMATION_SCHEMA.`PARTITIONS` WHERE TABLE_NAME='lineitem';
+---------------+--------------+------------+--------------+---------------+---------------------+------------------+-----------------+------------------+-----------------------+
| TABLE_CATALOG | TABLE_SCHEMA | TABLE_NAME | METADATA_KEY | METADATA_TYPE |
METADATA_IDENTIFIER | PARTITION_COLUMN | PARTITION_VALUE | LOCATION |
LAST_MODIFIED_TIME |
+---------------+--------------+------------+--------------+---------------+---------------------+------------------+-----------------+------------------+-----------------------+
@@ -398,7 +443,7 @@ Once we are done exploring metadata we can drop the
metadata for the `lineitem`
Table metadata may be dropped using `ANALYZE TABLE DROP METADATA` command:
```
-apache drill> ANALYZE TABLE dfs.tmp.lineitem DROP METADATA;
+ANALYZE TABLE dfs.tmp.lineitem DROP METADATA;
+------+----------------------------------------+
| ok | summary |
+------+----------------------------------------+
@@ -417,7 +462,7 @@ Next let's gather metadata for a subset of the columns in
the `lineitem` table.
For the case when metadata for several columns should be computed and stored
into the Metastore, the following command may be used:
```
-apache drill (information_schema)> ANALYZE TABLE dfs.tmp.lineitem
COLUMNS(l_orderkey, l_partkey) REFRESH METADATA;
+ANALYZE TABLE dfs.tmp.lineitem COLUMNS(l_orderkey, l_partkey) REFRESH METADATA;
+------+-------------------------------------------------------------+
| ok | summary |
+------+-------------------------------------------------------------+
@@ -430,7 +475,7 @@ Now, check, that metadata is collected only for specified
columns (`MIN_VAL`, `M
columns are present:
```
-apache drill (information_schema)> SELECT * FROM INFORMATION_SCHEMA.`COLUMNS`
WHERE TABLE_NAME='lineitem';
+SELECT * FROM INFORMATION_SCHEMA.`COLUMNS` WHERE TABLE_NAME='lineitem';
+---------------+--------------+------------+-----------------+------------------+----------------+-------------+-------------------+--------------------------+------------------------+-------------------+-------------------------+---------------+--------------------+---------------+--------------------+-------------+---------------+-----------+---------+---------+-----------+-------------------+-----------+
| TABLE_CATALOG | TABLE_SCHEMA | TABLE_NAME | COLUMN_NAME |
ORDINAL_POSITION | COLUMN_DEFAULT | IS_NULLABLE | DATA_TYPE |
CHARACTER_MAXIMUM_LENGTH | CHARACTER_OCTET_LENGTH | NUMERIC_PRECISION |
NUMERIC_PRECISION_RADIX | NUMERIC_SCALE | DATETIME_PRECISION | INTERVAL_TYPE |
INTERVAL_PRECISION | COLUMN_SIZE | COLUMN_FORMAT | NUM_NULLS | MIN_VAL |
MAX_VAL | NDV | EST_NUM_NON_NULLS | IS_NESTED |
+---------------+--------------+------------+-----------------+------------------+----------------+-------------+-------------------+--------------------------+------------------------+-------------------+-------------------------+---------------+--------------------+---------------+--------------------+-------------+---------------+-----------+---------+---------+-----------+-------------------+-----------+
@@ -442,3 +487,135 @@ apache drill (information_schema)> SELECT * FROM
INFORMATION_SCHEMA.`COLUMNS` WH
+---------------+--------------+------------+-----------------+------------------+----------------+-------------+-------------------+--------------------------+------------------------+-------------------+-------------------------+---------------+--------------------+---------------+--------------------+-------------+---------------+-----------+---------+---------+-----------+-------------------+-----------+
17 rows selected (0.183 seconds)
```
+
+### Provisioning schema for Drill Metastore
+
+#### Directory and File Setup
+
+Ensure you have configured the file system storage plugin as described here:
+ [Connecting Drill to a File
System]({{site.baseurl}}/docs/file-system-storage-plugin/#connecting-drill-to-a-file-system).
+
+Set `store.format` to `csvh`:
+
+```
+SET `store.format`='csvh';
++------+-----------------------+
+| ok | summary |
++------+-----------------------+
+| true | store.format updated. |
++------+-----------------------+
+```
+
+Create a text table based on the sample `/tpch/nation.parquet` table from the `cp` plugin:
+
+```
+CREATE TABLE dfs.tmp.text_nation AS
+ (SELECT *
+ FROM cp.`/tpch/nation.parquet`);
++----------+---------------------------+
+| Fragment | Number of records written |
++----------+---------------------------+
+| 0_0 | 25 |
++----------+---------------------------+
+```
+
+Query the table `text_nation`:
+
+```
+SELECT typeof(n_nationkey),
+ typeof(n_name),
+ typeof(n_regionkey),
+ typeof(n_comment)
+FROM dfs.tmp.text_nation
+LIMIT 1;
++---------+---------+---------+---------+
+| EXPR$0 | EXPR$1 | EXPR$2 | EXPR$3 |
++---------+---------+---------+---------+
+| VARCHAR | VARCHAR | VARCHAR | VARCHAR |
++---------+---------+---------+---------+
+```
+
+Notice that the query plan contains a group scan with `usedMetastore = false`:
+
+```
+00-00 Screen : rowType = RecordType(ANY EXPR$0, ANY EXPR$1, ANY EXPR$2, ANY
EXPR$3): rowcount = 1.0, cumulative cost = {25.1 rows, 109.1 cpu, 2247.0 io,
0.0 network, 0.0 memory}, id = 160
+00-01 Project(EXPR$0=[TYPEOF($0)], EXPR$1=[TYPEOF($1)],
EXPR$2=[TYPEOF($2)], EXPR$3=[TYPEOF($3)]) : rowType = RecordType(ANY EXPR$0,
ANY EXPR$1, ANY EXPR$2, ANY EXPR$3): rowcount = 1.0, cumulative cost = {25.0
rows, 109.0 cpu, 2247.0 io, 0.0 network, 0.0 memory}, id = 159
+00-02 SelectionVectorRemover : rowType = RecordType(ANY n_nationkey,
ANY n_name, ANY n_regionkey, ANY n_comment): rowcount = 1.0, cumulative cost =
{24.0 rows, 93.0 cpu, 2247.0 io, 0.0 network, 0.0 memory}, id = 158
+00-03 Limit(fetch=[1]) : rowType = RecordType(ANY n_nationkey, ANY
n_name, ANY n_regionkey, ANY n_comment): rowcount = 1.0, cumulative cost =
{23.0 rows, 92.0 cpu, 2247.0 io, 0.0 network, 0.0 memory}, id = 157
+00-04 Scan(table=[[dfs, tmp, text_nation]],
groupscan=[EasyGroupScan [... schema=null, usedMetastore=false...
+```
+
+#### Compute table metadata and store in the Drill Metastore
+
+Enable Drill Metastore:
+
+```
+SET `metastore.enabled` = true;
+```
+
+Specify the table schema when running the `ANALYZE` query:
+
+```
+ANALYZE TABLE table(dfs.tmp.`text_nation` (type=>'text', fieldDelimiter=>',',
extractHeader=>true,
+ schema=>'inline=(
+ `n_nationkey` INT not null,
+ `n_name` VARCHAR not null,
+ `n_regionkey` INT not null,
+ `n_comment` VARCHAR not null)'
+ )) REFRESH METADATA;
++------+----------------------------------------------------------------+
+| ok | summary |
++------+----------------------------------------------------------------+
+| true | Collected / refreshed metadata for table [dfs.tmp.text_nation] |
++------+----------------------------------------------------------------+
+```
+
+#### Inspect the Metastore using INFORMATION_SCHEMA tables
+
+Run the following query to inspect the `text_nation` table schema stored in the Metastore:
+
+```
+SELECT COLUMN_NAME, DATA_TYPE FROM INFORMATION_SCHEMA.`COLUMNS` WHERE
TABLE_NAME='text_nation';
++-------------+-------------------+
+| COLUMN_NAME | DATA_TYPE |
++-------------+-------------------+
+| n_nationkey | INTEGER |
+| n_name | CHARACTER VARYING |
+| n_regionkey | INTEGER |
+| n_comment | CHARACTER VARYING |
++-------------+-------------------+
+```
+
+Ensure that this schema is applied to the table:
+
+```
+SELECT typeof(n_nationkey),
+ typeof(n_name),
+ typeof(n_regionkey),
+ typeof(n_comment)
+FROM dfs.tmp.text_nation
+LIMIT 1;
++--------+---------+--------+---------+
+| EXPR$0 | EXPR$1 | EXPR$2 | EXPR$3 |
++--------+---------+--------+---------+
+| INT | VARCHAR | INT | VARCHAR |
++--------+---------+--------+---------+
+```
+
+With the provided schema, Drill can also compute numeric aggregates on the converted columns:
+
+```
+select sum(n_nationkey) from dfs.tmp.`text_nation`;
++--------+
+| EXPR$0 |
++--------+
+| 300 |
++--------+
+```
+
+The query plan now contains the schema from the Metastore and a group scan with `usedMetastore = true`:
+
+```
+00-00 Screen : rowType = RecordType(ANY EXPR$0): rowcount = 1.0, cumulative
cost = {45.1 rows, 287.1 cpu, 2247.0 io, 0.0 network, 0.0 memory}, id = 3129
+00-01 Project(EXPR$0=[$0]) : rowType = RecordType(ANY EXPR$0): rowcount =
1.0, cumulative cost = {45.0 rows, 287.0 cpu, 2247.0 io, 0.0 network, 0.0
memory}, id = 3128
+00-02 StreamAgg(group=[{}], EXPR$0=[SUM($0)]) : rowType =
RecordType(ANY EXPR$0): rowcount = 1.0, cumulative cost = {44.0 rows, 286.0
cpu, 2247.0 io, 0.0 network, 0.0 memory}, id = 3127
+00-03 Scan(table=[[dfs, tmp, text_nation]], groupscan=[EasyGroupScan
... schema=..., usedMetastore=true]]) ...
+```
diff --git
a/_docs/query-data/query-a-file-system/005-querying-a-file-system-introduction.md
b/_docs/query-data/query-a-file-system/005-querying-a-file-system-introduction.md
index 436cc75..641127c 100644
---
a/_docs/query-data/query-a-file-system/005-querying-a-file-system-introduction.md
+++
b/_docs/query-data/query-a-file-system/005-querying-a-file-system-introduction.md
@@ -27,9 +27,9 @@ Drill supports the following file types:
* Tab-separated values (TSV, type: text)
* Pipe-separated values (PSV, type: text)
* Structured data files:
- * Avro (type: avro) (This file type is experimental. See [Querying Avro
Files]({{site.baseurl}}/docs/querying-avro-files/).)
- * JSON (type: json)
- * Parquet (type: parquet)
+ * [Avro]({{site.baseurl}}/docs/querying-avro-files/) (type: avro)
+ * [JSON]({{site.baseurl}}/docs/querying-json-files/) (type: json)
+ * [Parquet]({{site.baseurl}}/docs/querying-parquet-files/) (type: parquet)
The extensions for these file types must match the configuration settings for
your registered storage plugins. For example, PSV files may be defined with a
diff --git a/_docs/query-data/query-a-file-system/009-querying-avro-files.md
b/_docs/query-data/query-a-file-system/009-querying-avro-files.md
index 394d82e..0529346 100644
--- a/_docs/query-data/query-a-file-system/009-querying-avro-files.md
+++ b/_docs/query-data/query-a-file-system/009-querying-avro-files.md
@@ -3,5 +3,29 @@ title: "Querying Avro Files"
date: 2019-04-16
parent: "Querying a File System"
---
-
-The Avro format is experimental at this time. There are known issues when
querying Avro files.
+
+Drill supports files in the [Avro](https://avro.apache.org/) format.
+Starting from Drill 1.18, the Avro format supports the [Schema
provisioning]({{site.baseurl}}/docs/create-or-replace-schema/#usage-notes)
feature.
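+
+For example, a sketch (the table and schema file paths are illustrative) that applies a provisioned schema to an Avro table through a table function:
+
+```
+select * from table(dfs.tmp.`my_table.avro`(schema => 'path=`/tmp/my_schema`'))
+```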
+
+#### Preparing example data
+
+To follow along with this example, download the [sample data file](https://github.com/apache/drill/blob/master/exec/java-exec/src/test/resources/avro/map_string_to_long.avro)
+ to your `/tmp` directory.
+
+#### Selecting data from Avro files
+
+We can query all data from the `map_string_to_long.avro` file:
+
+```
+select * from dfs.tmp.`map_string_to_long.avro`
+```
+
+The query returns the following results:
+
+```
++-----------------+
+| mapf |
++-----------------+
+| {"ki":1,"ka":2} |
++-----------------+
+```
\ No newline at end of file
diff --git
a/_docs/sql-reference/sql-commands/007-analyze-table-refresh-metadata.md
b/_docs/sql-reference/sql-commands/007-analyze-table-refresh-metadata.md
index 8f6b087..8e285ef 100644
--- a/_docs/sql-reference/sql-commands/007-analyze-table-refresh-metadata.md
+++ b/_docs/sql-reference/sql-commands/007-analyze-table-refresh-metadata.md
@@ -1,7 +1,7 @@
---
title: "ANALYZE TABLE REFRESH METADATA"
parent: "SQL Commands"
-date: 2020-03-03
+date: 2020-03-17
---
Starting from Drill 1.17, you can store table metadata (including schema and
computed statistics) into Drill Metastore.
@@ -34,10 +34,17 @@ The name of the table or directory for which Drill will
collect table metadata.
Table function parameters. This syntax is only available since Drill 1.18.
Example of table function parameters usage:
- table(dfs.`table_name` (type => 'parquet', autoCorrectCorruptDates =>
true))
-
-For detailed information, please refer to
- [Using the Formats Attributes as Table Function
Parameters]({{site.baseurl}}/docs/plugin-configuration-basics/#using-the-formats-attributes-as-table-function-parameters)
+ table(dfs.tmp.`text_nation` (type=>'text', fieldDelimiter=>',',
extractHeader=>true,
+ schema=>'inline=(
+ `n_nationkey` INT not null,
+ `n_name` VARCHAR not null,
+ `n_regionkey` INT not null,
+ `n_comment` VARCHAR not null)'
+ ))
+
+Please refer to
+ [Specifying the Schema as Table Function Parameter]({{site.baseurl}}/docs/plugin-configuration-basics/#specifying-the-schema-as-table-function-parameter)
+ for details.
*COLUMNS (col1, col2, ...)*
Optional names of the column(s) for which Drill will compute and store
statistics. The stored schema will include all
diff --git a/_docs/sql-reference/sql-commands/021-create-schema.md
b/_docs/sql-reference/sql-commands/021-create-schema.md
index 260eb38..8c6fcb2 100644
--- a/_docs/sql-reference/sql-commands/021-create-schema.md
+++ b/_docs/sql-reference/sql-commands/021-create-schema.md
@@ -4,14 +4,24 @@ date: 2019-05-31
parent: "SQL Commands"
---
-Starting in Drill 1.16, you can define a schema for text files using the
CREATE OR REPLACE SCHEMA command. Schema is only available for tables
represented by a directory. To use this feature with a single file, put the
file inside a directory, and use the directory name to query the table.
+Starting in Drill 1.16, you can define a schema for text files. Drill places a schema file in the root directory of your text table, so the schema feature works only for tables represented by a directory. If you have a single-file table, create a directory to hold that file and the schema file.
-In Drill 1.16, this feature is in preview status and disabled by default. You
can enable this feature by setting the `exec.storage.enable_v3_text_reader` and
`store.table.use_schema_file` system/session options to true. The feature is
currently only available for text (CSV) files.
+In Drill 1.17, the provided schema feature is disabled by default. Enable it
by setting the `store.table.use_schema_file` system/session option to true:
+
+```
+ALTER SESSION SET `store.table.use_schema_file` = true
+```
+
+Next, create the schema using the `CREATE OR REPLACE SCHEMA` command as described in the [Syntax](#syntax) section.
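+
+For example, a minimal sketch (the column names are illustrative):
+
+```
+CREATE OR REPLACE SCHEMA (`id` INT, `name` VARCHAR) FOR TABLE dfs.tmp.`text_table`;
+```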
Running this command generates a hidden `.drill.schema` file in the table’s
root directory. The `.drill.schema` file stores the schema definition in JSON
format. Alternatively, you can create the schema file manually. If created
manually, the file content must comply with the structure recognized by Drill.
The end of this topic provides
[examples]({{site.baseurl}}/docs/create-or-replace-schema/#examples) that show
how the feature is used. You may want to review this section before reading the
reference material.
+As described in [Specifying the Schema as Table Function Parameter]({{site.baseurl}}/docs/plugin-configuration-basics/#specifying-the-schema-as-table-function-parameter),
+ you can also use a table function to apply a schema to an individual query. Or, you can place the
+ table function within a view and query the table through the view.
+
Please post your experience and suggestions to the
"[user]([email protected])" mailing list.
@@ -69,6 +79,7 @@ In Drill 1.16, you must enable the following options for
Drill to use the schema
**exec.storage.enable_v3_text_reader**
Enables the preview "version 3" of the text (CSV) file reader. The V3 text
reader is the only reader in Drill 1.16 that supports file schemas.
+In Drill 1.17, this option is enabled by default.
**store.table.use_schema_file**
Enables the use of the schema file mechanism.
@@ -209,13 +220,23 @@ Values are trimmed when converting to any type, except
for varchar.
## Usage Notes
-### General Information
-- Schema provisioning only works with tables defined as directories because
Drill must have a place to store the schema file. The directory can contain one
or more files.
+### General Information
+- Schema provisioning works only with the file system (dfs-based) storage
plugins. It works by placing a file `.drill.schema` in the root folder of
tables defined as a directory. The directory can contain any number of files
(even just one) in addition to the schema file.
- Text files must have headers. The default extension for delimited text files
with headers is `.csvh`. Note that the column names that appear in the headers
match column definitions in the schema.
- You do not have to enumerate all columns in a file when creating a schema.
You can indicate the columns of interest only.
- Columns in the defined schema do not have to be in the same order as in the
data file.
- Column names must match. The case can differ, for example “name” and “NAME”
are acceptable.
-- Queries on columns with data types that cannot be converted fail with a
`DATA_READ_ERROR`.
+
+Drill is unique in that it infers table schema at runtime. However, sometimes
schema inference can fail when Drill
+ cannot infer the correct types. For example, Drill treats all fields in a
text file as text. Drill may not be able
+ to determine the type of fields in JSON files if the fields are missing or
set to `null` in the first few records
+ in the file. Drill issues a `DATA_READ_ERROR` when runtime schema inference
fails.
+
+When Drill cannot correctly infer the schema, you can instead use your
knowledge of the file layout to tell Drill
+ the proper schema to use. Schema provisioning is the feature you use to
specify the schema.
+ You can provide a schema for the file as a whole using the [`CREATE OR
REPLACE SCHEMA` command](#syntax) or for
+ a single query using a [table
function]({{site.baseurl}}/docs/plugin-configuration-basics/#table-function-parameters).
+ Please see [Specifying the Schema as Table Function
Parameter]({{site.baseurl}}/docs/plugin-configuration-basics/#specifying-the-schema-as-table-function-parameter)
for details.
### Schema Mode (Column Order)
The schema mode determines the set of columns returned for wildcard (*)
queries and the ordering of those columns. The mode is set through the
`drill.strict` property. You can set this property to true (strict) or false
(not strict). If you do not indicate the mode, the default is false (not
strict).
@@ -446,7 +467,10 @@ Note that date, time type conversion uses the Joda time
library, thus the format
## Limitations
-This feature is currently in the alpha phase (preview, experimental) for Drill
1.16 and only applies to text (CSV) files in this release. You must enable this
feature through the `exec.storage.enable_v3_text_reader` and
`store.table.use_schema_file` system/session options.
+Schema provisioning works with selected readers. If you develop a format plugin, you must use the
+ Enhanced Vector Framework (rather than the "classic" techniques) to enable schema support.
+
+To use schema provisioning, you must first enable it with the
`store.table.use_schema_file` option.
## Examples
Examples throughout this topic use the files and directories described in the
following section, Directory and File Setup.
@@ -569,7 +593,7 @@ Running EXPLAIN PLAN, you can see that type conversion was
done while reading da
00-02 Scan(table=[[dfs, tmp, text_table]],
groupscan=[EasyGroupScan [selectionRoot=file:/tmp/text_table, numFiles=2,
columns=[`**`], files=[file:/tmp/text_table/1.csvh,
file:/tmp/text_table/2.csvh], schema=[TupleSchema [PrimitiveColumnMetadata
[`id` (INT(0, 0):OPTIONAL)]]]]])
### Describing Schema for a Table
-After you create schema, you can examine the schema using the DESCRIBE SCHEMA
FOR TABLE command. Schema can print to JSON or STATEMENT format. JSON format is
the default if no format is indicated in the query. Schema displayed in JSON
format is the same as the JSON format in the `.drill.schema` file.
+You can verify the provided schema using the [`DESCRIBE SCHEMA FOR TABLE`
command](#related-commands). This command can format the schema in two formats.
The `JSON` format is the same as the contents of the `.drill.schema` file
stored in your table directory.
describe schema for table dfs.tmp.`text_table` as JSON;
+----------------------------------------------------------------------------------+
@@ -590,7 +614,7 @@ After you create schema, you can examine the schema using
the DESCRIBE SCHEMA FO
} |
+----------------------------------------------------------------------------------+
-STATEMENT format displays the schema in a form compatible with the CREATE OR
REPLACE SCHEMA command such that it can easily be copied, modified, and
executed.
+You can also use the `STATEMENT` format to recover the SQL statement to
recreate the schema. You can easily copy, reuse or edit this statement to
change the schema or reuse the statement for other files.
describe schema for table dfs.tmp.`text_table` as statement;
+--------------------------------------------------------------------------+
@@ -604,8 +628,35 @@ STATEMENT format displays the schema in a form compatible
with the CREATE OR REP
|
+--------------------------------------------------------------------------+
+### Altering Schema for a Table
+Use the `ALTER SCHEMA` command to update your table schema. The command can add or replace columns,
+or update properties for the table or individual columns. Syntax:
+
+ ALTER SCHEMA
+ (FOR TABLE dfs.tmp.nation | PATH '/tmp/schema.json')
+ ADD [OR REPLACE]
+ [COLUMNS (col1 int, col2 varchar)]
+ [PROPERTIES ('prop1'='val1', 'prop2'='val2')]
+
+`ALTER SCHEMA` modifies an existing schema file; it will fail if the schema
file does not exist.
+(Use `CREATE SCHEMA` to create a new schema file.)
+
+To prevent accidental changes, the `ALTER SCHEMA ... ADD` command will fail if
the requested column or property
+ already exists. Use the `OR REPLACE` clause to modify an existing column or
property.
+
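+A minimal example (the column and property names are illustrative):
+
+    ALTER SCHEMA FOR TABLE dfs.tmp.`text_table`
+    ADD OR REPLACE
+    COLUMNS (`comment` varchar)
+    PROPERTIES ('drill.strict'='false')
+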
+You can remove columns or properties with the `ALTER SCHEMA ... REMOVE` command:
+
+ ALTER SCHEMA
+ (FOR TABLE dfs.tmp.nation | PATH '/tmp/schema.json')
+ REMOVE
+ [COLUMNS (col1 int, col2 varchar)]
+ [PROPERTIES ('prop1'='val1', 'prop2'='val2')]
+
+The command fails if the schema file does not exist. It silently ignores a request to remove a column or
+ property that does not exist.
+
### Dropping Schema for a Table
-You can easily drop the schema for a table using the DROP SCHEMA [IF EXISTS]
FOR TABLE \`table_name` command, as shown:
+You can easily drop the schema for a table using the ``DROP SCHEMA [IF EXISTS]
FOR TABLE `table_name` `` command, as shown:
use dfs.tmp;
+------+-------------------------------------+
diff --git a/images/7671b34d6e8a4d050f75278f10f1a08.jpg
b/images/7671b34d6e8a4d050f75278f10f1a08.jpg
new file mode 100644
index 0000000..ea16a44
Binary files /dev/null and b/images/7671b34d6e8a4d050f75278f10f1a08.jpg differ