[GitHub] [drill] paul-rogers commented on a change in pull request #2030: Update docs for Metastore to point that all format plugins are supported

2020-03-20 Thread GitBox
paul-rogers commented on a change in pull request #2030: Update docs for 
Metastore to point that all format plugins are supported
URL: https://github.com/apache/drill/pull/2030#discussion_r395773086
 
 

 ##
 File path: 
_docs/performance-tuning/drill-metastore/010-using-drill-metastore.md
 ##
 @@ -250,6 +282,9 @@ A table can be divided into directories, called 
"partitions". The `PARTITIONS` t
  - Applies to tables stored as Parquet files and only when stored in the `DFS` 
storage plugin.
  - Disabled by default. You must enable this feature through the 
`metastore.enabled` system/session option.
 
+### Limitations of the 1.18 release
+ - Applies to all file system storage plugin formats except for MaprDB.
 
 Review comment:
   Again, since Drill 1.17 is released, should we list Drill 1.17 limitations 
as well?


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] [drill] paul-rogers commented on a change in pull request #2030: Update docs for Metastore to point that all format plugins are supported

2020-03-20 Thread GitBox
paul-rogers commented on a change in pull request #2030: Update docs for 
Metastore to point that all format plugins are supported
URL: https://github.com/apache/drill/pull/2030#discussion_r395772674
 
 

 ##
 File path: 
_docs/performance-tuning/drill-metastore/010-using-drill-metastore.md
 ##
 @@ -103,20 +105,50 @@ Schema information and summary statistics also computed 
and stored for table seg
 
 The detailed metadata schema is described 
[here](https://github.com/apache/drill/tree/master/metastore/metastore-api#metastore-tables).
 You can try out the metadata to get a sense of what is available, by using the
- [Inspect the Metastore using `INFORMATION_SCHEMA` 
tables]({{site.baseurl}}/docs/using-drill-metastore/#inspect-the-metastore-using-information_schema-tables)
 tutorial.
+ [Inspect the Metastore using `INFORMATION_SCHEMA` 
tables](#inspect-the-metastore-using-information_schema-tables) tutorial.
 
 Every table described by the Metastore may be a bare file or one or more files 
that reside in one or more directories.
 
 If a table consists of a single directory or file, then it is non-partitioned. 
The single directory can contain any number of files.
 Larger tables tend to have subdirectories. Each subdirectory is a partition 
and such a table are called "partitioned".
-Please refer to [Exposing Drill Metastore metadata through 
`INFORMATION_SCHEMA` 
tables]({{site.baseurl}}/docs/using-drill-metastore/#exposing-drill-metastore-metadata-through-information_schema-tables)
+Please refer to [Exposing Drill Metastore metadata through 
`INFORMATION_SCHEMA` 
tables](#exposing-drill-metastore-metadata-through-information_schema-tables)
  for information, how to query partitions and segments metadata.
 
 A traditional database divides tables into schemas and tables.
 Drill can connect to any number of data sources, each of which may have its 
own schema.
 As a result, the Metastore labels tables with a combination of (plugin 
configuration name, workspace name, table name).
 Note that if before renaming any of these items, you must delete table's 
Metadata entry and recreate it after renaming.
 
+### Using schema provisioning feature with Drill Metastore
+
+The Drill Metastore holds both schema and statistics information for a table. 
The `ANALYZE` command can infer the table
+ schema for well-defined tables (such as many Parquet tables). Some tables are 
too complex or variable for Drill's
+ schema inference to work well. For example, JSON tables often omit fields or 
have long runs of nulls so that Drill
+ cannot determine column types. In these cases, you can specify the correct 
schema based on your knowledge of the
+ table's structure. You specify a schema in the `ANALYZE` command using the 
+ [Schema 
provisioning]({{site.baseurl}}/docs/plugin-configuration-basics/#specifying-the-schema-as-table-function-parameter)
 syntax.
+
+Please refer to [Provisioning schema for Drill 
Metastore](#provisioning-schema-for-drill-metastore) for examples of usage.
+
+### Schema priority
+
+Drill allows the following ways for providing table schema:
+ - providing schema with table function:
+   - specifying inline schema;
+   - specifying path to the schema file;
+ - using schema file in table root directory;
+ - using schema from Drill Metastore.
+
+The highest priority has schema provided in table function.
+
+Second priority has schema file (if `store.table.use_schema_file` is enabled).
+
+If neither of the above schema sources wasn't specified, schema from Drill 
Metastore will be used.
+
+Regardless of the source of the schema, it will be used and handled in the 
same way.
+
+Table metadata from Drill Metastore will be used if it is available regardless 
of the schema source.
+
 
 Review comment:
   Drill uses metadata during both query planning and execution. Drill gives 
you multiple ways to provide a schema.
   
   When you run the `ANALYZE TABLE` command, Drill uses the following 
rules to choose the table schema to store in the Metastore. In priority order:
   
   * A schema file, created with `CREATE OR REPLACE SCHEMA`, in the table root 
directory.
   * Schema inferred from file data.
   
   To plan a query, Drill requires information about your file partitions (if 
any) and about row and column cardinality. Drill does not use the provided 
schema for planning as it does not provide this metadata. Instead, at plan time 
Drill obtains metadata from one of the following, again in priority order:
   
   * The Drill Metastore, if available.
   * Inferred from file data. Drill scans the table's directory structure to 
identify partitions. Drill estimates row counts based on the file size. Drill 
uses default estimates for column cardinality.
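
   To illustrate the Metastore path, here is a minimal sketch. It assumes the default `dfs.tmp` workspace and a hypothetical table directory named `nation`; neither is prescribed by the docs under review.

   ```sql
   -- Enable the Metastore for this session (it is disabled by default).
   ALTER SESSION SET `metastore.enabled` = true;

   -- Compute schema and statistics for the table and store them in the Metastore.
   -- `nation` is a hypothetical table directory in the dfs.tmp workspace.
   ANALYZE TABLE dfs.tmp.`nation` REFRESH METADATA;
   ```

   Once the metadata exists, the planner can use it instead of estimating row counts from file sizes.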
   
   At query execution time, a schema tells Drill the shape of your data and how 
that data should be converted to Drill's SQL types. Your choices for 
execution-time schema, in priority order, are:
   
   *  With a table function:
  - specify 

[GitHub] [drill] paul-rogers commented on a change in pull request #2030: Update docs for Metastore to point that all format plugins are supported

2020-03-20 Thread GitBox
paul-rogers commented on a change in pull request #2030: Update docs for 
Metastore to point that all format plugins are supported
URL: https://github.com/apache/drill/pull/2030#discussion_r395758027
 
 

 ##
 File path: 
_docs/performance-tuning/drill-metastore/010-using-drill-metastore.md
 ##
 @@ -1,14 +1,16 @@
 ---
 title: "Using Drill Metastore"
 parent: "Drill Metastore"
-date: 2020-03-03
+date: 2020-03-17
 ---
 
 Drill 1.17 introduces the Drill Metastore which stores the table schema and 
table statistics. Statistics allow Drill to better create optimal query plans.
 
 The Metastore is a Beta feature; it is subject to change. We encourage you to 
try it and provide feedback.
 Because the Metastore is in Beta, the SQL commands and Metastore formats may 
change in the next release.
-{% include startnote.html %}In Drill 1.17, this feature is supported for 
Parquet tables only and is disabled by default.{% include endnote.html %}
+{% include startnote.html %}In Drill 1.17, this feature is supported for 
Parquet tables only and is disabled by default.
+Starting from Drill 1.18, this feature is supported for all **format** plugins 
except for MaprDB.
+{% include endnote.html %}
 
 Review comment:
   Looks like the note for Drill 1.17 was removed? Should we keep it since 
Drill 1.17 is the currently-released version?




[GitHub] [drill] paul-rogers commented on a change in pull request #2030: Update docs for Metastore to point that all format plugins are supported

2020-03-20 Thread GitBox
paul-rogers commented on a change in pull request #2030: Update docs for 
Metastore to point that all format plugins are supported
URL: https://github.com/apache/drill/pull/2030#discussion_r395774025
 
 

 ##
 File path: _docs/sql-reference/sql-commands/021-create-schema.md
 ##
 @@ -209,13 +220,23 @@ Values are trimmed when converting to any type, except 
for varchar.
 
 ## Usage Notes 
 
-### General Information  
-- Schema provisioning only works with tables defined as directories because 
Drill must have a place to store the schema file. The directory can contain one 
or more files.  
+### General Information
+- Schema provisioning is supported only for the file system (dfs-based) 
storage plugins. It works by placing a file `.drill.schema` in the root folder 
of tables defined as a directory. The directory can contain any number of files 
(even just one) in addition to the schema file.
 
 Review comment:
   Schema provisioning works only with the file system...




[GitHub] [drill] paul-rogers commented on a change in pull request #2030: Update docs for Metastore to point that all format plugins are supported

2020-03-20 Thread GitBox
paul-rogers commented on a change in pull request #2030: Update docs for 
Metastore to point that all format plugins are supported
URL: https://github.com/apache/drill/pull/2030#discussion_r395775651
 
 

 ##
 File path: _docs/sql-reference/sql-commands/021-create-schema.md
 ##
 @@ -446,7 +467,8 @@ Note that date, time type conversion uses the Joda time 
library, thus the format
 
 
 ## Limitations
-This feature is currently in the alpha phase (preview, experimental) for Drill 
1.16 and only applies to text (CSV) files in this release. You must enable this 
feature through the `exec.storage.enable_v3_text_reader` and 
`store.table.use_schema_file` system/session options.
+This feature applies to format plugins that use the `Enhanced Vector 
Framework`. You must enable this feature through
 
 Review comment:
   ((Only developers know about EVF. Maybe:))
   
   Schema provisioning works with selected readers. If you develop a format 
plugin, you must use the `Enhanced Vector Framework` (rather than the "classic" 
techniques) to enable schema support.
   
   To use schema provisioning, you must first enable it with the ... option.




[GitHub] [drill] paul-rogers commented on a change in pull request #2030: Update docs for Metastore to point that all format plugins are supported

2020-03-18 Thread GitBox
paul-rogers commented on a change in pull request #2030: Update docs for 
Metastore to point that all format plugins are supported
URL: https://github.com/apache/drill/pull/2030#discussion_r394567289
 
 

 ##
 File path: _docs/sql-reference/sql-commands/021-create-schema.md
 ##
 @@ -209,13 +210,15 @@ Values are trimmed when converting to any type, except 
for varchar.
 
 ## Usage Notes 
 
-### General Information  
-- Schema provisioning only works with tables defined as directories because 
Drill must have a place to store the schema file. The directory can contain one 
or more files.  
+### General Information
+- Schema provisioning using schema file works only with tables defined as 
directories because Drill must have a place to store the schema file. The 
directory can contain one or more files.  
 - Text files must have headers. The default extension for delimited text files 
with headers is `.csvh`. Note that the column names that appear in the headers 
match column definitions in the schema.  
 - You do not have to enumerate all columns in a file when creating a schema. 
You can indicate the columns of interest only.  
 - Columns in the defined schema do not have to be in the same order as in the 
data file.  
 - Column names must match. The case can differ, for example “name” and “NAME” 
are acceptable.   
-- Queries on columns with data types that cannot be converted fail with a 
`DATA_READ_ERROR`.   
+- Queries on columns with data types that cannot be converted fail with a 
`DATA_READ_ERROR`.
 
 Review comment:
   Drill is unique in that it infers table schema at runtime. However, 
sometimes schema inference can fail when Drill cannot infer the correct types. 
For example, Drill treats all fields in a text file as text. Drill may not be 
able to determine the type of fields in JSON files if the fields are missing or 
set to `null` in the first few records in the file. Drill issues a 
`DATA_READ_ERROR` when runtime schema inference fails.
   
   When Drill cannot correctly infer the schema, you can instead use your 
knowledge of the file layout to tell Drill the proper schema to use. Schema 
provisioning is the feature you use to specify the schema. You can provide a 
schema for the file as a whole using the `CREATE OR REPLACE SCHEMA` command 
((insert link)) or for a single query using a table function ((insert link)). 
Please see ... for details.
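
   For the single-query case, a sketch along these lines may help. It reuses the `text_table` directory and `id` column from the examples elsewhere in this file; the exact schema string is an assumption to adapt.

   ```sql
   -- Apply an inline schema to this one query only; nothing is stored on disk.
   -- Without the schema, Drill would read `id` from a text file as VARCHAR.
   SELECT *
   FROM table(dfs.tmp.`text_table` (schema => 'inline=(`id` INT)'));
   ```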




[GitHub] [drill] paul-rogers commented on a change in pull request #2030: Update docs for Metastore to point that all format plugins are supported

2020-03-18 Thread GitBox
paul-rogers commented on a change in pull request #2030: Update docs for 
Metastore to point that all format plugins are supported
URL: https://github.com/apache/drill/pull/2030#discussion_r394570531
 
 

 ##
 File path: _docs/sql-reference/sql-commands/021-create-schema.md
 ##
 @@ -569,7 +573,7 @@ Running EXPLAIN PLAN, you can see that type conversion was 
done while reading da
00-02Scan(table=[[dfs, tmp, text_table]], 
groupscan=[EasyGroupScan [selectionRoot=file:/tmp/text_table, numFiles=2, 
columns=[`**`], files=[file:/tmp/text_table/1.csvh, 
file:/tmp/text_table/2.csvh], schema=[TupleSchema [PrimitiveColumnMetadata 
[`id` (INT(0, 0):OPTIONAL)])  
 
 ### Describing Schema for a Table
-After you create schema, you can examine the schema using the DESCRIBE SCHEMA 
FOR TABLE command. Schema can print to JSON or STATEMENT format. JSON format is 
the default if no format is indicated in the query. Schema displayed in JSON 
format is the same as the JSON format in the `.drill.schema` file.
+After you create schema, you can examine the schema using the `DESCRIBE SCHEMA 
FOR TABLE` command. Schema can print to `JSON` or `STATEMENT` format. `JSON` 
format is the default if no format is indicated in the query. Schema displayed 
in `JSON `format is the same as the `JSON` format in the `.drill.schema` file.
 
 Review comment:
   ((Note: text here is bold. Is this a Github error or a bug in the markdown?))
   
   You can verify the provided schema using the `DESCRIBE SCHEMA FOR TABLE` 
command ((insert link)). This command can display the schema in two formats. The 
`JSON` format (the default) is the same as the contents of the `.drill.schema` 
file stored in your table directory.
   
   ((Example here.))
   
   You can also use the `STATEMENT` format to recover the SQL statement that 
recreates the schema. You can copy and edit this statement to change the 
schema, or reuse it for other files.
   
   ((Example here.))
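
   A possible shape for the two examples, assuming a schema already exists for the `text_table` directory used elsewhere in this file (output omitted, since it depends on the schema):

   ```sql
   -- Show the schema as JSON (the default if no format is given).
   DESCRIBE SCHEMA FOR TABLE dfs.tmp.`text_table` AS JSON;

   -- Show the schema as a statement that can be copied, edited and re-run.
   DESCRIBE SCHEMA FOR TABLE dfs.tmp.`text_table` AS STATEMENT;
   ```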




[GitHub] [drill] paul-rogers commented on a change in pull request #2030: Update docs for Metastore to point that all format plugins are supported

2020-03-18 Thread GitBox
paul-rogers commented on a change in pull request #2030: Update docs for 
Metastore to point that all format plugins are supported
URL: https://github.com/apache/drill/pull/2030#discussion_r394563481
 
 

 ##
 File path: _docs/sql-reference/sql-commands/021-create-schema.md
 ##
 @@ -209,13 +210,15 @@ Values are trimmed when converting to any type, except 
for varchar.
 
 ## Usage Notes 
 
-### General Information  
-- Schema provisioning only works with tables defined as directories because 
Drill must have a place to store the schema file. The directory can contain one 
or more files.  
+### General Information
+- Schema provisioning using schema file works only with tables defined as 
directories because Drill must have a place to store the schema file. The 
directory can contain one or more files.  
 
 Review comment:
   Schema provisioning is supported only for the file system (dfs-based) storage 
plugins. It works by placing a file ((insert name)) in the root folder of tables 
defined as a directory. The directory can contain any number of files (even 
just one) in addition to the schema file.
   
   ((Here, double parens are notes to you, single parens are parts of the 
suggested text.))




[GitHub] [drill] paul-rogers commented on a change in pull request #2030: Update docs for Metastore to point that all format plugins are supported

2020-03-18 Thread GitBox
paul-rogers commented on a change in pull request #2030: Update docs for 
Metastore to point that all format plugins are supported
URL: https://github.com/apache/drill/pull/2030#discussion_r394562214
 
 

 ##
 File path: _docs/sql-reference/sql-commands/021-create-schema.md
 ##
 @@ -4,9 +4,9 @@ date: 2019-05-31
 parent: "SQL Commands"
 ---
 
-Starting in Drill 1.16, you can define a schema for text files using the 
CREATE OR REPLACE SCHEMA command. Schema is only available for tables 
represented by a directory. To use this feature with a single file, put the 
file inside a directory, and use the directory name to query the table.
+Starting in Drill 1.16, you can define a schema for text files using the 
`CREATE OR REPLACE SCHEMA` command. Such schema is only available for tables 
represented by a directory. To use this feature with a single file, put the 
file inside a directory, and use the directory name to query the table or use 
table function with schema parameter instead.
 
 Review comment:
   Starting in Drill 1.16 you can define a schema for text files. Drill places 
a schema file in the root directory of your text table and so the schema 
feature only works for tables within a directory. If you have a single-file 
table, simply create a directory to hold that file and the schema file.
   
   In Drill 1.17, the provided schema feature is disabled by default. Enable it 
by setting the `store.table.use_schema_file` system/session option to true:
   
   ```
   ALTER SESSION SET `store.table.use_schema_file` = true
   ```

   Next you create the schema using the `CREATE OR REPLACE SCHEMA` command (as 
described where? Please point to an example, or put the example here.)
   
   As described (insert link), you can also use a table function to apply a 
schema to individual queries. Or, you can place the table function within a 
view, and query the table through the view. (Would be good to have examples of 
these also.)
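
   The examples requested above might look something like this sketch. It assumes the `text_table` directory and `id` column from the examples later in this file; the view name `text_table_v` is hypothetical.

   ```sql
   -- Store a schema file in the table's root directory.
   CREATE OR REPLACE SCHEMA (`id` INT) FOR TABLE dfs.tmp.`text_table`;

   -- Alternatively, wrap a table function in a view so that every query
   -- through the view picks up the inline schema.
   CREATE OR REPLACE VIEW dfs.tmp.`text_table_v` AS
   SELECT * FROM table(dfs.tmp.`text_table` (schema => 'inline=(`id` INT)'));
   ```

   The schema file applies to every query on the table, while the view keeps the schema next to the query definition and leaves the table directory untouched.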




[GitHub] [drill] paul-rogers commented on a change in pull request #2030: Update docs for Metastore to point that all format plugins are supported

2020-03-18 Thread GitBox
paul-rogers commented on a change in pull request #2030: Update docs for 
Metastore to point that all format plugins are supported
URL: https://github.com/apache/drill/pull/2030#discussion_r394557058
 
 

 ##
 File path: _docs/query-data/query-a-file-system/009-querying-avro-files.md
 ##
 @@ -3,5 +3,30 @@ title: "Querying Avro Files"
 date: 2019-04-16
 parent: "Querying a File System"
 ---
-  
-The Avro format is experimental at this time. There are known issues when 
querying Avro files.  
+
+Drill provides functionality to query [Avro](https://avro.apache.org/) files.
+
+Starting from Drill 1.18, Avro file format supports [Schema 
provisioning]({{site.baseurl}}/docs/create-or-replace-schema/#usage-notes) 
feature.
+
+ Preparing example data
+
+Download the following [sample data 
file](https://github.com/apache/drill/blob/master/exec/java-exec/src/test/resources/avro/map_string_to_long.avro)
+and place it to the `/tmp/` folder to follow the example below.
 
 Review comment:
   To follow along with this example, download ... to your `/tmp` directory.




[GitHub] [drill] paul-rogers commented on a change in pull request #2030: Update docs for Metastore to point that all format plugins are supported

2020-03-18 Thread GitBox
paul-rogers commented on a change in pull request #2030: Update docs for 
Metastore to point that all format plugins are supported
URL: https://github.com/apache/drill/pull/2030#discussion_r394553419
 
 

 ##
 File path: 
_docs/performance-tuning/drill-metastore/010-using-drill-metastore.md
 ##
 @@ -442,3 +455,114 @@ apache drill (information_schema)> SELECT * FROM 
INFORMATION_SCHEMA.`COLUMNS` WH
 
+---+--++-+--++-+---+--++---+-+---++---++-+---+---+-+-+---+---+---+
 17 rows selected (0.183 seconds)
 ```
+
+### Provisioning schema for Drill Metastore
+
+ Directory and File Setup
+
+Set up storage plugin for desired file system, as described here:
+ [Connecting Drill to a File 
System]({{site.baseurl}}/docs/file-system-storage-plugin/#connecting-drill-to-a-file-system).
 
 Review comment:
   Ensure you have configured the file system storage plugin as described here: 
...




[GitHub] [drill] paul-rogers commented on a change in pull request #2030: Update docs for Metastore to point that all format plugins are supported

2020-03-18 Thread GitBox
paul-rogers commented on a change in pull request #2030: Update docs for 
Metastore to point that all format plugins are supported
URL: https://github.com/apache/drill/pull/2030#discussion_r394576664
 
 

 ##
 File path: _docs/sql-reference/sql-commands/021-create-schema.md
 ##
 @@ -604,8 +608,32 @@ STATEMENT format displays the schema in a form compatible 
with the CREATE OR REP
 |

+--+
 
+### Altering Schema for a Table
+Table schema may be updated using the `ALTER SCHEMA` commands.
+
+The syntax for the command to add (or replace) columns / properties is the 
following:
+
+ALTER SCHEMA
+(FOR TABLE dfs.tmp.nation | PATH '/tmp/schema.json')
+ADD [OR REPLACE]
+[COLUMNS (col1 int, col2 varchar)]
+[PROPERTIES ('prop1'='val1', 'prop2'='val2')]
+
+Add command will fail if column or property with the same name exists, unless 
`OR REPLACE` keywords are indicated.
 
 Review comment:
   `ALTER SCHEMA` modifies an existing schema file; it will fail if the schema 
file does not exist. (Use `CREATE SCHEMA` to create a new schema file.)
   
   To prevent accidental changes, the `ALTER SCHEMA ... ADD` command will fail 
if the requested column or property already exists. Use the `OR REPLACE` clause 
to modify an existing column or property.
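
   A short sketch of the two behaviors, using the syntax from the hunk above. It assumes a schema file already exists for `dfs.tmp.nation`; `col1` and `prop1` are the placeholder names from the syntax block.

   ```sql
   -- Plain ADD: fails if `col1` is already defined in the schema file.
   ALTER SCHEMA FOR TABLE dfs.tmp.nation
   ADD COLUMNS (col1 int);

   -- ADD OR REPLACE: replaces `col1` and `prop1` if they already exist,
   -- and adds them otherwise.
   ALTER SCHEMA FOR TABLE dfs.tmp.nation
   ADD OR REPLACE
   COLUMNS (col1 varchar)
   PROPERTIES ('prop1'='val1');
   ```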




[GitHub] [drill] paul-rogers commented on a change in pull request #2030: Update docs for Metastore to point that all format plugins are supported

2020-03-18 Thread GitBox
paul-rogers commented on a change in pull request #2030: Update docs for 
Metastore to point that all format plugins are supported
URL: https://github.com/apache/drill/pull/2030#discussion_r394559182
 
 

 ##
 File path: 
_docs/sql-reference/sql-commands/007-analyze-table-refresh-metadata.md
 ##
 @@ -34,10 +34,16 @@ The name of the table or directory for which Drill will 
collect table metadata.
 Table function parameters. This syntax is only available since Drill 1.18.
 Example of table function parameters usage:
 
-table(dfs.`table_name` (type => 'parquet', autoCorrectCorruptDates => 
true))
+ table(dfs.tmp.`text_nation` (type=>'text', fieldDelimiter=>',', 
extractHeader=>true,
+schema=>'inline=(
+`n_nationkey` INT not null,
+`n_name` VARCHAR not null,
+`n_regionkey` INT not null,
+`n_comment` VARCHAR not null)'
+))
 
 For detailed information, please refer to
- [Using the Formats Attributes as Table Function 
Parameters]({{site.baseurl}}/docs/plugin-configuration-basics/#using-the-formats-attributes-as-table-function-parameters)
+ [Specifying the Schema as Table Function 
Parameter]({{site.baseurl}}/docs/plugin-configuration-basics/#specifying-the-schema-as-table-function-parameter)
 
 Review comment:
   Please refer to ... for the details.
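
   For context, the table function fragment in the hunk above would sit inside a full statement roughly like the following. This is a sketch assuming the `text_nation` table from the hunk and the `REFRESH METADATA` form of the command this page documents.

   ```sql
   -- Analyze a text table, supplying its schema inline through a table function.
   ANALYZE TABLE table(dfs.tmp.`text_nation` (type=>'text', fieldDelimiter=>',', extractHeader=>true,
     schema=>'inline=(
       `n_nationkey` INT not null,
       `n_name` VARCHAR not null,
       `n_regionkey` INT not null,
       `n_comment` VARCHAR not null)'
   )) REFRESH METADATA;
   ```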




[GitHub] [drill] paul-rogers commented on a change in pull request #2030: Update docs for Metastore to point that all format plugins are supported

2020-03-18 Thread GitBox
paul-rogers commented on a change in pull request #2030: Update docs for 
Metastore to point that all format plugins are supported
URL: https://github.com/apache/drill/pull/2030#discussion_r394556426
 
 

 ##
 File path: _docs/query-data/query-a-file-system/009-querying-avro-files.md
 ##
 @@ -3,5 +3,30 @@ title: "Querying Avro Files"
 date: 2019-04-16
 parent: "Querying a File System"
 ---
-  
-The Avro format is experimental at this time. There are known issues when 
querying Avro files.  
+
+Drill provides functionality to query [Avro](https://avro.apache.org/) files.
 
 Review comment:
   Drill supports files in the [Avro](https://avro.apache.org/) format. 
Starting from Drill 1.18, the Avro format supports the [Schema 
provisioning]({{site.baseurl}}/docs/create-or-replace-schema/#usage-notes) 
feature.




[GitHub] [drill] paul-rogers commented on a change in pull request #2030: Update docs for Metastore to point that all format plugins are supported

2020-03-18 Thread GitBox
paul-rogers commented on a change in pull request #2030: Update docs for 
Metastore to point that all format plugins are supported
URL: https://github.com/apache/drill/pull/2030#discussion_r394576942
 
 

 ##
 File path: _docs/sql-reference/sql-commands/021-create-schema.md
 ##
 @@ -604,8 +608,32 @@ STATEMENT format displays the schema in a form compatible 
with the CREATE OR REP
 |

+--+
 
+### Altering Schema for a Table
+Table schema may be updated using the `ALTER SCHEMA` commands.
+
+The syntax for the command to add (or replace) columns / properties is the 
following:
+
+ALTER SCHEMA
+(FOR TABLE dfs.tmp.nation | PATH '/tmp/schema.json')
+ADD [OR REPLACE]
+[COLUMNS (col1 int, col2 varchar)]
+[PROPERTIES ('prop1'='val1', 'prop2'='val2')]
+
+Add command will fail if column or property with the same name exists, unless 
`OR REPLACE` keywords are indicated.
+Add command will fail, if the schema file does not exist.
+
+The syntax for the command to remove columns / properties is the following:
 
 Review comment:
   You can remove columns or properties with the `ALTER SCHEMA ... REMOVE` 
command:




[GitHub] [drill] paul-rogers commented on a change in pull request #2030: Update docs for Metastore to point that all format plugins are supported

2020-03-18 Thread GitBox
paul-rogers commented on a change in pull request #2030: Update docs for 
Metastore to point that all format plugins are supported
URL: https://github.com/apache/drill/pull/2030#discussion_r394558865
 
 

 ##
 File path: _docs/query-data/query-a-file-system/009-querying-avro-files.md
 ##
 @@ -3,5 +3,30 @@ title: "Querying Avro Files"
 date: 2019-04-16
 parent: "Querying a File System"
 ---
-  
-The Avro format is experimental at this time. There are known issues when 
querying Avro files.  
+
+Drill provides functionality to query [Avro](https://avro.apache.org/) files.
+
+Starting from Drill 1.18, Avro file format supports [Schema 
provisioning]({{site.baseurl}}/docs/create-or-replace-schema/#usage-notes) 
feature.
+
+ Preparing example data
+
+Download the following [sample data 
file](https://github.com/apache/drill/blob/master/exec/java-exec/src/test/resources/avro/map_string_to_long.avro)
+and place it to the `/tmp/` folder to follow the example below.
+
+ Selecting data from Avro files
+
+To view the data in the `map_string_to_long.avro` file, issue the following 
query:
 
 Review comment:
   We can query all data from the `map_string_to_long.avro` file to see (what?)
   
   (Are we showing schema provisioning? Where did we create the schema file? 
Suggestion: show the file without a schema file. Identify the problem we want 
to fix. Create the schema file and do the query again, showing how we fixed the 
problem. As it is, as I tried to reword the sentence, I realized I'm not 
entirely clear what we're showing.)




[GitHub] [drill] paul-rogers commented on a change in pull request #2030: Update docs for Metastore to point that all format plugins are supported

2020-03-18 Thread GitBox
paul-rogers commented on a change in pull request #2030: Update docs for 
Metastore to point that all format plugins are supported
URL: https://github.com/apache/drill/pull/2030#discussion_r394577436
 
 

 ##
 File path: _docs/sql-reference/sql-commands/021-create-schema.md
 ##
 @@ -604,8 +608,32 @@ STATEMENT format displays the schema in a form compatible 
with the CREATE OR REP
 |

+--+
 
+### Altering Schema for a Table
+Table schema may be updated using the `ALTER SCHEMA` commands.
+
+The syntax for the command to add (or replace) columns / properties is the 
following:
+
+ALTER SCHEMA
+(FOR TABLE dfs.tmp.nation | PATH '/tmp/schema.json')
+ADD [OR REPLACE]
+[COLUMNS (col1 int, col2 varchar)]
+[PROPERTIES ('prop1'='val1', 'prop2'='val2')]
+
+Add command will fail if column or property with the same name exists, unless 
`OR REPLACE` keywords are indicated.
+Add command will fail, if the schema file does not exist.
+
+The syntax for the command to remove columns / properties is the following:
+
+ALTER SCHEMA
+(FOR TABLE dfs.tmp.nation | PATH '/tmp/schema.json')
+REMOVE
+[COLUMNS (col1 int, col2 varchar)]
+[PROPERTIES ('prop1'='val1', 'prop2'='val2')]
+
+Remove command won't fail if the column or property does not exist but will 
fail if the schema file is absent.
 
 Review comment:
   The command fails if the schema file does not exist. The command silently 
ignores a request to remove a column or property which does not exist.




[GitHub] [drill] paul-rogers commented on a change in pull request #2030: Update docs for Metastore to point that all format plugins are supported

2020-03-18 Thread GitBox
paul-rogers commented on a change in pull request #2030: Update docs for 
Metastore to point that all format plugins are supported
URL: https://github.com/apache/drill/pull/2030#discussion_r394552697
 
 

 ##
 File path: 
_docs/performance-tuning/drill-metastore/010-using-drill-metastore.md
 ##
 @@ -117,6 +119,14 @@ Drill can connect to any number of data sources, each of 
which may have its own
 As a result, the Metastore labels tables with a combination of (plugin 
configuration name, workspace name, table name).
 Note that if before renaming any of these items, you must delete table's 
Metadata entry and recreate it after renaming.
 
+### Using schema provisioning feature with Drill Metastore
+
+Drill Metastore allows specifying schema using the same syntax as
+ [Schema 
provisioning]({{site.baseurl}}/docs/plugin-configuration-basics/#specifying-the-schema-as-table-function-parameter)
 feature when used as a table function.
+User can specify table schema in the `ANALYZE` command, so it will be used for 
collecting table statistics and will be stored
+ to Drill Metastore to be used when submitting queries for this table similar 
to the case when user specifies schema
+ explicitly in the table function.
 
 Review comment:
   The Drill Metastore holds both schema and statistics information for a 
table. The `ANALYZE` command can infer the table schema for well-defined tables 
(such as many Parquet tables). Some tables are too complex or variable for 
Drill's schema inference to work well. For example, JSON tables often omit 
fields or have long runs of nulls so that Drill cannot determine column types. 
In these cases you can specify the correct schema based on your knowledge of 
the table's structure. You specify a schema in the `ANALYZE` command using 
the 
[Schema 
provisioning]({{site.baseurl}}/docs/plugin-configuration-basics/#specifying-the-schema-as-table-function-parameter)
 syntax.
   
   (Please provide an example.)
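
   Something along these lines might work as the example. It is only a sketch: it assumes the Metastore is enabled, that `dfs.tmp.text_nation` is a comma-delimited text table with a header row, and that its columns are those of the TPC-H `nation` table.

   ```sql
   -- Supply the schema inline so ANALYZE stores typed columns in the Metastore
   -- instead of relying on schema inference.
   -- Assumption: text_nation has the four TPC-H nation columns.
   ANALYZE TABLE table(dfs.tmp.`text_nation`(
       type => 'text',
       fieldDelimiter => ',',
       extractHeader => true,
       schema => 'inline=(
           `n_nationkey` INT not null,
           `n_name` VARCHAR not null,
           `n_regionkey` INT not null,
           `n_comment` VARCHAR not null)'
   )) REFRESH METADATA;
   ```

   Queries against the table submitted after this command should then use the stored schema, as with an explicit table function.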




[GitHub] [drill] paul-rogers commented on a change in pull request #2030: Update docs for Metastore to point that all format plugins are supported

2020-03-18 Thread GitBox
paul-rogers commented on a change in pull request #2030: Update docs for 
Metastore to point that all format plugins are supported
URL: https://github.com/apache/drill/pull/2030#discussion_r394537808
 
 

 ##
 File path: _docs/connect-a-data-source/035-plugin-configuration-basics.md
 ##
 @@ -147,6 +149,45 @@ fieldDelimiter => ',', extractHeader => true))``
 
 For more information about format plugin configuration see ["Text Files: CSV, 
TSV, PSV"]({{site.baseurl}}{{site.baseurl}}/docs/text-files-csv-tsv-psv/).  
 
+## Specifying the Schema as Table Function Parameter
+
+Starting from Drill 1.17, table schema may be indicated in the query using 
table function.
+
+It is useful when the user does not want to persist schema in table root 
location or when reading from file, not folder.
+Schema parameter can be used as an individual unit or together with format 
plugin table properties.
+
+Schema can be provided in the `SCHEMA` property inline or using the file.
+
+The syntax for inline schema is similar to the [CREATE OR REPLACE 
SCHEMA]({{site.baseurl}}/docs/create-or-replace-schema/#syntax):
+
+```
+SELECT a, b FROM TABLE (table_name(
+SCHEMA => 'inline=(column_name data_type [nullability] [format] [default] 
[properties {prop='val', ...})]'))
+```
+
+Example of usage:
+
+```
+select * from table(dfs.tmp.`text_table`(
+schema => 'inline=(col1 date properties {`drill.format` = `-MM-dd`}) 
+properties {`drill.strict` = `false`}'))
+```
+
+The syntax for indicating schema using the path:
+
+```
+select * from table(dfs.tmp.`text_table`(schema => 'path=`/tmp/my_schema`'))
+```
+
+The following example demonstrates applying provided schema alongside with 
format plugin table function parameters.
+Assuming that the user has CSV file with headers with extension that does not 
comply to a default text file with headers extension (ex: `cars.csvh-test`):
 
 Review comment:
   Suppose that you have a CSV file with headers and with a custom extension: 
`csvh-test`. You can combine the schema with format plugin properties:
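
   The query that follows could look roughly like this. It is a sketch only: `col1` is a placeholder column name, and the file is assumed to sit in the default `dfs.tmp` workspace.

   ```sql
   -- cars.csvh-test has a header row but a non-standard extension, so the
   -- text format is spelled out in the table function next to the schema.
   SELECT * FROM table(dfs.tmp.`cars.csvh-test`(
       type => 'text',
       fieldDelimiter => ',',
       extractHeader => true,
       schema => 'inline=(col1 date)'));
   ```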




[GitHub] [drill] paul-rogers commented on a change in pull request #2030: Update docs for Metastore to point that all format plugins are supported

2020-03-18 Thread GitBox
paul-rogers commented on a change in pull request #2030: Update docs for 
Metastore to point that all format plugins are supported
URL: https://github.com/apache/drill/pull/2030#discussion_r394544903
 
 

 ##
 File path: 
_docs/connect-a-data-source/plugins/114-image-metadata-format-plugin.md
 ##
 @@ -49,54 +49,53 @@ fileSystemMetadata|true|Set to true to extract filesystem 
metadata including the
 descriptive|true|Set to true to extract metadata in a human-readable string 
format. Set false to extract metadata in a machine-readable typed format.
 timeZone|null|Specify the time zone to interpret the timestamp with no time 
zone information. If the timestamp includes the time zone information, this 
value is ignored. If null is set, the local time zone is used.  
 
-##Examples  
+## Examples  
+
+Download the following image and place it to the `/tmp` folder to follow the 
examples.
 
 Review comment:
   To follow along with the examples, start by downloading the following image 
to your `/tmp` directory.
   
   (The documentation seems more friendly and approachable if we address the 
user directly and avoid the passive voice.)




[GitHub] [drill] paul-rogers commented on a change in pull request #2030: Update docs for Metastore to point that all format plugins are supported

2020-03-18 Thread GitBox
paul-rogers commented on a change in pull request #2030: Update docs for 
Metastore to point that all format plugins are supported
URL: https://github.com/apache/drill/pull/2030#discussion_r394548649
 
 

 ##
 File path: 
_docs/performance-tuning/drill-metastore/010-using-drill-metastore.md
 ##
 @@ -1,14 +1,16 @@
 ---
 title: "Using Drill Metastore"
 parent: "Drill Metastore"
-date: 2020-03-03
+date: 2020-03-17
 ---
 
 Drill 1.17 introduces the Drill Metastore which stores the table schema and 
table statistics. Statistics allow Drill to better create optimal query plans.
 
 The Metastore is a Beta feature; it is subject to change. We encourage you to 
try it and provide feedback.
 Because the Metastore is in Beta, the SQL commands and Metastore formats may 
change in the next release.
-{% include startnote.html %}In Drill 1.17, this feature is supported for 
Parquet tables only and is disabled by default.{% include endnote.html %}
+{% include startnote.html %}In Drill 1.17, this feature is supported for 
Parquet tables only and is disabled by default.
+Starting from Drill 1.18, this feature is supported for all **format** plugins 
except for MaprDB.
+{% include endnote.html %}
 
 Review comment:
   The Metastore is a beta feature and is subject to change. In particular, the 
SQL commands and Metastore format may change based on your experience and 
feedback.
   * In Drill 1.17, Metastore supports only tables in Parquet format. The 
feature is disabled by default.
   * In Drill 1.18, Metastore supports all format plugins (except MaprDB) for 
the file system plugin. The feature is still disabled by default.




[GitHub] [drill] paul-rogers commented on a change in pull request #2030: Update docs for Metastore to point that all format plugins are supported

2020-03-18 Thread GitBox
paul-rogers commented on a change in pull request #2030: Update docs for 
Metastore to point that all format plugins are supported
URL: https://github.com/apache/drill/pull/2030#discussion_r394536560
 
 

 ##
 File path: _docs/connect-a-data-source/035-plugin-configuration-basics.md
 ##
 @@ -147,6 +149,45 @@ fieldDelimiter => ',', extractHeader => true))``
 
 For more information about format plugin configuration see ["Text Files: CSV, 
TSV, PSV"]({{site.baseurl}}{{site.baseurl}}/docs/text-files-csv-tsv-psv/).  
 
+## Specifying the Schema as Table Function Parameter
+
+Starting from Drill 1.17, table schema may be indicated in the query using 
table function.
+
+It is useful when the user does not want to persist schema in table root 
location or when reading from file, not folder.
+Schema parameter can be used as an individual unit or together with format 
plugin table properties.
+
+Schema can be provided in the `SCHEMA` property inline or using the file.
+
+The syntax for inline schema is similar to the [CREATE OR REPLACE 
SCHEMA]({{site.baseurl}}/docs/create-or-replace-schema/#syntax):
+
+```
+SELECT a, b FROM TABLE (table_name(
+SCHEMA => 'inline=(column_name data_type [nullability] [format] [default] 
[properties {prop='val', ...})]'))
+```
+
+Example of usage:
+
+```
+select * from table(dfs.tmp.`text_table`(
+schema => 'inline=(col1 date properties {`drill.format` = `-MM-dd`}) 
+properties {`drill.strict` = `false`}'))
+```
+
+The syntax for indicating schema using the path:
 
 Review comment:
   Alternatively, you can also specify the path to a schema file. For example:




[GitHub] [drill] paul-rogers commented on a change in pull request #2030: Update docs for Metastore to point that all format plugins are supported

2020-03-18 Thread GitBox
paul-rogers commented on a change in pull request #2030: Update docs for 
Metastore to point that all format plugins are supported
URL: https://github.com/apache/drill/pull/2030#discussion_r394544334
 
 

 ##
 File path: _docs/connect-a-data-source/plugins/080-rdbms-storage-plugin.md
 ##
 @@ -142,4 +153,20 @@ You may need to qualify a table name with a schema name 
for Drill to return data
| 2 | 1.2.3.5  |
+---+--+
 
+### Example of Postgres Configuration with `sourceParameters` configuration 
property
 
+{
+  type: "jdbc",
+  enabled: true,
+  driver: "org.postgresql.Driver",
+  url:"jdbc:postgresql://1.2.3.4/mydatabase?defaultRowFetchSize=2",
+  username:"user",
+  password:"password",
 
 Review comment:
   (Nit: for display formatting, please include a space after the colon.)




[GitHub] [drill] paul-rogers commented on a change in pull request #2030: Update docs for Metastore to point that all format plugins are supported

2020-03-18 Thread GitBox
paul-rogers commented on a change in pull request #2030: Update docs for 
Metastore to point that all format plugins are supported
URL: https://github.com/apache/drill/pull/2030#discussion_r394553793
 
 

 ##
 File path: 
_docs/performance-tuning/drill-metastore/010-using-drill-metastore.md
 ##
 @@ -442,3 +455,114 @@ apache drill (information_schema)> SELECT * FROM 
INFORMATION_SCHEMA.`COLUMNS` WH
 
+---+--++-+--++-+---+--++---+-+---++---++-+---+---+-+-+---+---+---+
 17 rows selected (0.183 seconds)
 ```
+
+### Provisioning schema for Drill Metastore
+
+ Directory and File Setup
+
+Set up storage plugin for desired file system, as described here:
+ [Connecting Drill to a File 
System]({{site.baseurl}}/docs/file-system-storage-plugin/#connecting-drill-to-a-file-system).
+
+Set `store.format` to `csvh`:
+
+```
+SET `store.format`='csvh';
++--+---+
+|  ok  |summary|
++--+---+
+| true | store.format updated. |
++--+---+
+```
+
+Create text table based on the sample `/tpch/nation.parquet` table from `cp` 
plugin:
 
 Review comment:
   Create a text table...




[GitHub] [drill] paul-rogers commented on a change in pull request #2030: Update docs for Metastore to point that all format plugins are supported

2020-03-18 Thread GitBox
paul-rogers commented on a change in pull request #2030: Update docs for 
Metastore to point that all format plugins are supported
URL: https://github.com/apache/drill/pull/2030#discussion_r394538417
 
 

 ##
 File path: _docs/connect-a-data-source/plugins/080-rdbms-storage-plugin.md
 ##
 @@ -9,14 +9,25 @@ As with any source, Drill supports joins within and between 
all systems. Drill a
 
 ## Using the RDBMS Storage Plugin
 
-Drill is designed to work with any relational datastore that provides a JDBC 
driver. Drill is actively tested with Postgres, MySQL, Oracle, MSSQL and Apache 
Derby. For each system, you will follow three basic steps for setup:
+Drill is designed to work with any relational datastore that provides a JDBC 
driver. Drill is actively tested with
+ Postgres, MySQL, Oracle, MSSQL, Apache Derby and H2. For each system, you 
will follow three basic steps for setup:
 
   1. [Install Drill]({{ site.baseurl 
}}/docs/installing-drill-in-embedded-mode), if you do not already have it 
installed.
   2. Copy your database's JDBC driver into the jars/3rdparty directory. 
(You'll need to do this on every node.)  
   3. Restart Drill. See [Starting Drill in Distributed 
Mode]({{site.baseurl}}/docs/starting-drill-in-distributed-mode/).
-  4. Add a new storage configuration to Drill through the Web UI. Example 
configurations for [Oracle](#Example-Oracle-Configuration), [SQL 
Server](#Example-SQL-Server-Configuration), 
[MySQL](#Example-MySQL-Configuration) and 
[Postgres](#Example-Postgres-Configuration) are provided below.
-  
-**Example: Working with MySQL**
+  4. Add a new storage configuration to Drill through the Web UI. Example 
configurations for [Oracle](#example-oracle-configuration), [SQL 
Server](#example-sql-server-configuration), 
[MySQL](#example-mysql-configuration) and 
[Postgres](#example-postgres-configuration) are provided below.
+
+## Setting data source parameters in the storage plugin configuration
+
+Starting from Drill 1.18.0, new JDBC storage plugin configuration property 
`sourceParameters` was introduced to allow
 
 Review comment:
   Drill's JDBC storage plugin configuration allows you to specify database 
parameters as JSON key/value pairs. Drill 1.18 introduced a new JDBC storage 
plugin property called `sourceParameters` to handle query parameter names which 
are not valid JSON identifiers. See  
[HikariCP](https://github.com/brettwooldridge/HikariCP#configuration-knobs-baby)
 for details. See [Example of Postgres Configuration with `sourceParameters` 
configuration 
property](#example-of-postgres-configuration-with-sourceparameters-configuration-property)
   for an example.
   
   (Note: please specify which parameters we're talking about. I made up the 
"database parameter" part; please replace with an accurate description.)




[GitHub] [drill] paul-rogers commented on a change in pull request #2030: Update docs for Metastore to point that all format plugins are supported

2020-03-18 Thread GitBox
paul-rogers commented on a change in pull request #2030: Update docs for 
Metastore to point that all format plugins are supported
URL: https://github.com/apache/drill/pull/2030#discussion_r394535049
 
 

 ##
 File path: _docs/connect-a-data-source/035-plugin-configuration-basics.md
 ##
 @@ -147,6 +149,45 @@ fieldDelimiter => ',', extractHeader => true))``
 
 For more information about format plugin configuration see ["Text Files: CSV, 
TSV, PSV"]({{site.baseurl}}{{site.baseurl}}/docs/text-files-csv-tsv-psv/).  
 
+## Specifying the Schema as Table Function Parameter
+
+Starting from Drill 1.17, table schema may be indicated in the query using 
table function.
+
+It is useful when the user does not want to persist schema in table root 
location or when reading from file, not folder.
+Schema parameter can be used as an individual unit or together with format 
plugin table properties.
 
 Review comment:
   (Combine four paragraphs.) Table schemas normally reside in the root folder 
of each table. You can also specify a schema for an individual query using a 
table function and specifying the `SCHEMA` property. You can combine the schema 
with format plugin properties. The syntax is similar...




[GitHub] [drill] paul-rogers commented on a change in pull request #2030: Update docs for Metastore to point that all format plugins are supported

2020-03-18 Thread GitBox
paul-rogers commented on a change in pull request #2030: Update docs for 
Metastore to point that all format plugins are supported
URL: https://github.com/apache/drill/pull/2030#discussion_r394536724
 
 

 ##
 File path: _docs/connect-a-data-source/035-plugin-configuration-basics.md
 ##
 @@ -147,6 +149,45 @@ fieldDelimiter => ',', extractHeader => true))``
 
 For more information about format plugin configuration see ["Text Files: CSV, 
TSV, PSV"]({{site.baseurl}}{{site.baseurl}}/docs/text-files-csv-tsv-psv/).  
 
+## Specifying the Schema as Table Function Parameter
+
+Starting from Drill 1.17, table schema may be indicated in the query using 
table function.
+
+It is useful when the user does not want to persist schema in table root 
location or when reading from file, not folder.
+Schema parameter can be used as an individual unit or together with format 
plugin table properties.
+
+Schema can be provided in the `SCHEMA` property inline or using the file.
+
+The syntax for inline schema is similar to the [CREATE OR REPLACE 
SCHEMA]({{site.baseurl}}/docs/create-or-replace-schema/#syntax):
+
+```
+SELECT a, b FROM TABLE (table_name(
+SCHEMA => 'inline=(column_name data_type [nullability] [format] [default] 
[properties {prop='val', ...})]'))
+```
+
+Example of usage:
 
 Review comment:
   You can specify the schema inline within the query. For example:




[GitHub] [drill] paul-rogers commented on a change in pull request #2030: Update docs for Metastore to point that all format plugins are supported

2020-03-18 Thread GitBox
paul-rogers commented on a change in pull request #2030: Update docs for 
Metastore to point that all format plugins are supported
URL: https://github.com/apache/drill/pull/2030#discussion_r394555748
 
 

 ##
 File path: 
_docs/performance-tuning/drill-metastore/010-using-drill-metastore.md
 ##
 @@ -442,3 +455,114 @@ apache drill (information_schema)> SELECT * FROM 
INFORMATION_SCHEMA.`COLUMNS` WH
 
+---+--++-+--++-+---+--++---+-+---++---++-+---+---+-+-+---+---+---+
 17 rows selected (0.183 seconds)
 ```
+
+### Provisioning schema for Drill Metastore
+
+ Directory and File Setup
+
+Set up storage plugin for desired file system, as described here:
+ [Connecting Drill to a File 
System]({{site.baseurl}}/docs/file-system-storage-plugin/#connecting-drill-to-a-file-system).
+
+Set `store.format` to `csvh`:
+
+```
+SET `store.format`='csvh';
++--+---+
+|  ok  |summary|
++--+---+
+| true | store.format updated. |
++--+---+
+```
+
+Create text table based on the sample `/tpch/nation.parquet` table from `cp` 
plugin:
+
+```
+create table dfs.tmp.text_nation as (select * from cp.`/tpch/nation.parquet`);
++--+---+
+| Fragment | Number of records written |
++--+---+
+| 0_0  | 25|
++--+---+
+```
+
+Query the table `text_nation`:
+
+```
+SELECT count(*) FROM dfs.tmp.`text_nation`;
+++
+| EXPR$0 |
+++
+| 25 |
+++
+```
 
 Review comment:
   (Suggestion: since we are applying a schema, show the original types using 
the clunky `typeof()` functions. This will show that the columns start as 
`VARCHAR`, but that applying the schema gives them more useful types. 
Otherwise, I think the point may be lost on most users.
   
   And, yes, we should have a `DESCRIBE TABLE` to do the job instead of `SELECT 
typeof(n_nationkey), typeof(...`)
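
   A sketch of what that check could look like, assuming the `text_nation` table created earlier in the diff and the TPC-H `nation` column names:

   ```sql
   -- Inspect the types Drill assigns before any schema is applied.
   -- For a text table these are expected to be VARCHAR.
   SELECT typeof(n_nationkey) AS nationkey_type,
          typeof(n_name)      AS name_type,
          typeof(n_regionkey) AS regionkey_type
   FROM dfs.tmp.`text_nation`
   LIMIT 1;
   ```

   Running the same query again after the schema is provisioned would show the before/after difference directly.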




[GitHub] [drill] paul-rogers commented on a change in pull request #2030: Update docs for Metastore to point that all format plugins are supported

2020-03-18 Thread GitBox
paul-rogers commented on a change in pull request #2030: Update docs for 
Metastore to point that all format plugins are supported
URL: https://github.com/apache/drill/pull/2030#discussion_r394545980
 
 

 ##
 File path: 
_docs/connect-a-data-source/plugins/114-image-metadata-format-plugin.md
 ##
 @@ -49,54 +49,53 @@ fileSystemMetadata|true|Set to true to extract filesystem 
metadata including the
 descriptive|true|Set to true to extract metadata in a human-readable string 
format. Set false to extract metadata in a machine-readable typed format.
 timeZone|null|Specify the time zone to interpret the timestamp with no time 
zone information. If the timestamp includes the time zone information, this 
value is ignored. If null is set, the local time zone is used.  
 
-##Examples  
+## Examples  
+
+Download the following image and place it to the `/tmp` folder to follow the 
examples.
+
+[![image]({{ site.baseurl }}/images/7671b34d6e8a4d050f75278f10f1a08.jpg)]({{ 
site.baseurl }}/images/7671b34d6e8a4d050f75278f10f1a08.jpg)
 
 A Drill query on a JPEG file with the property descriptive: true
 
-   0: jdbc:drill:zk=local> select FileName, * from 
dfs.`4349313028_f69ffa0257_o.jpg`;  
-   
+--+--+--+++-+--+--+---++---+--+--++---++-+-+--+--+--++--+-+---+---+--+-+--+
-   | FileName | FileSize | FileDateTime | Format | PixelWidth | 
PixelHeight | BitsPerPixel | DPIWidth | DPIHeight | Orientaion | ColorMode | 
HasAlpha | Duration | VideoCodec | FrameRate | AudioCodec | AudioSampleSize | 
AudioSampleRate | JPEG | JFIF | ExifIFD0 | ExifSubIFD | Interoperability | GPS 
| ExifThumbnail | Photoshop | IPTC | Huffman | FileType |
-   
+--+--+--+++-+--+--+---++---+--+--++---++-+-+--+--+--++--+-+---+---+--+-+--+
-   | 4349313028_f69ffa0257_o.jpg | 257213 bytes | Fri Mar 09 12:09:34 
+08:00 2018 | JPEG | 1199 | 800 | 24 | 96 | 96 | Unknown (0) | RGB | false | 
00:00:00 | Unknown | 0 | Unknown | 0 | 0 | 
{"CompressionType":"Baseline","DataPrecision":"8 bits","ImageHeight":"800 
pixels","ImageWidth":"1199 pixels","NumberOfComponents":"3","Component1":"Y 
component: Quantization table 0, Sampling factors 2 horiz/2 
vert","Component2":"Cb component: Quantization table 1, Sampling factors 1 
horiz/1 vert","Component3":"Cr component: Quantization table 1, Sampling 
factors 1 horiz/1 vert"} | 
{"Version":"1.1","ResolutionUnits":"inch","XResolution":"96 
dots","YResolution":"96 
dots","ThumbnailWidthPixels":"0","ThumbnailHeightPixels":"0"} | 
{"Software":"Picasa 3.0"} | 
{"ExifVersion":"2.10","UniqueImageID":"d65e93b836d15a0c5e041e6b7258c76e"} | 
{"InteroperabilityIndex":"Unknown ()","InteroperabilityVersion":"1.00"} | 
{"GPSVersionID":".022","GPSLatitudeRef":"N","GPSLatitude":"47° 32' 
15.98\"","GPSLongitudeRef":"W","GPSLongitude":"-122° 2' 
6.37\"","GPSAltitudeRef":"Sea level","GPSAltitude":"0 metres"} | 
{"Compression":"JPEG (old-style)","XResolution":"72 dots per 
inch","YResolution":"72 dots per 
inch","ResolutionUnit":"Inch","ThumbnailOffset":"414 
bytes","ThumbnailLength":"7213 bytes"} | {} | 
{"Keywords":"135;2002;issaquah;police car;wa;washington"} | 
{"NumberOfTables":"4 Huffman tables"} | 
{"DetectedFileTypeName":"JPEG","DetectedFileTypeLongName":"Joint Photographic 
Experts 
Group","DetectedMIMEType":"image/jpeg","ExpectedFileNameExtension":"jpg"} |
-   
+--+--+--+++-+--+--+---++---+--+--++---++-+-+--+--+--++--+-+---+---+--+-+--+
- 
+select FileName, * from dfs.tmp.`7671b34d6e8a4d050f75278f10f1a08.jpg`;
+

[GitHub] [drill] paul-rogers commented on a change in pull request #2030: Update docs for Metastore to point that all format plugins are supported

2020-03-18 Thread GitBox
paul-rogers commented on a change in pull request #2030: Update docs for 
Metastore to point that all format plugins are supported
URL: https://github.com/apache/drill/pull/2030#discussion_r394572017
 
 

 ##
 File path: _docs/sql-reference/sql-commands/021-create-schema.md
 ##
 @@ -604,8 +608,32 @@ STATEMENT format displays the schema in a form compatible 
with the CREATE OR REP
 |

+--+
 
+### Altering Schema for a Table
+Table schema may be updated using the `ALTER SCHEMA` commands.
 
 Review comment:
   Use the `ALTER SCHEMA` command to update your table schema. The command can 
add or replace columns. Or, it can update properties for the table or 
individual columns. Syntax:
   
   ((Replaces next sentence also.))
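
   A concrete statement next to the syntax block might also help. A sketch using the placeholder names from the diff (`col1` and `prop1` are illustrative, not real columns or properties of `nation`):

   ```sql
   -- Requires an existing schema file for the table.
   -- OR REPLACE lets the command overwrite col1 or prop1 if already defined;
   -- without it the command fails on a name clash.
   ALTER SCHEMA
   FOR TABLE dfs.tmp.nation
   ADD OR REPLACE
   COLUMNS (col1 int)
   PROPERTIES ('prop1'='val1');
   ```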




[GitHub] [drill] paul-rogers commented on a change in pull request #2030: Update docs for Metastore to point that all format plugins are supported

2020-03-18 Thread GitBox
paul-rogers commented on a change in pull request #2030: Update docs for 
Metastore to point that all format plugins are supported
URL: https://github.com/apache/drill/pull/2030#discussion_r394543770
 
 

 ##
 File path: _docs/connect-a-data-source/plugins/080-rdbms-storage-plugin.md
 ##
 @@ -101,9 +112,9 @@ For MySQL, Drill has been tested with MySQL's 
[mysql-connector-java-5.1.37-bin.j
   password:"password"
 }  
 
-**Example Postgres Configuration**
+### Example Postgres Configuration
 
-For Postgres, Drill has been tested with Postgres's 
[9.1-901-1.jdbc4](http://central.maven.org/maven2/org/postgresql/postgresql/) 
driver (any recent driver should work). Copy this driver file to all nodes.
+For Postgres, Drill has been tested with Postgres's 
[42.2.11](https://mvnrepository.com/artifact/org.postgresql/postgresql) driver 
(any recent driver should work). Copy this driver file to all nodes.
 
 Review comment:
   Drill is tested with the Postgres driver version ... Copy this driver jar 
(?) to the (which?) folder on all nodes.

