[GitHub] [drill] paul-rogers commented on a change in pull request #2030: Update docs for Metastore to point that all format plugins are supported
paul-rogers commented on a change in pull request #2030: Update docs for Metastore to point that all format plugins are supported
URL: https://github.com/apache/drill/pull/2030#discussion_r395773086

## File path: _docs/performance-tuning/drill-metastore/010-using-drill-metastore.md ##
@@ -250,6 +282,9 @@ A table can be divided into directories, called "partitions". The `PARTITIONS` t
 - Applies to tables stored as Parquet files and only when stored in the `DFS` storage plugin.
 - Disabled by default. You must enable this feature through the `metastore.enabled` system/session option.
+### Limitations of the 1.18 release
+ - Applies to all file system storage plugin formats except for MaprDB.

Review comment:
Again, since Drill 1.17 is released, should we list Drill 1.17 limitations as well?

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at: us...@infra.apache.org

With regards,
Apache Git Services
[GitHub] [drill] paul-rogers commented on a change in pull request #2030: Update docs for Metastore to point that all format plugins are supported
paul-rogers commented on a change in pull request #2030: Update docs for Metastore to point that all format plugins are supported
URL: https://github.com/apache/drill/pull/2030#discussion_r395772674

## File path: _docs/performance-tuning/drill-metastore/010-using-drill-metastore.md ##
@@ -103,20 +105,50 @@ Schema information and summary statistics also computed and stored for table seg
 The detailed metadata schema is described [here](https://github.com/apache/drill/tree/master/metastore/metastore-api#metastore-tables).
 You can try out the metadata to get a sense of what is available, by using the
- [Inspect the Metastore using `INFORMATION_SCHEMA` tables]({{site.baseurl}}/docs/using-drill-metastore/#inspect-the-metastore-using-information_schema-tables) tutorial.
+ [Inspect the Metastore using `INFORMATION_SCHEMA` tables](#inspect-the-metastore-using-information_schema-tables) tutorial.
 Every table described by the Metastore may be a bare file or one or more files that reside in one or more directories.
 If a table consists of a single directory or file, then it is non-partitioned. The single directory can contain any number of files.
 Larger tables tend to have subdirectories. Each subdirectory is a partition and such a table are called "partitioned".
-Please refer to [Exposing Drill Metastore metadata through `INFORMATION_SCHEMA` tables]({{site.baseurl}}/docs/using-drill-metastore/#exposing-drill-metastore-metadata-through-information_schema-tables)
+Please refer to [Exposing Drill Metastore metadata through `INFORMATION_SCHEMA` tables](#exposing-drill-metastore-metadata-through-information_schema-tables)
 for information, how to query partitions and segments metadata.
 A traditional database divides tables into schemas and tables.
 Drill can connect to any number of data sources, each of which may have its own schema.
 As a result, the Metastore labels tables with a combination of (plugin configuration name, workspace name, table name).
 Note that if before renaming any of these items, you must delete table's Metadata entry and recreate it after renaming.
+### Using schema provisioning feature with Drill Metastore
+
+The Drill Metastore holds both schema and statistics information for a table. The `ANALYZE` command can infer the table
+ schema for well-defined tables (such as many Parquet tables). Some tables are too complex or variable for Drill's
+ schema inference to work well. For example, JSON tables often omit fields or have long runs of nulls so that Drill
+ cannot determine column types. In these cases, you can specify the correct schema based on your knowledge of the
+ table's structure. You specify a schema in the `ANALYZE` command using the
+ [Schema provisioning]({{site.baseurl}}/docs/plugin-configuration-basics/#specifying-the-schema-as-table-function-parameter) syntax.
+
+Please refer to [Provisioning schema for Drill Metastore](#provisioning-schema-for-drill-metastore) for examples of usage.
+
+### Schema priority
+
+Drill allows the following ways for providing table schema:
+ - providing schema with table function:
+   - specifying inline schema;
+   - specifying path to the schema file;
+ - using schema file in table root directory;
+ - using schema from Drill Metastore.
+
+The highest priority has schema provided in table function.
+
+Second priority has schema file (if `store.table.use_schema_file` is enabled).
+
+If neither of the above schema sources wasn't specified, schema from Drill Metastore will be used.
+
+Regardless of the source of the schema, it will be used and handled in the same way.
+
+Table metadata from Drill Metastore will be used if it is available regardless of the schema source.
+

Review comment:
Drill uses metadata during both query planning and execution. Drill gives you multiple ways to provide a schema.

When you run the `ANALYZE TABLE` command, Drill uses the following rules to choose the table schema to be stored in the Metastore. In priority order:

* A schema file, created with `CREATE OR REPLACE SCHEMA`, in the table root directory.
* Schema inferred from file data.

To plan a query, Drill requires information about your file partitions (if any) and about row and column cardinality. Drill does not use the provided schema for planning as it does not provide this metadata. Instead, at plan time Drill obtains metadata from one of the following, again in priority order:

* The Drill Metastore, if available.
* Inferred from file data. Drill scans the table's directory structure to identify partitions. Drill estimates row counts based on the file size. Drill uses default estimates for column cardinality.

At query execution time, a schema tells Drill the shape of your data and how that data should be converted to Drill's SQL types. Your choices for execution-time schema, in priority order, are:

* With a table function:
  - specify
[GitHub] [drill] paul-rogers commented on a change in pull request #2030: Update docs for Metastore to point that all format plugins are supported
paul-rogers commented on a change in pull request #2030: Update docs for Metastore to point that all format plugins are supported
URL: https://github.com/apache/drill/pull/2030#discussion_r395758027

## File path: _docs/performance-tuning/drill-metastore/010-using-drill-metastore.md ##
@@ -1,14 +1,16 @@
 ---
 title: "Using Drill Metastore"
 parent: "Drill Metastore"
-date: 2020-03-03
+date: 2020-03-17
 ---
 Drill 1.17 introduces the Drill Metastore which stores the table schema and table statistics. Statistics allow Drill to better create optimal query plans.
 The Metastore is a Beta feature; it is subject to change. We encourage you to try it and provide feedback. Because the Metastore is in Beta, the SQL commands and Metastore formats may change in the next release.
-{% include startnote.html %}In Drill 1.17, this feature is supported for Parquet tables only and is disabled by default.{% include endnote.html %}
+{% include startnote.html %}In Drill 1.17, this feature is supported for Parquet tables only and is disabled by default.
+Starting from Drill 1.18, this feature is supported for all **format** plugins except for MaprDB.
+{% include endnote.html %}

Review comment:
Looks like the note for Drill 1.17 was removed? Should we keep it since Drill 1.17 is the currently-released version?
[GitHub] [drill] paul-rogers commented on a change in pull request #2030: Update docs for Metastore to point that all format plugins are supported
paul-rogers commented on a change in pull request #2030: Update docs for Metastore to point that all format plugins are supported
URL: https://github.com/apache/drill/pull/2030#discussion_r395774025

## File path: _docs/sql-reference/sql-commands/021-create-schema.md ##
@@ -209,13 +220,23 @@ Values are trimmed when converting to any type, except for varchar.
 ## Usage Notes
-### General Information
-- Schema provisioning only works with tables defined as directories because Drill must have a place to store the schema file. The directory can contain one or more files.
+### General Information
+- Schema provisioning is supported only for the file system (dfs-based) storage plugins. It works by placing a file `.drill.schema` in the root folder of tables defined as a directory. The directory can contain any number of files (even just one) in addition to the schema file.

Review comment:
Schema provisioning works only with the file system...
[GitHub] [drill] paul-rogers commented on a change in pull request #2030: Update docs for Metastore to point that all format plugins are supported
paul-rogers commented on a change in pull request #2030: Update docs for Metastore to point that all format plugins are supported
URL: https://github.com/apache/drill/pull/2030#discussion_r395775651

## File path: _docs/sql-reference/sql-commands/021-create-schema.md ##
@@ -446,7 +467,8 @@ Note that date, time type conversion uses the Joda time library, thus the format
 ## Limitations
-This feature is currently in the alpha phase (preview, experimental) for Drill 1.16 and only applies to text (CSV) files in this release. You must enable this feature through the `exec.storage.enable_v3_text_reader` and `store.table.use_schema_file` system/session options.
+This feature applies to format plugins that use the `Enhanced Vector Framework`. You must enable this feature through

Review comment:
((Only developers know about EVF. Maybe:))

Schema provisioning works with selected readers. If you develop a format plugin, you must use the `Enhanced Vector Framework` (rather than the "classic" techniques) to enable schema support.

To use schema provisioning, you must first enable it with the ... option.
[GitHub] [drill] paul-rogers commented on a change in pull request #2030: Update docs for Metastore to point that all format plugins are supported
paul-rogers commented on a change in pull request #2030: Update docs for Metastore to point that all format plugins are supported
URL: https://github.com/apache/drill/pull/2030#discussion_r394567289

## File path: _docs/sql-reference/sql-commands/021-create-schema.md ##
@@ -209,13 +210,15 @@ Values are trimmed when converting to any type, except for varchar.
 ## Usage Notes
-### General Information
-- Schema provisioning only works with tables defined as directories because Drill must have a place to store the schema file. The directory can contain one or more files.
+### General Information
+- Schema provisioning using schema file works only with tables defined as directories because Drill must have a place to store the schema file. The directory can contain one or more files.
 - Text files must have headers. The default extension for delimited text files with headers is `.csvh`. Note that the column names that appear in the headers match column definitions in the schema.
 - You do not have to enumerate all columns in a file when creating a schema. You can indicate the columns of interest only.
 - Columns in the defined schema do not have to be in the same order as in the data file.
 - Column names must match. The case can differ, for example “name” and “NAME” are acceptable.
-- Queries on columns with data types that cannot be converted fail with a `DATA_READ_ERROR`.
+- Queries on columns with data types that cannot be converted fail with a `DATA_READ_ERROR`.

Review comment:
Drill is unique in that it infers table schema at runtime. However, sometimes schema inference can fail when Drill cannot infer the correct types. For example, Drill treats all fields in a text file as text. Drill may not be able to determine the type of fields in JSON files if the fields are missing or set to `null` in the first few records in the file. Drill issues a `DATA_READ_ERROR` when runtime schema inference fails.

When Drill cannot correctly infer the schema, you can instead use your knowledge of the file layout to tell Drill the proper schema to use. Schema provisioning is the feature you use to specify the schema. You can provide a schema for the file as a whole using the `CREATE OR REPLACE SCHEMA` command ((insert link)) or for a single query using a table function ((insert link)). Please see ... for details.
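One way to illustrate the "schema for the file as a whole" case from the comment above is sketched here. It assumes the `dfs.tmp.text_table` directory of `.csvh` files that appears in the `EXPLAIN PLAN` output quoted elsewhere in this review. The `id` column comes from that output, while `name` is a hypothetical column added only for illustration.

```sql
-- Schema files are only honored when this session/system option is enabled.
ALTER SESSION SET `store.table.use_schema_file` = true;

-- Stores the schema as a .drill.schema file in the table's root directory.
-- `name` is a hypothetical column; `id` is taken from the quoted plan output.
CREATE OR REPLACE SCHEMA (`id` INT, `name` VARCHAR)
FOR TABLE dfs.tmp.`text_table`;
```

Because the schema lives next to the data, every later query on `dfs.tmp.text_table` would pick it up without repeating the column definitions.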
[GitHub] [drill] paul-rogers commented on a change in pull request #2030: Update docs for Metastore to point that all format plugins are supported
paul-rogers commented on a change in pull request #2030: Update docs for Metastore to point that all format plugins are supported
URL: https://github.com/apache/drill/pull/2030#discussion_r394570531

## File path: _docs/sql-reference/sql-commands/021-create-schema.md ##
@@ -569,7 +573,7 @@ Running EXPLAIN PLAN, you can see that type conversion was done while reading da
 00-02Scan(table=[[dfs, tmp, text_table]], groupscan=[EasyGroupScan [selectionRoot=file:/tmp/text_table, numFiles=2, columns=[`**`], files=[file:/tmp/text_table/1.csvh, file:/tmp/text_table/2.csvh], schema=[TupleSchema [PrimitiveColumnMetadata [`id` (INT(0, 0):OPTIONAL)])
 ### Describing Schema for a Table
-After you create schema, you can examine the schema using the DESCRIBE SCHEMA FOR TABLE command. Schema can print to JSON or STATEMENT format. JSON format is the default if no format is indicated in the query. Schema displayed in JSON format is the same as the JSON format in the `.drill.schema` file.
+After you create schema, you can examine the schema using the `DESCRIBE SCHEMA FOR TABLE` command. Schema can print to `JSON` or `STATEMENT` format. `JSON` format is the default if no format is indicated in the query. Schema displayed in `JSON `format is the same as the `JSON` format in the `.drill.schema` file.

Review comment:
((Note: text here is bold. Is this a Github error or a bug in the markdown?))

You can verify the provided schema using the `DESCRIBE SCHEMA FOR TABLE` command ((insert link)). This command can format the schema in two formats. The `JSON` format is the same as the contents of the `.drill.schema` file stored in your table directory.

((Example here.))

You can also use the `STATEMENT` format to recover the SQL statement to recreate the schema. You can easily copy, reuse or edit this statement to change the schema or reuse the statement for other files.

((Example here.))
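The two `((Example here.))` placeholders might be filled along the following lines. This is a sketch only: it reuses the `dfs.tmp.text_table` table from the quoted plan output and assumes the command accepts an optional `AS JSON` or `AS STATEMENT` clause, which matches the two formats named in the diff but should be checked against the command reference before it goes into the docs.

```sql
-- JSON is the default format, so the AS clause could also be omitted here.
-- The output mirrors the contents of the .drill.schema file.
DESCRIBE SCHEMA FOR TABLE dfs.tmp.`text_table` AS JSON;

-- STATEMENT returns a CREATE OR REPLACE SCHEMA statement that can be
-- copied, edited and re-run against this or another table.
DESCRIBE SCHEMA FOR TABLE dfs.tmp.`text_table` AS STATEMENT;
```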
[GitHub] [drill] paul-rogers commented on a change in pull request #2030: Update docs for Metastore to point that all format plugins are supported
paul-rogers commented on a change in pull request #2030: Update docs for Metastore to point that all format plugins are supported
URL: https://github.com/apache/drill/pull/2030#discussion_r394563481

## File path: _docs/sql-reference/sql-commands/021-create-schema.md ##
@@ -209,13 +210,15 @@ Values are trimmed when converting to any type, except for varchar.
 ## Usage Notes
-### General Information
-- Schema provisioning only works with tables defined as directories because Drill must have a place to store the schema file. The directory can contain one or more files.
+### General Information
+- Schema provisioning using schema file works only with tables defined as directories because Drill must have a place to store the schema file. The directory can contain one or more files.

Review comment:
Schema provisioning is supported only for the file system (dfs-based) storage plugins. It works by placing a file ((insert name)) in the root folder of tables defined as a directory. The directory can contain any number of files (even just one) in addition to the schema file.

((Here, double parens are notes to you, single parens are parts of the suggested text.))
[GitHub] [drill] paul-rogers commented on a change in pull request #2030: Update docs for Metastore to point that all format plugins are supported
paul-rogers commented on a change in pull request #2030: Update docs for Metastore to point that all format plugins are supported
URL: https://github.com/apache/drill/pull/2030#discussion_r394562214

## File path: _docs/sql-reference/sql-commands/021-create-schema.md ##
@@ -4,9 +4,9 @@ date: 2019-05-31
 parent: "SQL Commands"
 ---
-Starting in Drill 1.16, you can define a schema for text files using the CREATE OR REPLACE SCHEMA command. Schema is only available for tables represented by a directory. To use this feature with a single file, put the file inside a directory, and use the directory name to query the table.
+Starting in Drill 1.16, you can define a schema for text files using the `CREATE OR REPLACE SCHEMA` command. Such schema is only available for tables represented by a directory. To use this feature with a single file, put the file inside a directory, and use the directory name to query the table or use table function with schema parameter instead.

Review comment:
Starting in Drill 1.16 you can define a schema for text files. Drill places a schema file in the root directory of your text table and so the schema feature only works for tables within a directory. If you have a single-file table, simply create a directory to hold that file and the schema file.

In Drill 1.17, the provided schema feature is disabled by default. Enable it by setting the `store.table.use_schema_file` system/session option to true:

```
ALTER SESSION SET `store.table.use_schema_file` = true
```

Next you create the schema using the `CREATE OR REPLACE SCHEMA` command (as described where? Please point to an example, or put the example here.)

As described (insert link), you can also use a table function to apply a schema to individual queries. Or, you can place the table function within a view, and query the table through the view. (Would be good to have examples of these also.)
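The table-function and view variants requested at the end of the comment above could look roughly like this. It is a minimal sketch that reuses the `dfs.tmp.text_table` table and the `id` column from the plan output quoted elsewhere in this review; the view name `text_table_v` is hypothetical, and the `schema => 'inline=(...)'` form follows the table function example quoted in the `ANALYZE TABLE` diff.

```sql
-- Apply an inline schema to a single query; no schema file is involved.
SELECT *
FROM table(dfs.tmp.`text_table` (schema => 'inline=(`id` INT)'));

-- Wrap the table function in a view (hypothetical name) so that callers
-- query the view and do not have to repeat the schema.
CREATE OR REPLACE VIEW dfs.tmp.`text_table_v` AS
SELECT *
FROM table(dfs.tmp.`text_table` (schema => 'inline=(`id` INT)'));

SELECT * FROM dfs.tmp.`text_table_v`;
```

The view approach keeps the schema in one place without writing a `.drill.schema` file, which may be useful for single-file tables that are not stored in a directory.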
[GitHub] [drill] paul-rogers commented on a change in pull request #2030: Update docs for Metastore to point that all format plugins are supported
paul-rogers commented on a change in pull request #2030: Update docs for Metastore to point that all format plugins are supported
URL: https://github.com/apache/drill/pull/2030#discussion_r394557058

## File path: _docs/query-data/query-a-file-system/009-querying-avro-files.md ##
@@ -3,5 +3,30 @@ title: "Querying Avro Files"
 date: 2019-04-16
 parent: "Querying a File System"
 ---
-
-The Avro format is experimental at this time. There are known issues when querying Avro files.
+
+Drill provides functionality to query [Avro](https://avro.apache.org/) files.
+
+Starting from Drill 1.18, Avro file format supports [Schema provisioning]({{site.baseurl}}/docs/create-or-replace-schema/#usage-notes) feature.
+
+ Preparing example data
+
+Download the following [sample data file](https://github.com/apache/drill/blob/master/exec/java-exec/src/test/resources/avro/map_string_to_long.avro)
+and place it to the `/tmp/` folder to follow the example below.

Review comment:
To follow along with this example, download ... to your `/tmp` directory.
[GitHub] [drill] paul-rogers commented on a change in pull request #2030: Update docs for Metastore to point that all format plugins are supported
paul-rogers commented on a change in pull request #2030: Update docs for Metastore to point that all format plugins are supported
URL: https://github.com/apache/drill/pull/2030#discussion_r394553419

## File path: _docs/performance-tuning/drill-metastore/010-using-drill-metastore.md ##
@@ -442,3 +455,114 @@ apache drill (information_schema)> SELECT * FROM INFORMATION_SCHEMA.`COLUMNS` WH
 +---+--++-+--++-+---+--++---+-+---++---++-+---+---+-+-+---+---+---+
 17 rows selected (0.183 seconds) ```
+
+### Provisioning schema for Drill Metastore
+
+ Directory and File Setup
+
+Set up storage plugin for desired file system, as described here:
+ [Connecting Drill to a File System]({{site.baseurl}}/docs/file-system-storage-plugin/#connecting-drill-to-a-file-system).

Review comment:
Ensure you have configured the file system storage plugin as described here: ...
[GitHub] [drill] paul-rogers commented on a change in pull request #2030: Update docs for Metastore to point that all format plugins are supported
paul-rogers commented on a change in pull request #2030: Update docs for Metastore to point that all format plugins are supported
URL: https://github.com/apache/drill/pull/2030#discussion_r394576664

## File path: _docs/sql-reference/sql-commands/021-create-schema.md ##
@@ -604,8 +608,32 @@ STATEMENT format displays the schema in a form compatible with the CREATE OR REP
 | +--+
+### Altering Schema for a Table
+Table schema may be updated using the `ALTER SCHEMA` commands.
+
+The syntax for the command to add (or replace) columns / properties is the following:
+
+ALTER SCHEMA
+(FOR TABLE dfs.tmp.nation | PATH '/tmp/schema.json')
+ADD [OR REPLACE]
+[COLUMNS (col1 int, col2 varchar)]
+[PROPERTIES ('prop1'='val1', 'prop2'='val2')]
+
+Add command will fail if column or property with the same name exists, unless `OR REPLACE` keywords are indicated.

Review comment:
`ALTER SCHEMA` modifies an existing schema file; it will fail if the schema file does not exist. (Use `CREATE SCHEMA` to create a new schema file.)

To prevent accidental changes, the `ALTER SCHEMA ... ADD` command will fail if the requested column or property already exists. Use the `OR REPLACE` clause to modify an existing column or property.
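The difference between `ADD` and `ADD OR REPLACE` described above might be shown with a pair of statements such as the following. This is a sketch built only from the syntax quoted in the diff; the table `dfs.tmp.nation`, the columns `col1` and `col2`, and the properties are the placeholder names used there, and a schema file for the table is assumed to exist already.

```sql
-- Plain ADD: fails if `col1` is already defined in the schema,
-- and also fails if the table has no schema file yet.
ALTER SCHEMA
FOR TABLE dfs.tmp.nation
ADD
COLUMNS (col1 int);

-- ADD OR REPLACE: succeeds whether or not the columns or properties
-- already exist; existing definitions are overwritten.
ALTER SCHEMA
FOR TABLE dfs.tmp.nation
ADD OR REPLACE
COLUMNS (col1 int, col2 varchar)
PROPERTIES ('prop1'='val1', 'prop2'='val2');
```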
[GitHub] [drill] paul-rogers commented on a change in pull request #2030: Update docs for Metastore to point that all format plugins are supported
paul-rogers commented on a change in pull request #2030: Update docs for Metastore to point that all format plugins are supported
URL: https://github.com/apache/drill/pull/2030#discussion_r394559182

## File path: _docs/sql-reference/sql-commands/007-analyze-table-refresh-metadata.md ##
@@ -34,10 +34,16 @@ The name of the table or directory for which Drill will collect table metadata.
 Table function parameters. This syntax is only available since Drill 1.18.
 Example of table function parameters usage:
-table(dfs.`table_name` (type => 'parquet', autoCorrectCorruptDates => true))
+ table(dfs.tmp.`text_nation` (type=>'text', fieldDelimiter=>',', extractHeader=>true,
+schema=>'inline=(
+`n_nationkey` INT not null,
+`n_name` VARCHAR not null,
+`n_regionkey` INT not null,
+`n_comment` VARCHAR not null)'
+))
 For detailed information, please refer to
- [Using the Formats Attributes as Table Function Parameters]({{site.baseurl}}/docs/plugin-configuration-basics/#using-the-formats-attributes-as-table-function-parameters)
+ [Specifying the Schema as Table Function Parameter]({{site.baseurl}}/docs/plugin-configuration-basics/#specifying-the-schema-as-table-function-parameter)

Review comment:
Please refer to ... for the details.
[GitHub] [drill] paul-rogers commented on a change in pull request #2030: Update docs for Metastore to point that all format plugins are supported
paul-rogers commented on a change in pull request #2030: Update docs for Metastore to point that all format plugins are supported
URL: https://github.com/apache/drill/pull/2030#discussion_r394556426

## File path: _docs/query-data/query-a-file-system/009-querying-avro-files.md ##
@@ -3,5 +3,30 @@ title: "Querying Avro Files"
 date: 2019-04-16
 parent: "Querying a File System"
 ---
-
-The Avro format is experimental at this time. There are known issues when querying Avro files.
+
+Drill provides functionality to query [Avro](https://avro.apache.org/) files.

Review comment:
Drill supports files in the [Avro](https://avro.apache.org/) format.

Starting from Drill 1.18, the Avro format supports the [Schema provisioning]({{site.baseurl}}/docs/create-or-replace-schema/#usage-notes) feature.
[GitHub] [drill] paul-rogers commented on a change in pull request #2030: Update docs for Metastore to point that all format plugins are supported
paul-rogers commented on a change in pull request #2030: Update docs for Metastore to point that all format plugins are supported
URL: https://github.com/apache/drill/pull/2030#discussion_r394576942

## File path: _docs/sql-reference/sql-commands/021-create-schema.md ##
@@ -604,8 +608,32 @@ STATEMENT format displays the schema in a form compatible with the CREATE OR REP
 | +--+
+### Altering Schema for a Table
+Table schema may be updated using the `ALTER SCHEMA` commands.
+
+The syntax for the command to add (or replace) columns / properties is the following:
+
+ALTER SCHEMA
+(FOR TABLE dfs.tmp.nation | PATH '/tmp/schema.json')
+ADD [OR REPLACE]
+[COLUMNS (col1 int, col2 varchar)]
+[PROPERTIES ('prop1'='val1', 'prop2'='val2')]
+
+Add command will fail if column or property with the same name exists, unless `OR REPLACE` keywords are indicated.
+Add command will fail, if the schema file does not exist.
+
+The syntax for the command to remove columns / properties is the following:

Review comment:
You can remove columns or properties with the `ALTER SCHEMA ... REMOVE` command:
[GitHub] [drill] paul-rogers commented on a change in pull request #2030: Update docs for Metastore to point that all format plugins are supported
paul-rogers commented on a change in pull request #2030: Update docs for Metastore to point that all format plugins are supported
URL: https://github.com/apache/drill/pull/2030#discussion_r394558865

## File path: _docs/query-data/query-a-file-system/009-querying-avro-files.md ##
@@ -3,5 +3,30 @@ title: "Querying Avro Files"
 date: 2019-04-16
 parent: "Querying a File System"
 ---
-
-The Avro format is experimental at this time. There are known issues when querying Avro files.
+
+Drill provides functionality to query [Avro](https://avro.apache.org/) files.
+
+Starting from Drill 1.18, Avro file format supports [Schema provisioning]({{site.baseurl}}/docs/create-or-replace-schema/#usage-notes) feature.
+
+ Preparing example data
+
+Download the following [sample data file](https://github.com/apache/drill/blob/master/exec/java-exec/src/test/resources/avro/map_string_to_long.avro)
+and place it to the `/tmp/` folder to follow the example below.
+
+ Selecting data from Avro files
+
+To view the data in the `map_string_to_long.avro` file, issue the following query:

Review comment:
We can query all data from the `map_string_to_long.avro` file to see (what?)

(Are we showing schema provisioning? Where did we create the schema file? Suggestion: show the file without a schema file. Identify the problem we want to fix. Create the schema file and do the query again, showing how we fixed the problem. As it is, as I tried to reword the sentence, I realized I'm not entirely clear what we're showing.)
[GitHub] [drill] paul-rogers commented on a change in pull request #2030: Update docs for Metastore to point that all format plugins are supported
paul-rogers commented on a change in pull request #2030: Update docs for Metastore to point that all format plugins are supported
URL: https://github.com/apache/drill/pull/2030#discussion_r394577436

## File path: _docs/sql-reference/sql-commands/021-create-schema.md ##
@@ -604,8 +608,32 @@ STATEMENT format displays the schema in a form compatible with the CREATE OR REP
 | +--+
+### Altering Schema for a Table
+Table schema may be updated using the `ALTER SCHEMA` commands.
+
+The syntax for the command to add (or replace) columns / properties is the following:
+
+ALTER SCHEMA
+(FOR TABLE dfs.tmp.nation | PATH '/tmp/schema.json')
+ADD [OR REPLACE]
+[COLUMNS (col1 int, col2 varchar)]
+[PROPERTIES ('prop1'='val1', 'prop2'='val2')]
+
+Add command will fail if column or property with the same name exists, unless `OR REPLACE` keywords are indicated.
+Add command will fail, if the schema file does not exist.
+
+The syntax for the command to remove columns / properties is the following:
+
+ALTER SCHEMA
+(FOR TABLE dfs.tmp.nation | PATH '/tmp/schema.json')
+REMOVE
+[COLUMNS (col1 int, col2 varchar)]
+[PROPERTIES ('prop1'='val1', 'prop2'='val2')]
+
+Remove command won't fail if the column or property does not exist but will fail if the schema file is absent.

Review comment:
The command fails if the schema file does not exist. The command silently ignores a request to remove a column or property which does not exist.
[GitHub] [drill] paul-rogers commented on a change in pull request #2030: Update docs for Metastore to point that all format plugins are supported
paul-rogers commented on a change in pull request #2030: Update docs for Metastore to point that all format plugins are supported URL: https://github.com/apache/drill/pull/2030#discussion_r394552697 ## File path: _docs/performance-tuning/drill-metastore/010-using-drill-metastore.md ## @@ -117,6 +119,14 @@ Drill can connect to any number of data sources, each of which may have its own As a result, the Metastore labels tables with a combination of (plugin configuration name, workspace name, table name). Note that if before renaming any of these items, you must delete table's Metadata entry and recreate it after renaming. +### Using schema provisioning feature with Drill Metastore + +Drill Metastore allows specifying schema using the same syntax as + [Schema provisioning]({{site.baseurl}}/docs/plugin-configuration-basics/#specifying-the-schema-as-table-function-parameter) feature when used as a table function. +User can specify table schema in the `ANALYZE` command, so it will be used for collecting table statistics and will be stored + to Drill Metastore to be used when submitting queries for this table similar to the case when user specifies schema + explicitly in the table function. Review comment: The Drill Metastore holds both schema and statistics information for a table. The `ANALYZE` command can infer the table schema for well-defined tables (such as many Parquet tables). Some tables are too complex or variable for Drill's schema inference to work well. For example, JSON tables often omit fields or have long runs of nulls so that Drill cannot determine column types. In these cases you can specify the correct schema based on your knowledge of the table's structure. You specify a schema in the `ANALYZE` command using the [Schema provisioning]({{site.baseurl}}/docs/plugin-configuration-basics/#specifying-the-schema-as-table-function-parameter) syntax. (Please provide an example.)
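One possible shape for the requested example is sketched below. It assumes a CSV table `dfs.tmp.text_nation` with a header row (the table created elsewhere in this PR) and the four TPC-H `nation` columns; the column list and the exact combination of table function parameters are assumptions and should be checked against the final docs.

```sql
-- The Metastore is disabled by default; enable it for the session first.
SET `metastore.enabled` = true;

-- Provide the schema inline through the table function while collecting metadata.
-- Column names and types below are assumed from the TPC-H nation table.
ANALYZE TABLE table(dfs.tmp.`text_nation`(
  type => 'text', fieldDelimiter => ',', extractHeader => true,
  schema => 'inline=(
    `n_nationkey` INT not null,
    `n_name` VARCHAR not null,
    `n_regionkey` INT not null,
    `n_comment` VARCHAR not null)'
)) REFRESH METADATA;
```

The schema supplied here is what the Metastore would store, so later queries on the table would not need to repeat it in a table function.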
[GitHub] [drill] paul-rogers commented on a change in pull request #2030: Update docs for Metastore to point that all format plugins are supported
paul-rogers commented on a change in pull request #2030: Update docs for Metastore to point that all format plugins are supported URL: https://github.com/apache/drill/pull/2030#discussion_r394537808 ## File path: _docs/connect-a-data-source/035-plugin-configuration-basics.md ## @@ -147,6 +149,45 @@ fieldDelimiter => ',', extractHeader => true))`` For more information about format plugin configuration see ["Text Files: CSV, TSV, PSV"]({{site.baseurl}}{{site.baseurl}}/docs/text-files-csv-tsv-psv/). +## Specifying the Schema as Table Function Parameter + +Starting from Drill 1.17, table schema may be indicated in the query using table function. + +It is useful when the user does not want to persist schema in table root location or when reading from file, not folder. +Schema parameter can be used as an individual unit or together with format plugin table properties. + +Schema can be provided in the `SCHEMA` property inline or using the file. + +The syntax for inline schema is similar to the [CREATE OR REPLACE SCHEMA]({{site.baseurl}}/docs/create-or-replace-schema/#syntax): + +``` +SELECT a, b FROM TABLE (table_name( +SCHEMA => 'inline=(column_name data_type [nullability] [format] [default] [properties {prop='val', ...})]')) +``` + +Example of usage: + +``` +select * from table(dfs.tmp.`text_table`( +schema => 'inline=(col1 date properties {`drill.format` = `-MM-dd`}) +properties {`drill.strict` = `false`}')) +``` + +The syntax for indicating schema using the path: + +``` +select * from table(dfs.tmp.`text_table`(schema => 'path=`/tmp/my_schema`')) +``` + +The following example demonstrates applying provided schema alongside with format plugin table function parameters. +Assuming that the user has CSV file with headers with extension that does not comply to a default text file with headers extension (ex: `cars.csvh-test`): Review comment: Suppose that you have a CSV file with headers and with a custom extension: `csvh-test`. 
You can combine the schema with format plugin properties:
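A sketch of what such a combined query might look like, using the `cars.csvh-test` file name from the quoted diff. The `dfs.tmp` location and the `col1` column are hypothetical, chosen only to make the example complete.

```sql
-- Format properties tell Drill to read the file as comma-delimited text
-- with a header row; the schema property types col1 as a date.
SELECT *
FROM table(dfs.tmp.`cars.csvh-test`(
  type => 'text', fieldDelimiter => ',', extractHeader => true,
  schema => 'inline=(col1 date)'));
```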
[GitHub] [drill] paul-rogers commented on a change in pull request #2030: Update docs for Metastore to point that all format plugins are supported
paul-rogers commented on a change in pull request #2030: Update docs for Metastore to point that all format plugins are supported URL: https://github.com/apache/drill/pull/2030#discussion_r394544903 ## File path: _docs/connect-a-data-source/plugins/114-image-metadata-format-plugin.md ## @@ -49,54 +49,53 @@ fileSystemMetadata|true|Set to true to extract filesystem metadata including the descriptive|true|Set to true to extract metadata in a human-readable string format. Set false to extract metadata in a machine-readable typed format. timeZone|null|Specify the time zone to interpret the timestamp with no time zone information. If the timestamp includes the time zone information, this value is ignored. If null is set, the local time zone is used. -##Examples +## Examples + +Download the following image and place it to the `/tmp` folder to follow the examples. Review comment: To follow along with the examples, start by downloading the following image to your `/tmp` directory. (The documentation seems more friendly and approachable if we address the user directly and avoid the passive voice.)
[GitHub] [drill] paul-rogers commented on a change in pull request #2030: Update docs for Metastore to point that all format plugins are supported
paul-rogers commented on a change in pull request #2030: Update docs for Metastore to point that all format plugins are supported URL: https://github.com/apache/drill/pull/2030#discussion_r394548649 ## File path: _docs/performance-tuning/drill-metastore/010-using-drill-metastore.md ## @@ -1,14 +1,16 @@ --- title: "Using Drill Metastore" parent: "Drill Metastore" -date: 2020-03-03 +date: 2020-03-17 --- Drill 1.17 introduces the Drill Metastore which stores the table schema and table statistics. Statistics allow Drill to better create optimal query plans. The Metastore is a Beta feature; it is subject to change. We encourage you to try it and provide feedback. Because the Metastore is in Beta, the SQL commands and Metastore formats may change in the next release. -{% include startnote.html %}In Drill 1.17, this feature is supported for Parquet tables only and is disabled by default.{% include endnote.html %} +{% include startnote.html %}In Drill 1.17, this feature is supported for Parquet tables only and is disabled by default. +Starting from Drill 1.18, this feature is supported for all **format** plugins except for MaprDB. +{% include endnote.html %} Review comment: The Metastore is a beta feature and is subject to change. In particular, the SQL commands and Metastore format may change based on your experience and feedback. * In Drill 1.17, Metastore supports only tables in Parquet format. The feature is disabled by default. * In Drill 1.18, Metastore supports all format plugins (except MaprDB) for the file system plugin. The feature is still disabled by default.
[GitHub] [drill] paul-rogers commented on a change in pull request #2030: Update docs for Metastore to point that all format plugins are supported
paul-rogers commented on a change in pull request #2030: Update docs for Metastore to point that all format plugins are supported URL: https://github.com/apache/drill/pull/2030#discussion_r394536560 ## File path: _docs/connect-a-data-source/035-plugin-configuration-basics.md ## @@ -147,6 +149,45 @@ fieldDelimiter => ',', extractHeader => true))`` For more information about format plugin configuration see ["Text Files: CSV, TSV, PSV"]({{site.baseurl}}{{site.baseurl}}/docs/text-files-csv-tsv-psv/). +## Specifying the Schema as Table Function Parameter + +Starting from Drill 1.17, table schema may be indicated in the query using table function. + +It is useful when the user does not want to persist schema in table root location or when reading from file, not folder. +Schema parameter can be used as an individual unit or together with format plugin table properties. + +Schema can be provided in the `SCHEMA` property inline or using the file. + +The syntax for inline schema is similar to the [CREATE OR REPLACE SCHEMA]({{site.baseurl}}/docs/create-or-replace-schema/#syntax): + +``` +SELECT a, b FROM TABLE (table_name( +SCHEMA => 'inline=(column_name data_type [nullability] [format] [default] [properties {prop='val', ...})]')) +``` + +Example of usage: + +``` +select * from table(dfs.tmp.`text_table`( +schema => 'inline=(col1 date properties {`drill.format` = `-MM-dd`}) +properties {`drill.strict` = `false`}')) +``` + +The syntax for indicating schema using the path: Review comment: Alternatively, you can also specify the path to a schema file. For example:
[GitHub] [drill] paul-rogers commented on a change in pull request #2030: Update docs for Metastore to point that all format plugins are supported
paul-rogers commented on a change in pull request #2030: Update docs for Metastore to point that all format plugins are supported URL: https://github.com/apache/drill/pull/2030#discussion_r394544334 ## File path: _docs/connect-a-data-source/plugins/080-rdbms-storage-plugin.md ## @@ -142,4 +153,20 @@ You may need to qualify a table name with a schema name for Drill to return data | 2 | 1.2.3.5 | +---+--+ +### Example of Postgres Configuration with `sourceParameters` configuration property +{ + type: "jdbc", + enabled: true, + driver: "org.postgresql.Driver", + url:"jdbc:postgresql://1.2.3.4/mydatabase?defaultRowFetchSize=2", + username:"user", + password:"password", Review comment: (Nit: for display formatting, please include a space after the colon.)
[GitHub] [drill] paul-rogers commented on a change in pull request #2030: Update docs for Metastore to point that all format plugins are supported
paul-rogers commented on a change in pull request #2030: Update docs for Metastore to point that all format plugins are supported URL: https://github.com/apache/drill/pull/2030#discussion_r394553793 ## File path: _docs/performance-tuning/drill-metastore/010-using-drill-metastore.md ## @@ -442,3 +455,114 @@ apache drill (information_schema)> SELECT * FROM INFORMATION_SCHEMA.`COLUMNS` WH +---+--++-+--++-+---+--++---+-+---++---++-+---+---+-+-+---+---+---+ 17 rows selected (0.183 seconds) ``` + +### Provisioning schema for Drill Metastore + + Directory and File Setup + +Set up storage plugin for desired file system, as described here: + [Connecting Drill to a File System]({{site.baseurl}}/docs/file-system-storage-plugin/#connecting-drill-to-a-file-system). + +Set `store.format` to `csvh`: + +``` +SET `store.format`='csvh'; ++--+---+ +| ok |summary| ++--+---+ +| true | store.format updated. | ++--+---+ +``` + +Create text table based on the sample `/tpch/nation.parquet` table from `cp` plugin: Review comment: Create a text table...
[GitHub] [drill] paul-rogers commented on a change in pull request #2030: Update docs for Metastore to point that all format plugins are supported
paul-rogers commented on a change in pull request #2030: Update docs for Metastore to point that all format plugins are supported URL: https://github.com/apache/drill/pull/2030#discussion_r394538417 ## File path: _docs/connect-a-data-source/plugins/080-rdbms-storage-plugin.md ## @@ -9,14 +9,25 @@ As with any source, Drill supports joins within and between all systems. Drill a ## Using the RDBMS Storage Plugin -Drill is designed to work with any relational datastore that provides a JDBC driver. Drill is actively tested with Postgres, MySQL, Oracle, MSSQL and Apache Derby. For each system, you will follow three basic steps for setup: +Drill is designed to work with any relational datastore that provides a JDBC driver. Drill is actively tested with + Postgres, MySQL, Oracle, MSSQL, Apache Derby and H2. For each system, you will follow three basic steps for setup: 1. [Install Drill]({{ site.baseurl }}/docs/installing-drill-in-embedded-mode), if you do not already have it installed. 2. Copy your database's JDBC driver into the jars/3rdparty directory. (You'll need to do this on every node.) 3. Restart Drill. See [Starting Drill in Distributed Mode]({{site.baseurl}}/docs/starting-drill-in-distributed-mode/). - 4. Add a new storage configuration to Drill through the Web UI. Example configurations for [Oracle](#Example-Oracle-Configuration), [SQL Server](#Example-SQL-Server-Configuration), [MySQL](#Example-MySQL-Configuration) and [Postgres](#Example-Postgres-Configuration) are provided below. - -**Example: Working with MySQL** + 4. Add a new storage configuration to Drill through the Web UI. Example configurations for [Oracle](#example-oracle-configuration), [SQL Server](#example-sql-server-configuration), [MySQL](#example-mysql-configuration) and [Postgres](#example-postgres-configuration) are provided below. 
+ +## Setting data source parameters in the storage plugin configuration + +Starting from Drill 1.18.0, new JDBC storage plugin configuration property `sourceParameters` was introduced to allow Review comment: Drill's JDBC storage plugin configuration allows you to specify database parameters as JSON key/value pairs. Drill 1.18 introduced a new JDBC storage plugin property called `sourceParameters` to handle query parameter names which are not valid JSON identifiers. See [HikariCP](https://github.com/brettwooldridge/HikariCP#configuration-knobs-baby) for details. See [Example of Postgres Configuration with `sourceParameters` configuration property](#example-of-postgres-configuration-with-sourceparameters-configuration-property) for an example. (Note: please specify which parameters we're talking about. I made up the "database parameter" part; please replace with an accurate description.)
[GitHub] [drill] paul-rogers commented on a change in pull request #2030: Update docs for Metastore to point that all format plugins are supported
paul-rogers commented on a change in pull request #2030: Update docs for Metastore to point that all format plugins are supported URL: https://github.com/apache/drill/pull/2030#discussion_r394535049 ## File path: _docs/connect-a-data-source/035-plugin-configuration-basics.md ## @@ -147,6 +149,45 @@ fieldDelimiter => ',', extractHeader => true))`` For more information about format plugin configuration see ["Text Files: CSV, TSV, PSV"]({{site.baseurl}}{{site.baseurl}}/docs/text-files-csv-tsv-psv/). +## Specifying the Schema as Table Function Parameter + +Starting from Drill 1.17, table schema may be indicated in the query using table function. + +It is useful when the user does not want to persist schema in table root location or when reading from file, not folder. +Schema parameter can be used as an individual unit or together with format plugin table properties. Review comment: (Combine four paragraphs.) Table schemas normally reside in the root folder of each table. You can also specify a schema for an individual query using a table function and specifying the `SCHEMA` property. You can combine the schema with format plugin properties. The syntax is similar...
[GitHub] [drill] paul-rogers commented on a change in pull request #2030: Update docs for Metastore to point that all format plugins are supported
paul-rogers commented on a change in pull request #2030: Update docs for Metastore to point that all format plugins are supported URL: https://github.com/apache/drill/pull/2030#discussion_r394536724 ## File path: _docs/connect-a-data-source/035-plugin-configuration-basics.md ## @@ -147,6 +149,45 @@ fieldDelimiter => ',', extractHeader => true))`` For more information about format plugin configuration see ["Text Files: CSV, TSV, PSV"]({{site.baseurl}}{{site.baseurl}}/docs/text-files-csv-tsv-psv/). +## Specifying the Schema as Table Function Parameter + +Starting from Drill 1.17, table schema may be indicated in the query using table function. + +It is useful when the user does not want to persist schema in table root location or when reading from file, not folder. +Schema parameter can be used as an individual unit or together with format plugin table properties. + +Schema can be provided in the `SCHEMA` property inline or using the file. + +The syntax for inline schema is similar to the [CREATE OR REPLACE SCHEMA]({{site.baseurl}}/docs/create-or-replace-schema/#syntax): + +``` +SELECT a, b FROM TABLE (table_name( +SCHEMA => 'inline=(column_name data_type [nullability] [format] [default] [properties {prop='val', ...})]')) +``` + +Example of usage: Review comment: You can specify the schema inline within the query. For example:
[GitHub] [drill] paul-rogers commented on a change in pull request #2030: Update docs for Metastore to point that all format plugins are supported
paul-rogers commented on a change in pull request #2030: Update docs for Metastore to point that all format plugins are supported URL: https://github.com/apache/drill/pull/2030#discussion_r394555748 ## File path: _docs/performance-tuning/drill-metastore/010-using-drill-metastore.md ## @@ -442,3 +455,114 @@ apache drill (information_schema)> SELECT * FROM INFORMATION_SCHEMA.`COLUMNS` WH +---+--++-+--++-+---+--++---+-+---++---++-+---+---+-+-+---+---+---+ 17 rows selected (0.183 seconds) ``` + +### Provisioning schema for Drill Metastore + + Directory and File Setup + +Set up storage plugin for desired file system, as described here: + [Connecting Drill to a File System]({{site.baseurl}}/docs/file-system-storage-plugin/#connecting-drill-to-a-file-system). + +Set `store.format` to `csvh`: + +``` +SET `store.format`='csvh'; ++--+---+ +| ok |summary| ++--+---+ +| true | store.format updated. | ++--+---+ +``` + +Create text table based on the sample `/tpch/nation.parquet` table from `cp` plugin: + +``` +create table dfs.tmp.text_nation as (select * from cp.`/tpch/nation.parquet`); ++--+---+ +| Fragment | Number of records written | ++--+---+ +| 0_0 | 25| ++--+---+ +``` + +Query the table `text_nation`: + +``` +SELECT count(*) FROM dfs.tmp.`text_nation`; +++ +| EXPR$0 | +++ +| 25 | +++ +``` Review comment: (Suggestion: since we are applying a schema, show the original types using the clunky `typeof()` functions. This will show that the columns start as `VARCHAR`, but that applying the schema gives them more useful types. Otherwise, I think the point may be lost on most users. And, yes, we should have a `DESCRIBE TABLE` to do the job instead of `SELECT typeof(n_nationkey), typeof(...`)
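The `typeof()` check suggested above might be written as follows. This is a sketch: only `n_nationkey` is named in the comment, and the other three column names are assumed from the TPC-H `nation` table that `text_nation` was created from.

```sql
-- Show the type Drill assigns to each column of the text table.
-- Column names other than n_nationkey are assumed from TPC-H nation.
SELECT typeof(n_nationkey) AS nationkey_type,
       typeof(n_name)      AS name_type,
       typeof(n_regionkey) AS regionkey_type,
       typeof(n_comment)   AS comment_type
FROM dfs.tmp.`text_nation`
LIMIT 1;
```

Running the same query before and after the schema is provisioned would show the change from the `VARCHAR` columns the reviewer expects to the types given in the schema.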
[GitHub] [drill] paul-rogers commented on a change in pull request #2030: Update docs for Metastore to point that all format plugins are supported
paul-rogers commented on a change in pull request #2030: Update docs for Metastore to point that all format plugins are supported URL: https://github.com/apache/drill/pull/2030#discussion_r394545980 ## File path: _docs/connect-a-data-source/plugins/114-image-metadata-format-plugin.md ## @@ -49,54 +49,53 @@ fileSystemMetadata|true|Set to true to extract filesystem metadata including the descriptive|true|Set to true to extract metadata in a human-readable string format. Set false to extract metadata in a machine-readable typed format. timeZone|null|Specify the time zone to interpret the timestamp with no time zone information. If the timestamp includes the time zone information, this value is ignored. If null is set, the local time zone is used. -##Examples +## Examples + +Download the following image and place it to the `/tmp` folder to follow the examples. + +[![image]({{ site.baseurl }}/images/7671b34d6e8a4d050f75278f10f1a08.jpg)]({{ site.baseurl }}/images/7671b34d6e8a4d050f75278f10f1a08.jpg) A Drill query on a JPEG file with the property descriptive: true - 0: jdbc:drill:zk=local> select FileName, * from dfs.`4349313028_f69ffa0257_o.jpg`; - +--+--+--+++-+--+--+---++---+--+--++---++-+-+--+--+--++--+-+---+---+--+-+--+ - | FileName | FileSize | FileDateTime | Format | PixelWidth | PixelHeight | BitsPerPixel | DPIWidth | DPIHeight | Orientaion | ColorMode | HasAlpha | Duration | VideoCodec | FrameRate | AudioCodec | AudioSampleSize | AudioSampleRate | JPEG | JFIF | ExifIFD0 | ExifSubIFD | Interoperability | GPS | ExifThumbnail | Photoshop | IPTC | Huffman | FileType | - +--+--+--+++-+--+--+---++---+--+--++---++-+-+--+--+--++--+-+---+---+--+-+--+ - | 4349313028_f69ffa0257_o.jpg | 257213 bytes | Fri Mar 09 12:09:34 +08:00 2018 | JPEG | 1199 | 800 | 24 | 96 | 96 | Unknown (0) | RGB | false | 00:00:00 | Unknown | 0 | Unknown | 0 | 0 | {"CompressionType":"Baseline","DataPrecision":"8 bits","ImageHeight":"800 pixels","ImageWidth":"1199 
pixels","NumberOfComponents":"3","Component1":"Y component: Quantization table 0, Sampling factors 2 horiz/2 vert","Component2":"Cb component: Quantization table 1, Sampling factors 1 horiz/1 vert","Component3":"Cr component: Quantization table 1, Sampling factors 1 horiz/1 vert"} | {"Version":"1.1","ResolutionUnits":"inch","XResolution":"96 dots","YResolution":"96 dots","ThumbnailWidthPixels":"0","ThumbnailHeightPixels":"0"} | {"Software":"Picasa 3.0"} | {"ExifVersion":"2.10","UniqueImageID":"d65e93b836d15a0c5e041e6b7258c76e"} | {"InteroperabilityIndex":"Unknown ()","InteroperabilityVersion":"1.00"} | {"GPSVersionID":".022","GPSLatitudeRef":"N","GPSLatitude":"47° 32' 15.98\"","GPSLongitudeRef":"W","GPSLongitude":"-122° 2' 6.37\"","GPSAltitudeRef":"Sea level","GPSAltitude":"0 metres"} | {"Compression":"JPEG (old-style)","XResolution":"72 dots per inch","YResolution":"72 dots per inch","ResolutionUnit":"Inch","ThumbnailOffset":"414 bytes","ThumbnailLength":"7213 bytes"} | {} | {"Keywords":"135;2002;issaquah;police car;wa;washington"} | {"NumberOfTables":"4 Huffman tables"} | {"DetectedFileTypeName":"JPEG","DetectedFileTypeLongName":"Joint Photographic Experts Group","DetectedMIMEType":"image/jpeg","ExpectedFileNameExtension":"jpg"} | - +--+--+--+++-+--+--+---++---+--+--++---++-+-+--+--+--++--+-+---+---+--+-+--+ - +select FileName, * from dfs.tmp.`7671b34d6e8a4d050f75278f10f1a08.jpg`; +
[GitHub] [drill] paul-rogers commented on a change in pull request #2030: Update docs for Metastore to point that all format plugins are supported
paul-rogers commented on a change in pull request #2030: Update docs for Metastore to point that all format plugins are supported URL: https://github.com/apache/drill/pull/2030#discussion_r394572017 ## File path: _docs/sql-reference/sql-commands/021-create-schema.md ## @@ -604,8 +608,32 @@ STATEMENT format displays the schema in a form compatible with the CREATE OR REP | +--+ +### Altering Schema for a Table +Table schema may be updated using the `ALTER SCHEMA` commands. Review comment: Use the `ALTER SCHEMA` command to update your table schema. The command can add or replace columns. Or, it can update properties for the table or individual columns. Syntax: ((Replaces next sentence also.))
[GitHub] [drill] paul-rogers commented on a change in pull request #2030: Update docs for Metastore to point that all format plugins are supported
paul-rogers commented on a change in pull request #2030: Update docs for Metastore to point that all format plugins are supported URL: https://github.com/apache/drill/pull/2030#discussion_r394543770 ## File path: _docs/connect-a-data-source/plugins/080-rdbms-storage-plugin.md ## @@ -101,9 +112,9 @@ For MySQL, Drill has been tested with MySQL's [mysql-connector-java-5.1.37-bin.j password:"password" } -**Example Postgres Configuration** +### Example Postgres Configuration -For Postgres, Drill has been tested with Postgres's [9.1-901-1.jdbc4](http://central.maven.org/maven2/org/postgresql/postgresql/) driver (any recent driver should work). Copy this driver file to all nodes. +For Postgres, Drill has been tested with Postgres's [42.2.11](https://mvnrepository.com/artifact/org.postgresql/postgresql) driver (any recent driver should work). Copy this driver file to all nodes. Review comment: Drill is tested with the Postgres driver version ... Copy this driver jar (?) to the (which?) folder on all nodes.