[ 
https://issues.apache.org/jira/browse/DRILL-6965?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Arina Ielchiieva updated DRILL-6965:
------------------------------------
    Labels: doc-impacting  (was: )

> Adjust table function usage for all storage plugins and implement schema 
> parameter
> ----------------------------------------------------------------------------------
>
>                 Key: DRILL-6965
>                 URL: https://issues.apache.org/jira/browse/DRILL-6965
>             Project: Apache Drill
>          Issue Type: Sub-task
>            Reporter: Arina Ielchiieva
>            Assignee: Arina Ielchiieva
>            Priority: Major
>              Labels: doc-impacting
>             Fix For: 1.17.0
>
>
> A schema can be applied when reading a table in two ways:
> a. the schema is created in the table root folder using the CREATE SCHEMA 
> command and the schema usage option is enabled;
> b. the schema is indicated in a table function.
> This Jira implements point b.
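> For comparison, point a might look roughly like the sketch below. The 
> option name {{store.table.use_schema_file}} and the FOR TABLE clause are 
> recalled from the Drill schema provisioning documentation, not taken from 
> this Jira, so verify them against the Drill version in use:
> {noformat}
> -- Sketch only: persist a schema file in the table root folder (point a).
> create or replace schema
> (col1 date)
> for table dfs.tmp.`text_table`;
> -- Assumed name of the option that enables schema file usage.
> set `store.table.use_schema_file` = true;
> -- The stored schema is then applied without a table function.
> select * from dfs.tmp.`text_table`;
> {noformat}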
> Indicating the schema in a table function is useful when the user does not 
> want to persist the schema in the table root location, or when reading from 
> a file rather than a folder.
> The schema parameter can be used on its own or together with format plugin 
> table properties.
> Usage examples:
> Pre-requisites: 
> V3 reader must be enabled: {{set `exec.storage.enable_v3_text_reader` = 
> true;}}
> Query examples:
> 1. There is a folder with files or just one file (ex: dfs.tmp.text_table) and 
> the user wants to apply a schema to it:
> a. indicate schema inline:
> {noformat}
> select * from table(dfs.tmp.`text_table`(schema => 'inline=(col1 date 
> properties {`drill.format` = `yyyy-MM-dd`} properties {`drill.strict` = 
> `false`})'))
> {noformat}
> b. indicate schema using path:
> First, a schema is created in some location using the CREATE SCHEMA command. 
> For example:
> {noformat}
> create schema 
> (col int)
> path '/tmp/my_schema'
> {noformat}
> Now the user wants to apply this schema in a table function:
> {noformat}
> select * from table(dfs.tmp.`text_table`(schema => 'path=`/tmp/my_schema`'))
> {noformat}
> 2. The user wants to apply a schema alongside format plugin table function 
> parameters.
> Assume the user has a CSV file with headers whose extension does not match 
> the default extension for text files with headers (ex: cars.csvh-test):
> {noformat}
> select * from table(dfs.tmp.`cars.csvh-test`(type => 'text', fieldDelimiter 
> => ',', extractHeader => true, schema => 'inline=(col1 date)'))
> {noformat}
> More details about the syntax can be found in the design document:
> https://docs.google.com/document/d/1mp4egSbNs8jFYRbPVbm_l0Y5GjH3HnoqCmOpMTR_g4w/edit



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
