[ 
https://issues.apache.org/jira/browse/DRILL-6965?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Volodymyr Vysotskyi updated DRILL-6965:
---------------------------------------
    Labels: doc-impacting ready-to-commit  (was: doc-impacting)

> Adjust table function usage for all storage plugins and implement schema 
> parameter
> ----------------------------------------------------------------------------------
>
>                 Key: DRILL-6965
>                 URL: https://issues.apache.org/jira/browse/DRILL-6965
>             Project: Apache Drill
>          Issue Type: Sub-task
>            Reporter: Arina Ielchiieva
>            Assignee: Arina Ielchiieva
>            Priority: Major
>              Labels: doc-impacting, ready-to-commit
>             Fix For: 1.17.0
>
>
> A schema can be applied while reading a table in two ways:
>  a. the schema is created in the table root folder using the CREATE SCHEMA 
> command, and schema usage is enabled;
>  b. the schema is indicated in a table function.
>  This Jira implements point b.
> Indicating the schema through a table function is useful when the user does 
> not want to persist the schema in the table root location, or when reading 
> from a single file rather than a folder.
> The schema parameter can be used on its own or together with format plugin 
> table properties.
> Usage examples:
> Prerequisites: 
>  The V3 text reader must be enabled:
>  {{set `exec.storage.enable_v3_text_reader` = true;}}
> Query examples:
> 1. There is a folder with files, or just one file (ex: dfs.tmp.text_table), 
> and the user wants to apply a schema to them:
>  a. indicate the schema inline:
> {noformat}
> select * from table(dfs.tmp.`text_table`(
> schema => 'inline=(col1 date properties {`drill.format` = `yyyy-MM-dd`}) 
> properties {`drill.strict` = `false`}'))
> {noformat}
> To indicate only table properties, use the following syntax:
> {noformat}
> select * from table(dfs.tmp.`text_table`(
> schema => 'inline=() 
> properties {`drill.strict` = `false`}'))
> {noformat}
> b. indicate the schema using a path:
>  First, a schema is created in some location using the CREATE SCHEMA 
> command. For example:
> {noformat}
> create schema 
> (col int)
> path '/tmp/my_schema'
> {noformat}
> Now the user wants to apply this schema in a table function:
> {noformat}
> select * from table(dfs.tmp.`text_table`(schema => 'path=`/tmp/my_schema`'))
> {noformat}
> 2. The user wants to apply a schema alongside format plugin table function 
> parameters.
>  Assume the user has a CSV file with headers whose extension does not match 
> the default extension for text files with headers (ex: cars.csvh-test):
> {noformat}
> select * from table(dfs.tmp.`cars.csvh-test`(type => 'text', 
> fieldDelimiter => ',', extractHeader => true,
> schema => 'inline=(col1 date)'))
> {noformat}
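> The schema in such a query could presumably also be given by path rather 
> than inline. The following is only a sketch: it assumes a schema has been 
> stored at /tmp/my_schema with CREATE SCHEMA (as in example 1.b) and that a 
> path-based schema combines with format plugin parameters the same way an 
> inline one does, which the examples above do not show explicitly:
> {noformat}
> -- Assumption: the schema file at /tmp/my_schema already exists and
> -- 'path=...' is accepted together with format plugin parameters.
> select * from table(dfs.tmp.`cars.csvh-test`(type => 'text', 
> fieldDelimiter => ',', extractHeader => true,
> schema => 'path=`/tmp/my_schema`'))
> {noformat}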
> More details about the syntax can be found in the design document:
>  
> [https://docs.google.com/document/d/1mp4egSbNs8jFYRbPVbm_l0Y5GjH3HnoqCmOpMTR_g4w/edit]



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
