http://git-wip-us.apache.org/repos/asf/drill/blob/d959a210/_docs/drill-docs/query/query-fs/001-query-json.md ---------------------------------------------------------------------- diff --git a/_docs/drill-docs/query/query-fs/001-query-json.md b/_docs/drill-docs/query/query-fs/001-query-json.md deleted file mode 100644 index 048903b..0000000 --- a/_docs/drill-docs/query/query-fs/001-query-json.md +++ /dev/null @@ -1,41 +0,0 @@ ---- -title: "Querying JSON Files" -parent: "Querying a File System" ---- -Your Drill installation includes a sample JSON file located in Drill's -classpath. The sample JSON file, `employee.json`, contains fictitious employee -data. Use SQL syntax to query the sample `JSON` file. - -To view the data in the `employee.json` file, submit the following SQL query -to Drill: - -``0: jdbc:drill:zk=local> SELECT * FROM cp.`employee.json`;`` - -The query returns the following results: - -**Example of partial output** - - +-------------+------------+------------+------------+-------------+-----------+ - | employee_id | full_name | first_name | last_name | position_id | position_ | - +-------------+------------+------------+------------+-------------+-----------+ - | 1101 | Steve Eurich | Steve | Eurich | 16 | Store T | - | 1102 | Mary Pierson | Mary | Pierson | 16 | Store T | - | 1103 | Leo Jones | Leo | Jones | 16 | Store Tem | - | 1104 | Nancy Beatty | Nancy | Beatty | 16 | Store T | - | 1105 | Clara McNight | Clara | McNight | 16 | Store | - | 1106 | Marcella Isaacs | Marcella | Isaacs | 17 | Stor | - | 1107 | Charlotte Yonce | Charlotte | Yonce | 17 | Stor | - | 1108 | Benjamin Foster | Benjamin | Foster | 17 | Stor | - | 1109 | John Reed | John | Reed | 17 | Store Per | - | 1110 | Lynn Kwiatkowski | Lynn | Kwiatkowski | 17 | St | - | 1111 | Donald Vann | Donald | Vann | 17 | Store Pe | - | 1112 | William Smith | William | Smith | 17 | Store | - | 1113 | Amy Hensley | Amy | Hensley | 17 | Store Pe | - | 1114 | Judy Owens | Judy | Owens | 17 | Store 
Per | - | 1115 | Frederick Castillo | Frederick | Castillo | 17 | S | - | 1116 | Phil Munoz | Phil | Munoz | 17 | Store Per | - | 1117 | Lori Lightfoot | Lori | Lightfoot | 17 | Store | - ... - +-------------+------------+------------+------------+-------------+-----------+ - 1,155 rows selected (0.762 seconds) - 0: jdbc:drill:zk=local> \ No newline at end of file
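The sample `employee.json` file shown above can also be queried with ordinary projections and filters rather than `SELECT *`. A minimal sketch, using only column names visible in the output above (the predicate value `17` is a position_id that appears in the sample rows):

```sql
-- Project three columns from the classpath sample file and filter on one.
SELECT employee_id, full_name, position_id
FROM cp.`employee.json`
WHERE position_id = 17
LIMIT 3;
```

Because the `cp` schema points at Drill's classpath, this runs with no storage-plugin configuration.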
http://git-wip-us.apache.org/repos/asf/drill/blob/d959a210/_docs/drill-docs/query/query-fs/002-query-parquet.md
----------------------------------------------------------------------
diff --git a/_docs/drill-docs/query/query-fs/002-query-parquet.md b/_docs/drill-docs/query/query-fs/002-query-parquet.md
deleted file mode 100644
index 9b4e874..0000000
--- a/_docs/drill-docs/query/query-fs/002-query-parquet.md
+++ /dev/null
@@ -1,99 +0,0 @@
----
-title: "Querying Parquet Files"
-parent: "Querying a File System"
----
-Your Drill installation includes a `sample-data` directory with Parquet files
-that you can query. Use SQL syntax to query the `region.parquet` and
-`nation.parquet` files in the `sample-data` directory.
-
-**Note:** Your Drill installation location may differ from the examples used here. The examples assume that Drill was installed in embedded mode on your machine following the [Apache Drill in 10 Minutes](https://cwiki.apache.org/confluence/display/DRILL/Apache+Drill+in+10+Minutes) tutorial. If you installed Drill in distributed mode, or your `sample-data` directory differs from the location used in the examples, make sure to change the `sample-data` directory to the correct location before you run the queries.
-
-#### Region File
-
-If you followed the Apache Drill in 10 Minutes instructions to install Drill
-in embedded mode, the path to the Parquet file varies between operating
-systems.
-
-To view the data in the `region.parquet` file, issue the query appropriate for
-your operating system:
-
- * Linux
-``SELECT * FROM dfs.`/opt/drill/apache-drill-0.4.0-incubating/sample-data/region.parquet`;``
-
- * Mac OS X
-``SELECT * FROM dfs.`/Users/max/drill/apache-drill-0.4.0-incubating/sample-data/region.parquet`;``
-
- * Windows
-``SELECT * FROM dfs.`C:\drill\apache-drill-0.4.0-incubating\sample-data\region.parquet`;``
-
-The query returns the following results:
-
- +------------+------------+
- | EXPR$0 | EXPR$1 |
- +------------+------------+
- | AFRICA | lar deposits. blithely final packages cajole. regular waters ar |
- | AMERICA | hs use ironic, even requests. s |
- | ASIA | ges. thinly even pinto beans ca |
- | EUROPE | ly final courts cajole furiously final excuse |
- | MIDDLE EAST | uickly special accounts cajole carefully blithely close reques |
- +------------+------------+
- 5 rows selected (0.165 seconds)
- 0: jdbc:drill:zk=local>
-
-#### Nation File
-
-If you followed the Apache Drill in 10 Minutes instructions to install Drill
-in embedded mode, the path to the Parquet file varies between operating
-systems.
-
-To view the data in the `nation.parquet` file, issue the query appropriate for
-your operating system:
-
- * Linux
-``SELECT * FROM dfs.`/opt/drill/apache-drill-0.4.0-incubating/sample-data/nation.parquet`;``
-
- * Mac OS X
-``SELECT * FROM dfs.`/Users/max/drill/apache-drill-0.4.0-incubating/sample-data/nation.parquet`;``
-
- * Windows
-``SELECT * FROM dfs.`C:\drill\apache-drill-0.4.0-incubating\sample-data\nation.parquet`;``
-
-The query returns the following results:
-
- +------------+------------+------------+------------+
- | EXPR$0 | EXPR$1 | EXPR$2 | EXPR$3 |
- +------------+------------+------------+------------+
- | 0 | 0 | ALGERIA | haggle. carefully final deposits det |
- | 1 | 1 | ARGENTINA | al foxes promise slyly according to t |
- | 2 | 1 | BRAZIL | y alongside of the pending deposits.
| 3 | 1 | CANADA | eas hang ironic, silent packages. sly |
- | 4 | 4 | EGYPT | y above the carefully unusual theodol |
- | 5 | 0 | ETHIOPIA | ven packages wake quickly. regu |
- | 6 | 3 | FRANCE | refully final requests. regular, iron |
- | 7 | 3 | GERMANY | l platelets. regular accounts x-ray: |
- | 8 | 2 | INDIA | ss excuses cajole slyly across the pa |
- | 9 | 2 | INDONESIA | slyly express asymptotes. regular de |
- | 10 | 4 | IRAN | efully alongside of the slyly final d |
- | 11 | 4 | IRAQ | nic deposits boost atop the quickly f |
- | 12 | 2 | JAPAN | ously. final, express gifts cajole a |
- | 13 | 4 | JORDAN | ic deposits are blithely about the ca |
- | 14 | 0 | KENYA | pending excuses haggle furiously dep |
- | 15 | 0 | MOROCCO | rns. blithely bold courts among the c |
- | 16 | 0 | MOZAMBIQUE | s. ironic, unusual asymptotes wake bl |
- | 17 | 1 | PERU | platelets. blithely pending dependenc |
- | 18 | 2 | CHINA | c dependencies. furiously express not |
- | 19 | 3 | ROMANIA | ular asymptotes are about the furious |
- | 20 | 4 | SAUDI ARABIA | ts. silent requests haggle. closely |
- | 21 | 2 | VIETNAM | hely enticingly express accounts. eve |
- | 22 | 3 | RUSSIA | requests against the platelets use n |
- | 23 | 3 | UNITED KINGDOM | eans boost carefully special requ |
- | 24 | 1 | UNITED STATES | y final packages.
slow foxes cajol |
- +------------+------------+------------+------------+
- 25 rows selected (2.401 seconds)
- 0: jdbc:drill:zk=local>
\ No newline at end of file
http://git-wip-us.apache.org/repos/asf/drill/blob/d959a210/_docs/drill-docs/query/query-fs/003-query-text.md
----------------------------------------------------------------------
diff --git a/_docs/drill-docs/query/query-fs/003-query-text.md b/_docs/drill-docs/query/query-fs/003-query-text.md
deleted file mode 100644
index d7ffad5..0000000
--- a/_docs/drill-docs/query/query-fs/003-query-text.md
+++ /dev/null
@@ -1,120 +0,0 @@
----
-title: "Querying Plain Text Files"
-parent: "Querying a File System"
----
-You can use Drill to access both structured file types and plain text files
-(flat files). This section shows a few simple examples that work on flat
-files:
-
- * CSV files (comma-separated values)
- * TSV files (tab-separated values)
- * PSV files (pipe-separated values)
-
-The examples here show CSV files, but queries against TSV and PSV files return
-equivalent results. However, make sure that your registered storage plugins
-recognize the appropriate file types and extensions. For example, the
-following configuration expects PSV files (files with a pipe delimiter) to
-have a `tbl` extension, not a `psv` extension. Drill returns a "file not
-found" error if references to files in queries do not match these conditions.
-
-    "formats": {
-      "psv": {
-        "type": "text",
-        "extensions": [
-          "tbl"
-        ],
-        "delimiter": "|"
-      }
-    }
-
-## SELECT * FROM a CSV File
-
-The first query selects rows from a `.csv` text file. The file contains seven
-records:
-
-    $ more plays.csv
-
-    1599,As You Like It
-    1601,Twelfth Night
-    1594,Comedy of Errors
-    1595,Romeo and Juliet
-    1596,The Merchant of Venice
-    1610,The Tempest
-    1599,Hamlet
-
-Drill recognizes each row as an array of values and returns one column for
-each row.
-
-0: jdbc:drill:zk=local> select * from dfs.`/Users/brumsby/drill/plays.csv`;
-
- +------------+
- | columns |
- +------------+
- | ["1599","As You Like It"] |
- | ["1601","Twelfth Night"] |
- | ["1594","Comedy of Errors"] |
- | ["1595","Romeo and Juliet"] |
- | ["1596","The Merchant of Venice"] |
- | ["1610","The Tempest"] |
- | ["1599","Hamlet"] |
- +------------+
- 7 rows selected (0.089 seconds)
-
-## Columns[n] Syntax
-
-You can use the `COLUMNS[n]` syntax in the SELECT list to return these CSV
-rows in a more readable, column-by-column format. (This syntax uses a
-zero-based index, so the first column is column `0`.)
-
-0: jdbc:drill:zk=local> select columns[0], columns[1]
-from dfs.`/Users/brumsby/drill/plays.csv`;
-
- +------------+------------+
- | EXPR$0 | EXPR$1 |
- +------------+------------+
- | 1599 | As You Like It |
- | 1601 | Twelfth Night |
- | 1594 | Comedy of Errors |
- | 1595 | Romeo and Juliet |
- | 1596 | The Merchant of Venice |
- | 1610 | The Tempest |
- | 1599 | Hamlet |
- +------------+------------+
- 7 rows selected (0.137 seconds)
-
-You can use aliases to return meaningful column names. Note that `YEAR` is a
-reserved word, so the `Year` alias must be enclosed in backticks.
-
-    0: jdbc:drill:zk=local> select columns[0] as `Year`, columns[1] as Play
-    from dfs.`/Users/brumsby/drill/plays.csv`;
-
- +------------+------------+
- | Year | Play |
- +------------+------------+
- | 1599 | As You Like It |
- | 1601 | Twelfth Night |
- | 1594 | Comedy of Errors |
- | 1595 | Romeo and Juliet |
- | 1596 | The Merchant of Venice |
- | 1610 | The Tempest |
- | 1599 | Hamlet |
- +------------+------------+
- 7 rows selected (0.113 seconds)
-
-You cannot refer to the aliases in subsequent clauses of the query.
Use the -original `columns[n]` syntax, as shown in the WHERE clause for the following -example: - -0: jdbc:drill:zk=local> select columns[0] as `Year`, columns[1] as Play -from dfs.`/Users/brumsby/drill/plays.csv` where columns[0]>1599; - - +------------+------------+ - | Year | Play | - +------------+------------+ - | 1601 | Twelfth Night | - | 1610 | The Tempest | - +------------+------------+ - 2 rows selected (0.201 seconds) - -Note that the restriction with the use of aliases applies to queries against -all data sources. - http://git-wip-us.apache.org/repos/asf/drill/blob/d959a210/_docs/drill-docs/query/query-fs/004-query-dir.md ---------------------------------------------------------------------- diff --git a/_docs/drill-docs/query/query-fs/004-query-dir.md b/_docs/drill-docs/query/query-fs/004-query-dir.md deleted file mode 100644 index 47abee9..0000000 --- a/_docs/drill-docs/query/query-fs/004-query-dir.md +++ /dev/null @@ -1,90 +0,0 @@ ---- -title: "Querying Directories" -parent: "Querying a File System" ---- -You can store multiple files in a directory and query them as if they were a -single entity. You do not have to explicitly join the files. The files must be -compatible, in the sense that they must have comparable data types and columns -in the same order. This type of query is not limited to text files; you can -also query directories of JSON files, for example. - -For example, assume that a `testdata` directory contains two files with the -same structure: `plays.csv` and `moreplays.csv`. The first file contains 7 -records and the second file contains 3 records. 
The following query returns -the "union" of the two files, ordered by the first column: - - 0: jdbc:drill:zk=local> select columns[0] as `Year`, columns[1] as Play - from dfs.`/Users/brumsby/drill/testdata` order by 1; - - +------------+------------+ - | Year | Play | - +------------+------------+ - | 1594 | Comedy of Errors | - | 1595 | Romeo and Juliet | - | 1596 | The Merchant of Venice | - | 1599 | As You Like It | - | 1599 | Hamlet | - | 1601 | Twelfth Night | - | 1606 | Macbeth | - | 1606 | King Lear | - | 1609 | The Winter's Tale | - | 1610 | The Tempest | - +------------+------------+ - 10 rows selected (0.296 seconds) - -You can drill down further and automatically query subdirectories as well. For -example, assume that you have a logs directory that contains a subdirectory -for each year and subdirectories for each month (1 through 12). The month -directories contain JSON files. - - [root@ip-172-16-1-200 logs]# pwd - /mapr/drilldemo/labs/clicks/logs - [root@ip-172-16-1-200 logs]# ls - 2012 2013 2014 - [root@ip-172-16-1-200 logs]# cd 2013 - [root@ip-172-16-1-200 2013]# ls - 1 10 11 12 2 3 4 5 6 7 8 9 - -You can query all of these files, or a subset, by referencing the file system -once in a Drill query. For example, the following query counts the number of -records in all of the files inside the `2013` directory: - - 0: jdbc:drill:> select count(*) from MFS.`/mapr/drilldemo/labs/clicks/logs/2013` ; - +------------+ - | EXPR$0 | - +------------+ - | 24000 | - +------------+ - 1 row selected (2.607 seconds) - -You can also use variables `dir0`, `dir1`, and so on, to refer to -subdirectories in your workspace path. For example, assume that `bob.logdata` -is a workspace that points to the `logs` directory, which contains multiple -subdirectories: `2012`, `2013`, and `2014`. The following query constrains -files inside the subdirectory named `2013`. The variable `dir0` refers to the -first level down from logs, `dir1` to the next level, and so on. 
- - 0: jdbc:drill:> use bob.logdata; - +------------+------------+ - | ok | summary | - +------------+------------+ - | true | Default schema changed to 'bob.logdata' | - +------------+------------+ - 1 row selected (0.305 seconds) - - 0: jdbc:drill:> select * from logs where dir0='2013' limit 10; - +------------+------------+------------+------------+------------+------------+------------+------------+------------+-------------+ - | dir0 | dir1 | trans_id | date | time | cust_id | device | state | camp_id | keywords | - +------------+------------+------------+------------+------------+------------+------------+------------+------------+-------------+ - | 2013 | 2 | 12115 | 02/23/2013 | 19:48:24 | 3 | IOS5 | az | 5 | who's | - | 2013 | 2 | 12127 | 02/26/2013 | 19:42:03 | 11459 | IOS5 | wa | 10 | for | - | 2013 | 2 | 12138 | 02/09/2013 | 05:49:01 | 1 | IOS6 | ca | 7 | minutes | - | 2013 | 2 | 12139 | 02/23/2013 | 06:58:20 | 1 | AOS4.4 | ms | 7 | i | - | 2013 | 2 | 12145 | 02/10/2013 | 10:14:56 | 10 | IOS5 | mi | 6 | wrong | - | 2013 | 2 | 12157 | 02/15/2013 | 02:49:22 | 102 | IOS5 | ny | 5 | want | - | 2013 | 2 | 12176 | 02/19/2013 | 08:39:02 | 28 | IOS5 | or | 0 | and | - | 2013 | 2 | 12194 | 02/24/2013 | 08:26:17 | 125445 | IOS5 | ar | 0 | say | - | 2013 | 2 | 12236 | 02/05/2013 | 01:40:05 | 10 | IOS5 | nj | 2 | sir | - | 2013 | 2 | 12249 | 02/03/2013 | 04:45:47 | 21725 | IOS5 | nj | 5 | no | - +------------+------------+------------+------------+------------+------------+------------+------------+------------+-------------+ - 10 rows selected (0.583 seconds) \ No newline at end of file http://git-wip-us.apache.org/repos/asf/drill/blob/d959a210/_docs/drill-docs/rn/001-0.5.0rn.md ---------------------------------------------------------------------- diff --git a/_docs/drill-docs/rn/001-0.5.0rn.md b/_docs/drill-docs/rn/001-0.5.0rn.md deleted file mode 100644 index 2dff978..0000000 --- a/_docs/drill-docs/rn/001-0.5.0rn.md +++ /dev/null @@ -1,29 +0,0 @@ ---- -title: 
"Apache Drill 0.5.0 Release Notes"
-parent: "Release Notes"
----
-
-Apache Drill 0.5.0, the first beta release for Drill, is designed to help
-enthusiasts start working and experimenting with Drill. It also continues the
-Drill monthly release cycle as we drive towards general availability.
-
-The 0.5.0 release is primarily a bug-fix release, with [more than 100 JIRAs](https://issues.apache.org/jira/secure/ReleaseNote.jspa?projectId=12313820&version=12324880) closed, but there are some notable features. For information about the features, see the [Apache Drill Blog for the 0.5.0 release](https://blogs.apache.org/drill/entry/apache_drill_beta_release_see).
-
-This release is available as [binary](http://www.apache.org/dyn/closer.cgi/incubator/drill/drill-0.5.0-incubating/apache-drill-0.5.0-incubating.tar.gz) and [source](http://www.apache.org/dyn/closer.cgi/incubator/drill/drill-0.5.0-incubating/apache-drill-0.5.0-incubating-src.tar.gz) tarballs that are compiled against Apache Hadoop. Drill has been tested against MapR, Cloudera, and Hortonworks Hadoop distributions. There are associated build profiles and JIRAs that can help you run Drill against your preferred distribution.
-
-Apache Drill 0.5.0 Key Notes and Limitations
-
- * The current release supports in-memory and beyond-memory execution. However, you must disable the memory-intensive hash aggregate and hash join operations to leverage this functionality.
- * While the Drill execution engine supports dynamic schema changes during the course of a query, some operators have yet to implement support for this behavior, such as Sort. Other operations, such as streaming aggregate, may have partial support that leads to unexpected results.
- * There are known issues with joining text files without using an intervening view. See [DRILL-1401](https://issues.apache.org/jira/browse/DRILL-1401) for more information.
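The hash-operator limitation noted in the release notes above can be worked around by turning those operators off. This sketch is not part of the original release notes; the option names below are standard Drill planner options, but verify them against the output of `SELECT * FROM sys.options` on your installation:

```sql
-- Disable the memory-intensive hash operators so the planner falls back
-- to the sort-based (streaming) implementations.
ALTER SYSTEM SET `planner.enable_hashagg` = false;
ALTER SYSTEM SET `planner.enable_hashjoin` = false;
```

Use `ALTER SESSION` instead of `ALTER SYSTEM` to limit the change to the current connection.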
-
http://git-wip-us.apache.org/repos/asf/drill/blob/d959a210/_docs/drill-docs/rn/002-0.4.0rn.md
----------------------------------------------------------------------
diff --git a/_docs/drill-docs/rn/002-0.4.0rn.md b/_docs/drill-docs/rn/002-0.4.0rn.md
deleted file mode 100644
index d0dac30..0000000
--- a/_docs/drill-docs/rn/002-0.4.0rn.md
+++ /dev/null
@@ -1,42 +0,0 @@
----
-title: "Apache Drill 0.4.0 Release Notes"
-parent: "Release Notes"
----
-The 0.4.0 release is a developer preview release, designed to help enthusiasts
-start to work with and experiment with Drill. It is the first Drill release
-that provides distributed query execution.
-
-This release is built upon [more than 800 JIRAs](https://issues.apache.org/jira/browse/DRILL/fixforversion/12324963/). It is a pre-beta release on the way towards Drill. As a developer snapshot, the release contains a large number of outstanding bugs that will make some use cases challenging. Feel free to consult outstanding issues [targeted for the 0.5.0 release](https://issues.apache.org/jira/browse/DRILL/fixforversion/12324880/) to see whether your use case is affected.
-
-To read more about this release and the new features introduced, please view the [0.4.0 announcement blog entry](https://blogs.apache.org/drill/entry/announcing_apache_drill_0_4).
-
-The release is available as both [binary](http://www.apache.org/dyn/closer.cgi/incubator/drill/drill-0.4.0-incubating/apache-drill-0.4.0-incubating.tar.gz) and [source](http://www.apache.org/dyn/closer.cgi/incubator/drill/drill-0.4.0-incubating/apache-drill-0.4.0-incubating-src.tar.gz) tarballs. In both cases, these are compiled against Apache Hadoop. Drill has also been tested against MapR, Cloudera, and Hortonworks Hadoop distributions, and there are associated build profiles or JIRAs that can help you run against your preferred distribution.
-
-Some Key Notes & Limitations
-
- * The current release supports in-memory and beyond-memory execution.
However, users must disable the memory-intensive hash aggregate and hash join operations to leverage this functionality.
- * In many cases, merge join operations return incorrect results.
- * Use of a local filter in a join "on" clause when using left, right, or full outer joins may result in incorrect results.
- * Because of known memory leaks and memory overrun issues, you may need more memory and you may need to restart the system in some cases.
- * Some types of complex expressions, especially those involving empty arrays, may fail or return incorrect results.
- * While the Drill execution engine supports dynamic schema changes during the course of a query, some operators have yet to implement support for this behavior (such as Sort). Other operations (such as streaming aggregate) may have partial support that leads to unexpected results.
- * Protobuf, UDF, query plan, and all other interfaces are subject to change in incompatible ways.
- * Multiplication of some types of DECIMAL(28+,*) will return incorrect results.
-
-
http://git-wip-us.apache.org/repos/asf/drill/blob/d959a210/_docs/drill-docs/rn/003-alpha-rn.md
----------------------------------------------------------------------
diff --git a/_docs/drill-docs/rn/003-alpha-rn.md b/_docs/drill-docs/rn/003-alpha-rn.md
deleted file mode 100644
index 15e160c..0000000
--- a/_docs/drill-docs/rn/003-alpha-rn.md
+++ /dev/null
@@ -1,44 +0,0 @@
----
-title: "Apache Drill M1 Release Notes (Apache Drill Alpha)"
-parent: "Release Notes"
----
-### Milestone 1 Goals
-
-The first release of Apache Drill is designed as a technology preview for
-people to better understand the architecture and vision. It is a functional
-release trying to piece together the key components of a next-generation MPP
-query engine. It is designed to allow milestone 2 (M2) to focus on
-architectural analysis and performance optimization.
-
- * Provide a new optimistic DAG execution engine for data analysis
- * Build a new columnar, shredded, in-memory format and execution model that minimizes data serialization/deserialization costs and operator complexity
- * Provide a model for runtime-generated functions and relational operators that minimizes complexity and maximizes performance
- * Support queries against columnar on-disk format (Parquet) and JSON
- * Support the most common set of standard SQL read-only phrases using ANSI standards, including SELECT, FROM, WHERE, HAVING, ORDER, GROUP BY, IN, DISTINCT, LEFT JOIN, RIGHT JOIN, and INNER JOIN
- * Support schema-on-read querying and execution
- * Build a set of columnar operation primitives, including Merge Join, Sort, Streaming Aggregate, Filter, and Selection Vector removal
- * Support unlimited levels of subqueries and correlated subqueries
- * Provide an extensible, query-language-agnostic, JSON-based logical data flow syntax
- * Support complex data type manipulation via logical plan operations
-
-### Known Issues
-
-SQL Parsing
-Because Apache Drill is built to support late-bound changing schemas while SQL
-is statically typed, there are a couple of special requirements for writing
-SQL queries. These are limited to the current release and will be corrected in
-a future milestone release.
-
- * All tables are exposed as a single map field that contains
- * Drill Alpha doesn't support implicit or explicit casts outside those required above.
- * Drill Alpha does not include, there are currently a couple of differences for how to write a query in In order to query against
-
-UDFs
-
- * Drill currently supports simple and aggregate functions using scalar, repeated and
- * Nested data support incomplete. Drill Alpha supports nested data structures as well as repeated fields.
However,
-
-
-
http://git-wip-us.apache.org/repos/asf/drill/blob/d959a210/_docs/drill-docs/rn/004-0.6.0-rn.md
----------------------------------------------------------------------
diff --git a/_docs/drill-docs/rn/004-0.6.0-rn.md b/_docs/drill-docs/rn/004-0.6.0-rn.md
deleted file mode 100644
index 22aa96b..0000000
--- a/_docs/drill-docs/rn/004-0.6.0-rn.md
+++ /dev/null
@@ -1,32 +0,0 @@
----
-title: "Apache Drill 0.6.0 Release Notes (Apache Drill Alpha)"
-parent: "Release Notes"
----
-Apache Drill 0.6.0, the second beta release for Drill, is designed to help
-enthusiasts start working and experimenting with Drill. It also continues the
-Drill monthly release cycle as we drive towards general availability.
-
-This release is available as [binary](http://www.apache.org/dyn/closer.cgi/incubator/drill/drill-0.5.0-incubating/apache-drill-0.5.0-incubating.tar.gz) and [source](http://www.apache.org/dyn/closer.cgi/incubator/drill/drill-0.5.0-incubating/apache-drill-0.5.0-incubating-src.tar.gz) tarballs that are compiled against Apache Hadoop. Drill has been tested against MapR, Cloudera, and Hortonworks Hadoop distributions. There are associated build profiles and JIRAs that can help you run Drill against your preferred distribution.
-
-Apache Drill 0.6.0 Key Features
-
-This release is primarily a bug-fix release, with [more than 30 JIRAs closed](https://issues.apache.org/jira/secure/ReleaseNote.jspa?projectId=12313820&version=12327472), but there are some notable features:
-
- * Direct ANSI SQL access to MongoDB, using the latest [MongoDB Plugin for Apache Drill](/confluence/display/DRILL/MongoDB+Plugin+for+Apache+Drill)
- * Filesystem query performance improvements with partition pruning
- * Ability to use the file system as a persistent store for query profiles and diagnostic information
- * Window function support (alpha)
-
-Apache Drill 0.6.0 Key Notes and Limitations
-
- * The current release supports in-memory and beyond-memory execution.
However, you must disable the memory-intensive hash aggregate and hash join operations to leverage this functionality.
- * While the Drill execution engine supports dynamic schema changes during the course of a query, some operators have yet to implement support for this behavior, such as Sort. Other operations, such as streaming aggregate, may have partial support that leads to unexpected results.
-
http://git-wip-us.apache.org/repos/asf/drill/blob/d959a210/_docs/drill-docs/rn/005-0.7.0-rn.md
----------------------------------------------------------------------
diff --git a/_docs/drill-docs/rn/005-0.7.0-rn.md b/_docs/drill-docs/rn/005-0.7.0-rn.md
deleted file mode 100644
index e25a58a..0000000
--- a/_docs/drill-docs/rn/005-0.7.0-rn.md
+++ /dev/null
@@ -1,56 +0,0 @@
----
-title: "Apache Drill 0.7.0 Release Notes (Apache Drill Alpha)"
-parent: "Release Notes"
----
-Apache Drill 0.7.0, the third beta release for Drill, is designed to help
-enthusiasts start working and experimenting with Drill. It also continues the
-Drill monthly release cycle as we drive towards general availability.
-
-This release is available as
-[binary](http://www.apache.org/dyn/closer.cgi/drill/drill-0.7.0/apache-drill-0.7.0.tar.gz) and
-[source](http://www.apache.org/dyn/closer.cgi/drill/drill-0.7.0/apache-drill-0.7.0-src.tar.gz)
-tarballs that are compiled against Apache Hadoop.
-Drill has been tested against MapR, Cloudera, and Hortonworks Hadoop
-distributions.
There are associated build profiles and JIRAs that can help you
-run Drill against your preferred distribution.
-
-Apache Drill 0.7.0 Key Features
-
- * No more dependency on UDP/multicast, making it possible for Drill to work well in the following scenarios:
-
-    * UDP multicast not enabled (as in EC2)
-
-    * Cluster spans multiple subnets
-
-    * Cluster has a multihomed configuration
-
- * New functions to natively work with nested data: KVGen and Flatten
-
- * Support for Hive 0.13 (Hive 0.12 with Drill is no longer supported)
-
- * Improved performance when querying Hive tables and the file system through partition pruning
-
- * Improved performance for HBase with LIKE operator pushdown
-
- * Improved memory management
-
- * Drill Web UI monitoring and query profile improvements
-
- * Ability to parse files without explicit extensions using the default storage format specification
-
- * Fixes for dealing with complex/nested data objects in Parquet/JSON
-
- * Fast schema return: improved experience working with BI/query tools by returning metadata quickly
-
- * Several hang-related fixes
-
- * Parquet writer fixes for handling large datasets
-
- * Stability improvements in the ODBC and JDBC drivers
-
-Apache Drill 0.7.0 Key Notes and Limitations
-
- * The current release supports in-memory and beyond-memory execution. However, you must disable the memory-intensive hash aggregate and hash join operations to leverage this functionality.
- * While the Drill execution engine supports dynamic schema changes during the course of a query, some operators have yet to implement support for this behavior, such as Sort. Other operations, such as streaming aggregate, may have partial support that leads to unexpected results.
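The KVGen and Flatten functions introduced in the 0.7.0 release notes above can be combined to unnest repeated fields. A hypothetical sketch (the file path and column names are illustrative, not taken from the release notes):

```sql
-- FLATTEN emits one output row per element of a repeated field;
-- KVGEN turns a map with arbitrary keys into a list of key/value records.
SELECT t.name, FLATTEN(t.yearsLived) AS year_lived
FROM dfs.`/tmp/people.json` t;

SELECT FLATTEN(KVGEN(t.attributes)) AS attr
FROM dfs.`/tmp/people.json` t;
```

Flattening before filtering lets ordinary WHERE clauses apply to individual array elements rather than whole arrays.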
- http://git-wip-us.apache.org/repos/asf/drill/blob/d959a210/_docs/drill-docs/sql-ref/001-data-types.md ---------------------------------------------------------------------- diff --git a/_docs/drill-docs/sql-ref/001-data-types.md b/_docs/drill-docs/sql-ref/001-data-types.md deleted file mode 100644 index 1f0ba0e..0000000 --- a/_docs/drill-docs/sql-ref/001-data-types.md +++ /dev/null @@ -1,96 +0,0 @@ ---- -title: "Data Types" -parent: "SQL Reference" ---- -You can use the following SQL data types in your Drill queries: - -#### Binary Types - - * BYTEA - * VARBINARY - -#### Character Types - - * CHAR - * VARCHAR - * TEXT - -#### Date/Time Data Types - - * DATE - * INTERVAL - * Interval Year (stores year and month) - - * Interval Day (stores day, hour, minute, seconds, and milliseconds) - * TIME - * TIMESTAMP - -Refer to [Supported Date/Time Data Type formats](https://cwiki.apache.org/conf -luence/pages/viewpage.action?pageId=44304039). - -#### Integer - - * BIGINT - - * INT - * SMALLINT - - * TINYINT - -#### Numeric Types - - * DECIMAL - * FLOAT - * DOUBLE PRECISION (FLOAT 8) - * REAL (FLOAT 4) - -#### Boolean Types - - * FALSE - * NULL - * TRUE - -# Complex Data Types - -Drill provides map and array data types to work with complex and nested data -structures. For analysis of complex data, a more modern JSON-style approach to -writing queries is more effective than using standard SQL functions. - -The following table provides descriptions and examples of the complex data -types: - -<table><tbody> - <tr><th>Data Type</th> - <th>Description</th> - <th>Example</th></tr> - <tr> - <td valign="top">Map</td> - <td valign="top">A map is a set of name/value pairs. 
</br> - A value in an map can be a scalar type, </br> - such as string or int, or a map can be a </br> - complex type, such as an array or another map.</td> - <td valign="top">Map with scalar type values:</br><code> "phoneNumber": { "areaCode": "622", "number": "1567845"}</code></br>Map with complex type value:<code></br> { "citiesLived" : [ { "place" : "Los Angeles",</br> - "yearsLived" : [ "1989",</br> - "1993",</br> - "1998",</br> - "2002"</br> - ]</br> - - } ] }</code></td> - </tr> - <tr> - <td valign="top">Array</td> - <td valign="top">An array is a repeated list of values. </br> - A value in an array can be a scalar type, </br> - such as string or int, or an array can be a</br> - complex type, such as a map or another array.</td> - <td valign="top">Array with scalar values:</br><code> "yearsLived": ["1990", "1993", "1998", "2008"]</code></br>Array with complex type values:</br><code> "children":</br> [ { "age" : "10", </br> "gender" : "Male",</br> "name" : "Earl"</br> }, </br> { "age" : "6",</br> "gender" : "Male",</br> "name" : "Sam"</br> },</br> { "age" : "8",</br> "gender" : "Male", </br> "name" : "Kit" </br>   ;}</br> ]</code></td> - </tr> - </tbody></table> - - -<!--| Data Type | Description | Example | -| --------- | ----------- | --------| -| Map | A map is a set of name/value pairs. A value in an map can be a scalar type, such as string or int, or a map can be a complex type, such as an array or another map.| Map with scalar type values: "phoneNumber": { "areaCode": "622", "number": "1567845"} Map with complex type value :{ "citiesLived" : [ { "place" : "Los Angeles", "yearsLived" : [ "1989", "1993", "1998", "2002"] } ] } | -|Array| An array is a repeated list of values. 
A value in an array can be a scalar type, such as string or int, or an array can be a complex type, such as a map or another array.| Array with scalar type values: "yearsLived": ["1990", "1993", "1998", "2008"] Array with complex type value: "children" : [ { "age" : "10", "gender" : "Male", "name": "Earl" }, {"age" : "6", "gender": "Male", "name": "Sam"}, { "age" : "8", "gender" : "Male", "name" : "Kit" }] | ---> \ No newline at end of file http://git-wip-us.apache.org/repos/asf/drill/blob/d959a210/_docs/drill-docs/sql-ref/002-operators.md ---------------------------------------------------------------------- diff --git a/_docs/drill-docs/sql-ref/002-operators.md b/_docs/drill-docs/sql-ref/002-operators.md deleted file mode 100644 index 763b809..0000000 --- a/_docs/drill-docs/sql-ref/002-operators.md +++ /dev/null @@ -1,71 +0,0 @@ ---- -title: "Operators" -parent: "SQL Reference" ---- -You can use various types of operators in your Drill queries to perform -operations on your data. - -## Logical Operators - -You can use the following logical operators in your Drill queries: - - * AND - * BETWEEN - * IN - * LIKE - * NOT - * OR - -## Comparison Operators - -You can use the following comparison operators in your Drill queries: - - * < - * \> - * <= - * \>= - * = - * <> - * IS NULL - * IS NOT NULL - * IS FALSE - * IS NOT FALSE - * IS TRUE - * IS NOT TRUE - -## Pattern Matching Operators - -You can use the following pattern matching operators in your Drill queries: - - * LIKE - * NOT LIKE - * SIMILAR TO - * NOT SIMILAR TO - -## Math Operators - -You can use the following math operators in your Drill queries: - -**Operator**| **Description** ----|--- -+| Addition --| Subtraction -*| Multiplication -/| Division - -## Subquery Operators - -You can use the following subquery operators in your Drill queries: - - * EXISTS - * IN - -See [SELECT Statements](https://cwiki.apache.org/confluence/display/DRILL/SELECT+Statements). 
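
The subquery operators above take any valid SELECT statement as the inner query. The following sketch shows both forms; the `customers` and `orders` tables and their columns are hypothetical, used only for illustration:

    -- EXISTS with a correlated subquery (hypothetical tables)
    SELECT c.name
    FROM dfs.tmp.`customers` c
    WHERE EXISTS (SELECT 1 FROM dfs.tmp.`orders` o WHERE o.cust_id = c.id);

    -- IN with an uncorrelated subquery
    SELECT name
    FROM dfs.tmp.`customers`
    WHERE id IN (SELECT cust_id FROM dfs.tmp.`orders`);

In both cases the outer query returns only customers that have at least one matching order.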
- -## String Operators - -You can use the following string operators in your Drill queries: - - * string || string - * string || non-string or non-string || string \ No newline at end of file http://git-wip-us.apache.org/repos/asf/drill/blob/d959a210/_docs/drill-docs/sql-ref/003-functions.md ---------------------------------------------------------------------- diff --git a/_docs/drill-docs/sql-ref/003-functions.md b/_docs/drill-docs/sql-ref/003-functions.md deleted file mode 100644 index 2494b0e..0000000 --- a/_docs/drill-docs/sql-ref/003-functions.md +++ /dev/null @@ -1,185 +0,0 @@ ---- -title: "SQL Functions" -parent: "SQL Reference" ---- -You can use the following types of functions in your Drill queries: - - * Scalar Functions - * Aggregate Functions - * Aggregate Statistics Functions - * Convert Functions - * Nested Data Functions - -## Scalar Functions - -### Math - -You can use the following scalar math functions in your Drill queries: - - * ABS - * CEIL - * CEILING - * DIV - * FLOOR - * MOD - * POWER - * RANDOM - * ROUND - * SIGN - * SQRT - * TRUNC - -### String Functions - -The following table provides the string functions that you can use in your -Drill queries: - -Function| Return Type ---------|--- -char_length(string) or character_length(string)| int -concat(str "any" [, str "any" [, ...] 
])| text -convert_from(string bytea, src_encoding name)| text -convert_to(string text, dest_encoding name)| bytea -initcap(string)| text -left(str text, n int)| text -length(string)| int -length(string bytes, encoding name )| int -lower(string)| text -lpad(string text, length int [, fill text])| text -ltrim(string text [, characters text])| text -position(substring in string)| int -regexp_replace(string text, pattern text, replacement text [, flags text])|text -replace(string text, from text, to text)| text -right(str text, n int)| text -rpad(string text, length int [, fill text])| text -rtrim(string text [, characters text])| text -strpos(string, substring)| int -substr(string, from [, count])| text -substring(string [from int] [for int])| text -trim([leading | trailing | both] [characters] from string)| text -upper(string)| text - - -### Date/Time Functions - -The following table provides the date/time functions that you can use in your -Drill queries: - -**Function**| **Return Type** ----|--- -current_date| date -current_time| time with time zone -current_timestamp| timestamp with time zone -date_add(date,interval expr type)| date/datetime -date_part(text, timestamp)| double precision -date_part(text, interval)| double precision -date_sub(date,INTERVAL expr type)| date/datetime -extract(field from interval)| double precision -extract(field from timestamp)| double precision -localtime| time -localtimestamp| timestamp -now()| timestamp with time zone -timeofday()| text - -### Data Type Formatting Functions - -The following table provides the data type formatting functions that you can -use in your Drill queries: - -**Function**| **Return Type** ----|--- -to_char(timestamp, text)| text -to_char(int, text)| text -to_char(double precision, text)| text -to_char(numeric, text)| text -to_date(text, text)| date -to_number(text, text)| numeric -to_timestamp(text, text)| timestamp with time zone -to_timestamp(double precision)| timestamp with time zone - -## Aggregate 
Functions - -The following table provides the aggregate functions that you can use in your -Drill queries: - -**Function** | **Argument Type** | **Return Type** - -------- | ------------- | ----------- -avg(expression)| smallint, int, bigint, real, double precision, numeric, or interval| numeric for any integer-type argument, double precision for a floating-point argument, otherwise the same as the argument data type -count(*)| _-_| bigint -count([DISTINCT] expression)| any| bigint -max(expression)| any array, numeric, string, or date/time type| same as argument type -min(expression)| any array, numeric, string, or date/time type| same as argument type -sum(expression)| smallint, int, bigint, real, double precision, numeric, or interval| bigint for smallint or int arguments, numeric for bigint arguments, double precision for floating-point arguments, otherwise the same as the argument data type - - -## Aggregate Statistics Functions - -The following table provides the aggregate statistics functions that you can use in your Drill queries: - -**Function**| **Argument Type**| **Return Type** - -------- | ------------- | ----------- -stddev(expression)| smallint, int, bigint, real, double precision, or numeric| double precision for floating-point arguments, otherwise numeric -stddev_pop(expression)| smallint, int, bigint, real, double precision, or numeric| double precision for floating-point arguments, otherwise numeric -stddev_samp(expression)| smallint, int, bigint, real, double precision, or numeric| double precision for floating-point arguments, otherwise numeric -variance(expression)| smallint, int, bigint, real, double precision, or numeric| double precision for floating-point arguments, otherwise numeric -var_pop(expression)| smallint, int, bigint, real, double precision, or numeric| double precision for floating-point arguments, otherwise numeric -var_samp(expression)| smallint, int, bigint, real, double precision, or numeric| double precision for 
floating-point arguments, otherwise numeric - - -## Convert Functions - -You can use the CONVERT_TO and CONVERT_FROM functions to encode and decode -data when you query your data sources with Drill. For example, HBase stores -data as encoded byte arrays (VARBINARY data). When you issue a query with the -CONVERT_FROM function on HBase, Drill decodes the data and converts it to the -specified data type. In instances where Drill sends data back to HBase during -a query, you can use the CONVERT_TO function to change the data type to bytes. - -Although you can achieve the same results by using the CAST function for some -data types (such as VARBINARY to VARCHAR conversions), in general it is more -efficient to use CONVERT functions when your data sources return binary data. -When your data sources return more conventional data types, you can use the -CAST function. - -The following table provides the data types that you use with the CONVERT_TO -and CONVERT_FROM functions: - -**Type**| **Input Type**| **Output Type** ----|---|--- -BOOLEAN_BYTE| bytes(1)| boolean -TINYINT_BE| bytes(1)| tinyint -TINYINT| bytes(1)| tinyint -SMALLINT_BE| bytes(2)| smallint -SMALLINT| bytes(2)| smallint -INT_BE| bytes(4)| int -INT| bytes(4)| int -BIGINT_BE| bytes(8)| bigint -BIGINT| bytes(8)| bigint -FLOAT| bytes(4)| float (float4) -DOUBLE| bytes(8)| double (float8) -INT_HADOOPV| bytes(1-9)| int -BIGINT_HADOOPV| bytes(1-9)| bigint -DATE_EPOCH_BE| bytes(8)| date -DATE_EPOCH| bytes(8)| date -TIME_EPOCH_BE| bytes(8)| time -TIME_EPOCH| bytes(8)| time -UTF8| bytes| varchar -UTF16| bytes| var16char -UINT8| bytes(8)| uint8 - -A common use case for CONVERT_FROM is when a data source embeds complex data -inside a column. For example, you may have an HBase or MapR-DB table with -embedded JSON data: - - select CONVERT_FROM(col1, 'JSON') - FROM hbase.table1 - ... 
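
Building on the table above, the following hypothetical HBase query sketches CONVERT_FROM with two of the listed encoding types; the table name, column family, and qualifier are illustrative assumptions, not part of the example above:

    -- Hypothetical HBase table; 'UTF8' and 'INT_BE' are encoding
    -- types from the CONVERT_TO/CONVERT_FROM table above.
    SELECT CONVERT_FROM(row_key, 'UTF8')          AS customer_id,
           CONVERT_FROM(t.account.balance, 'INT_BE') AS balance
    FROM hbase.`customers` t;

Here the row key is decoded as a UTF-8 string and the balance column as a big-endian 4-byte integer, per the input/output types listed in the table.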
- -## Nested Data Functions - -This section contains descriptions of SQL functions that you can use to -analyze nested data: - - * [FLATTEN Function](/confluence/display/DRILL/FLATTEN+Function) - * [KVGEN Function](/confluence/display/DRILL/KVGEN+Function) - * [REPEATED_COUNT Function](/confluence/display/DRILL/REPEATED_COUNT+Function) \ No newline at end of file http://git-wip-us.apache.org/repos/asf/drill/blob/d959a210/_docs/drill-docs/sql-ref/004-nest-functions.md ---------------------------------------------------------------------- diff --git a/_docs/drill-docs/sql-ref/004-nest-functions.md b/_docs/drill-docs/sql-ref/004-nest-functions.md deleted file mode 100644 index a07dd51..0000000 --- a/_docs/drill-docs/sql-ref/004-nest-functions.md +++ /dev/null @@ -1,10 +0,0 @@ ---- -title: "Nested Data Functions" -parent: "SQL Reference" ---- -This section contains descriptions of SQL functions that you can use to -analyze nested data: - - * [FLATTEN Function](/confluence/display/DRILL/FLATTEN+Function) - * [KVGEN Function](/confluence/display/DRILL/KVGEN+Function) - * [REPEATED_COUNT Function](/confluence/display/DRILL/REPEATED_COUNT+Function) \ No newline at end of file http://git-wip-us.apache.org/repos/asf/drill/blob/d959a210/_docs/drill-docs/sql-ref/005-cmd-summary.md ---------------------------------------------------------------------- diff --git a/_docs/drill-docs/sql-ref/005-cmd-summary.md b/_docs/drill-docs/sql-ref/005-cmd-summary.md deleted file mode 100644 index bd3c73e..0000000 --- a/_docs/drill-docs/sql-ref/005-cmd-summary.md +++ /dev/null @@ -1,16 +0,0 @@ ---- -title: "SQL Commands Summary" -parent: "SQL Reference" ---- -The following table provides a list of the SQL commands that Drill supports, -with their descriptions and example syntax: - -<div class="table-wrap"><table class="confluenceTable"><tbody><tr><th class="confluenceTh">Command</th><th class="confluenceTh">Description</th><th class="confluenceTh">Syntax</th></tr><tr><td valign="top" 
class="confluenceTd">ALTER SESSION</td><td valign="top" class="confluenceTd">Changes a system setting for the duration of a session. A session ends when you quit the Drill shell. For a list of Drill options and their descriptions, refer to <a href="https://cwiki.apache.org/confluence/display/DRILL/Planning+and+Execution+Options" rel="nofollow">Planning and Execution Options</a>.</td><td valign="top" class="confluenceTd"><code>ALTER SESSION SET `option_name`='string';<br />ALTER SESSION SET `option_name`=TRUE | FALSE;</code></td></tr><tr><td valign="top" class="confluenceTd">ALTER SYSTEM</td><td valign="top" class="confluenceTd">Permanently changes a system setting. The new settings persist across all sessions. For a list of Drill options and their descriptions, refer to <a href="https://cwiki.apache.org/confluence/display/DRILL/Planning+and+Execution+Options" rel="nofollow">Planning and Execution Options</a>.</td><td valign="top" class="confluenceTd"><code>ALTER SYSTEM `option_name`='string'<br />ALTER SYSTEM `option_name`=TRUE | FALSE;</code></td></tr><tr><td valign="top" class="confluenceTd"><p><a href="/confluence/display/DRILL/CREATE+TABLE+AS+%28CTAS%29+Command">CREATE TABLE AS<br />(CTAS)</a></p></td><td valign="top" class="confluenceTd"><p>Creates a new table and populates the new table with rows returned from a SELECT query. Use the CREATE TABLE AS (CTAS) statement in place of INSERT INTO. When you issue the CTAS command, you create a directory that contains parquet or CSV files. 
Each workspace in a file system has a default file type.</p><p>You can specify which writer you want Drill to use when creating a table: parquet, CSV, or JSON (as specified with <span style="line-height: 1.4285715;">the </span><code>store.format</code><span style="line-height: 1.4285715;"> option</span>).</p></td><td valign="top" class="confluenceTd"><code>CREATE TABLE new_table_name AS <query>;</code></td></tr><tr><td valign="top" colspan="1" class="confluenceTd">CREATE VIEW</td><td valign="top" colspan="1" class="confluenceTd"><p>Creates a new view based on the results of a SELECT query.</p></td><td valign="top" colspan="1" class="confluenceTd"><code>CREATE VIEW view_name [(column_list)] AS <query>;</code></td></tr><tr><td valign="top" colspan="1" class="confluenceTd">DROP VIEW</td><td valign="top" colspan="1" class="confluenceTd">Removes one or more views.</td><td valign="top" colspan="1" class="confluenceTd"><code>DROP VIEW view_name [, <em class="replaceable">view_name</em>] ...; </code></td></tr><tr><td valign="top" colspan="1" class="confluenceTd"><a href="https://cwiki.apache.org/confluence/display/DRILL/EXPLAIN+Commands" rel="nofollow">EXPLAIN PLAN FOR</a></td><td valign="top" colspan="1" class="confluenceTd">Returns the physical plan for a particular query.</td><td valign="top" colspan="1" class="confluenceTd"><code>EXPLAIN PLAN FOR <query>;</code></td></tr><tr><td valign="top" colspan="1" class="confluenceTd"><a href="https://cwiki.apache.org/confluence/display/DRILL/EXPLAIN+Commands" rel="nofollow">EXPLAIN PLAN WITHOUT IMPLEMENTATION FOR</a></td><td valign="top" colspan="1" class="confluenceTd">Returns the logical plan for a particular query.</td><td valign="top" colspan="1" class="confluenceTd"><code>EXPLAIN PLAN WITHOUT IMPLEMENTATION FOR <query>;</code></td></tr><tr><td valign="top" colspan="1" class="confluenceTd"><a href="https://cwiki.apache.org/confluence/display/DRILL/SELECT+Statements" 
rel="nofollow">SELECT</a></td><td valign="top" colspan="1" class="confluenceTd">Retrieves data from tables and files.</td><td valign="top" colspan="1" class="confluenceTd"><code>[WITH subquery]<br />SELECT column_list FROM table_name <br />[WHERE clause]<br />[GROUP BY clause]<br />[HAVING clause]<br />[ORDER BY clause];</code></td></tr><tr><td valign="top" colspan="1" class="confluenceTd">SHOW DATABASES</td><td valign="top" colspan="1" class="confluenceTd">Returns a list of available schemas. Equivalent to SHOW SCHEMAS.</td><td valign="top" colspan="1" class="confluenceTd"><code>SHOW DATABASES;</code></td></tr><tr><td valign="top" colspan="1" class="confluenceTd"><a href="https://cwiki.apache.org/confluence/display/DRILL/SHOW+FILES+Command" rel="nofollow">SHOW FILES</a></td><td valign="top" colspan="1" class="confluenceTd">Returns a list of files in a file system schema.</td><td valign="top" colspan="1" class="confluenceTd"><code>SHOW FILES IN filesystem.`schema_name`;<br />SHOW FILES FROM filesystem.`schema_name`;</code></td></tr><tr><td valign="top" colspan="1" class="confluenceTd">SHOW SCHEMAS</td><td valign="top" colspan="1" class="confluenceTd">Returns a list of available schemas. Equivalent to SHOW DATABASES.</td><td valign="top" colspan="1" class="confluenceTd"><code>SHOW SCHEMAS;</code></td></tr><tr><td valign="top" colspan="1" class="confluenceTd">SHOW TABLES</td><td valign="top" colspan="1" class="confluenceTd">Returns a list of tables for all schemas. 
Optionally, you can first issue the <code>USE </code>command to identify the schema for which you want to view tables.<br />For example, the following <code>USE</code> statement tells Drill that you only want information from the <code>hive.default</code> schema:<br /><code>USE hive.`default`;</code></td><td valign="top" colspan="1" class="confluenceTd"><code>SHOW TABLES;</code></td></tr><tr><td valign="top" colspan="1" class="confluenceTd">USE</td><td valign="top" colspan="1" class="confluenceTd">Changes to a particular schema. When you opt to use a particular schema, Drill issues queries on that schema only.</td><td valign="top" colspan="1" class="confluenceTd"><code>USE schema_name;</code></td></tr></tbody></table></div> - -Click on any of the following links for more information about the various -commands: - - * [CREATE TABLE AS (CTAS) Command](/confluence/display/DRILL/CREATE+TABLE+AS+%28CTAS%29+Command) - * [EXPLAIN Commands](/confluence/display/DRILL/EXPLAIN+Commands) - * [SELECT Statements](/confluence/display/DRILL/SELECT+Statements) - * [SHOW FILES Command](/confluence/display/DRILL/SHOW+FILES+Command) \ No newline at end of file http://git-wip-us.apache.org/repos/asf/drill/blob/d959a210/_docs/drill-docs/sql-ref/006-reserved-wds.md ---------------------------------------------------------------------- diff --git a/_docs/drill-docs/sql-ref/006-reserved-wds.md b/_docs/drill-docs/sql-ref/006-reserved-wds.md deleted file mode 100644 index 6df4360..0000000 --- a/_docs/drill-docs/sql-ref/006-reserved-wds.md +++ /dev/null @@ -1,16 +0,0 @@ ---- -title: "Reserved Keywords" -parent: "SQL Reference" ---- -When you use a reserved keyword in a Drill query, enclose the word in -backticks. 
For example, if you issue the following query to Drill, -you must include backticks around the word TABLES because TABLES is a reserved -keyword: - - ``SELECT * FROM INFORMATION_SCHEMA.`TABLES`;`` - -The following table provides the Drill reserved keywords that require backticks: - -<div class="table-wrap"><table class="confluenceTable"><tbody><tr><td valign="top" class="confluenceTd"><h1 id="ReservedKeywords-A">A</h1><p>ABS<br />ALL<br />ALLOCATE<br />ALLOW<br />ALTER<br />AND<br />ANY<br />ARE<br />ARRAY<br />AS<br />ASENSITIVE<br />ASYMMETRIC<br />AT<br />ATOMIC<br />AUTHORIZATION<br />AVG</p><h1 id="ReservedKeywords-B">B</h1><p>BEGIN<br />BETWEEN<br />BIGINT<br />BINARY<br />BIT<br />BLOB<br />BOOLEAN<br />BOTH<br />BY</p><h1 id="ReservedKeywords-C">C</h1><p>CALL<br />CALLED<br />CARDINALITY<br />CASCADED<br />CASE<br />CAST<br />CEIL<br />CEILING<br />CHAR<br />CHARACTER<br />CHARACTER_LENGTH<br />CHAR_LENGTH<br />CHECK<br />CLOB<br />CLOSE<br />COALESCE<br />COLLATE<br />COLLECT<br />COLUMN<br />COMMIT<br />CONDITION<br />CONNECT<br />CONSTRAINT<br />CONVERT<br />CORR<br />CORRESPONDING<br />COUNT<br />COVAR_POP<br />COVAR_SAMP<br />CREATE<br />CROSS<br />CUBE<br />CUME_DIST<br />CURRENT<br />CURRENT_CATALOG<br />CURRENT_DATE<br />CURRENT_DEFAULT_TRANSFORM_GROUP<br />CURRENT_PATH<br />CURRENT_ROLE<br />CURRENT_SCHEMA<br />CURRENT_TIME<br />CURRENT_TIMESTAMP<br />CURRENT_TRANSFORM_GROUP_FOR_TYPE<br />CURRENT_USER<br />CURSOR<br />CYCLE</p></td><td valign="top" class="confluenceTd"><h1 id="ReservedKeywords-D">D</h1><p>DATABASES<br />DATE<br />DAY<br />DEALLOCATE<br />DEC<br />DECIMAL<br />DECLARE<br />DEFAULT<br />DEFAULT_KW<br />DELETE<br />DENSE_RANK<br />DEREF<br />DESCRIBE<br />DETERMINISTIC<br />DISALLOW<br />DISCONNECT<br />DISTINCT<br />DOUBLE<br />DROP<br />DYNAMIC</p><h1 id="ReservedKeywords-E">E</h1><p>EACH<br />ELEMENT<br />ELSE<br />END<br />END_EXEC<br />ESCAPE<br />EVERY<br />EXCEPT<br />EXEC<br />EXECUTE<br />EXISTS<br />EXP<br />EXPLAIN<br 
/>EXTERNAL<br />EXTRACT</p><h1 id="ReservedKeywords-F">F</h1><p>FALSE<br />FETCH<br />FILES<br />FILTER<br />FIRST_VALUE<br />FLOAT<br />FLOOR<br />FOR<br />FOREIGN<br />FREE<br />FROM<br />FULL<br />FUNCTION<br />FUSION</p><h1 id="ReservedKeywords-G">G</h1><p>GET<br />GLOBAL<br />GRANT<br />GROUP<br />GROUPING</p><h1 id="ReservedKeywords-H">H</h1><p>HAVING<br />HOLD<br />HOUR</p></td><td valign="top" class="confluenceTd"><h1 id="ReservedKeywords-I">I</h1><p>IDENTITY<br />IMPORT<br />IN<br />INDICATOR<br />INNER<br />INOUT<br />INSENSITIVE<br />INSERT<br />INT<br />INTEGER<br />INTERSECT<br />INTERSECTION<br />INTERVAL<br />INTO<br />IS</p><h1 id="ReservedKeywords-J">J</h1><p>JOIN</p><h1 id="ReservedKeywords-L">L</h1><p>LANGUAGE<br />LARGE<br />LAST_VALUE<br />LATERAL<br />LEADING<br />LEFT<br />LIKE<br />LIMIT<br />LN<br />LOCAL<br />LOCALTIME<br />LOCALTIMESTAMP<br />LOWER</p><h1 id="ReservedKeywords-M">M</h1><p>MATCH<br />MAX<br />MEMBER<br />MERGE<br />METHOD<br />MIN<br />MINUTE<br />MOD<br />MODIFIES<br />MODULE<br />MONTH<br />MULTISET</p><h1 id="ReservedKeywords-N">N</h1><p>NATIONAL<br />NATURAL<br />NCHAR<br />NCLOB<br />NEW<br />NO<br />NONE<br />NORMALIZE<br />NOT<br />NULL<br />NULLIF<br />NUMERIC</p><h1 id="ReservedKeywords-O">O</h1><p>OCTET_LENGTH<br />OF<br />OFFSET<br />OLD<br />ON<br />ONLY<br />OPEN<br />OR<br />ORDER<br />OUT<br />OUTER<br />OVER<br />OVERLAPS<br />OVERLAY</p></td><td valign="top" colspan="1" class="confluenceTd"><h1 id="ReservedKeywords-P">P</h1><p>PARAMETER<br />PARTITION<br />PERCENTILE_CONT<br />PERCENTILE_DISC<br />PERCENT_RANK<br />POSITION<br />POWER<br />PRECISION<br />PREPARE<br />PRIMARY<br />PROCEDURE</p><h1 id="ReservedKeywords-R">R</h1><p>RANGE<br />RANK<br />READS<br />REAL<br />RECURSIVE<br />REF<br />REFERENCES<br />REFERENCING<br />REGR_AVGX<br />REGR_AVGY<br />REGR_COUNT<br />REGR_INTERCEPT<br />REGR_R2<br />REGR_SLOPE<br />REGR_SXX<br />REGR_SXY<br />RELEASE<br />REPLACE<br />RESULT<br />RETURN<br 
/>RETURNS<br />REVOKE<br />RIGHT<br />ROLLBACK<br />ROLLUP<br />ROW<br />ROWS<br />ROW_NUMBER</p><h1 id="ReservedKeywords-S">S</h1><p>SAVEPOINT<br />SCHEMAS<br />SCOPE<br />SCROLL<br />SEARCH<br />SECOND<br />SELECT<br />SENSITIVE<br />SESSION_USER<br />SET<br />SHOW<br />SIMILAR<br />SMALLINT<br />SOME<br />SPECIFIC<br />SPECIFICTYPE<br />SQL<br />SQLEXCEPTION<br />SQLSTATE<br />SQLWARNING<br />SQRT<br />START<br />STATIC<br />STDDEV_POP<br />STDDEV_SAMP<br />SUBMULTISET<br />SUBSTRING<br />SUM<br />SYMMETRIC<br />SYSTEM<br />SYSTEM_USER</p></td><td valign="top" colspan="1" class="confluenceTd"><h1 id="ReservedKeywords-T">T</h1><p>TABLE<br />TABLES<br />TABLESAMPLE<br />THEN<br />TIME<br />TIMESTAMP<br />TIMEZONE_HOUR<br />TIMEZONE_MINUTE<br />TINYINT<br />TO<br />TRAILING<br />TRANSLATE<br />TRANSLATION<br />TREAT<br />TRIGGER<br />TRIM<br />TRUE</p><h1 id="ReservedKeywords-U">U</h1><p>UESCAPE<br />UNION<br />UNIQUE<br />UNKNOWN<br />UNNEST<br />UPDATE<br />UPPER<br />USE<br />USER<br />USING</p><h1 id="ReservedKeywords-V">V</h1><p>VALUE<br />VALUES<br />VARBINARY<br />VARCHAR<br />VARYING<br />VAR_POP<br />VAR_SAMP</p><h1 id="ReservedKeywords-W">W</h1><p>WHEN<br />WHENEVER<br />WHERE<br />WIDTH_BUCKET<br />WINDOW<br />WITH<br />WITHIN<br />WITHOUT</p><h1 id="ReservedKeywords-Y">Y</h1><p>YEAR</p></td></tr></tbody></table></div> - http://git-wip-us.apache.org/repos/asf/drill/blob/d959a210/_docs/drill-docs/sql-ref/cmd-summary/001-create-table-as.md ---------------------------------------------------------------------- diff --git a/_docs/drill-docs/sql-ref/cmd-summary/001-create-table-as.md b/_docs/drill-docs/sql-ref/cmd-summary/001-create-table-as.md deleted file mode 100644 index 573c295..0000000 --- a/_docs/drill-docs/sql-ref/cmd-summary/001-create-table-as.md +++ /dev/null @@ -1,134 +0,0 @@ ---- -title: "CREATE TABLE AS (CTAS) command" -parent: "SQL Commands Summary" ---- -You can create tables in Drill by using the CTAS command: - - CREATE TABLE 
new_table_name AS <query>; - -where query is any valid Drill query. Each table you create must have a unique -name. You can include an optional column list for the new table. For example: - - create table logtable(transid, prodid) as select transaction_id, product_id from ... - -You can store table data in one of three formats: - - * csv - * parquet - * json - -The parquet and json formats can be used to store complex data. - -To set the output format for a Drill table, set the `store.format` option with -the ALTER SYSTEM or ALTER SESSION command. For example: - - alter session set `store.format`='json'; - -Table data is stored in the location specified by the workspace that is in use -when you run the CTAS statement. By default, a directory is created, using the -exact table name specified in the CTAS statement. A .json, .csv, or .parquet -file inside that directory contains the data. - -You can only create new tables in workspaces. You cannot create tables in -other storage plugins such as Hive and HBase. - -You must use a writable (mutable) workspace when creating Drill tables. For -example: - - "tmp": { - "location": "/tmp", - "writable": true - } - -## Example - -The following query returns one row from a JSON file: - - 0: jdbc:drill:zk=local> select id, type, name, ppu - from dfs.`/Users/brumsby/drill/donuts.json`; - +------------+------------+------------+------------+ - | id | type | name | ppu | - +------------+------------+------------+------------+ - | 0001 | donut | Cake | 0.55 | - +------------+------------+------------+------------+ - 1 row selected (0.248 seconds) - -To create and verify the contents of a table that contains this row: - - 1. Set the workspace to a writable workspace. - 2. Set the `store.format` option appropriately. - 3. Run a CTAS statement that contains the query. - 4. Go to the directory where the table is stored and check the contents of the file. - 5. Run a query against the new table. 
- -The following sqlline output captures this sequence of steps. - -### Workspace Definition - - "tmp": { - "location": "/tmp", - "writable": true - } - -### ALTER SESSION Command - - alter session set `store.format`='json'; - -### USE Command - - 0: jdbc:drill:zk=local> use dfs.tmp; - +------------+------------+ - | ok | summary | - +------------+------------+ - | true | Default schema changed to 'dfs.tmp' | - +------------+------------+ - 1 row selected (0.03 seconds) - -### CTAS Command - - 0: jdbc:drill:zk=local> create table donuts_json as - select id, type, name, ppu from dfs.`/Users/brumsby/drill/donuts.json`; - +------------+---------------------------+ - | Fragment | Number of records written | - +------------+---------------------------+ - | 0_0 | 1 | - +------------+---------------------------+ - 1 row selected (0.107 seconds) - -### File Contents - - administorsmbp7:tmp brumsby$ pwd - /tmp - administorsmbp7:tmp brumsby$ cd donuts_json - administorsmbp7:donuts_json brumsby$ more 0_0_0.json - { - "id" : "0001", - "type" : "donut", - "name" : "Cake", - "ppu" : 0.55 - } - -### Query Against New Table - - 0: jdbc:drill:zk=local> select * from donuts_json; - +------------+------------+------------+------------+ - | id | type | name | ppu | - +------------+------------+------------+------------+ - | 0001 | donut | Cake | 0.55 | - +------------+------------+------------+------------+ - 1 row selected (0.053 seconds) - -### Use a Different Output Format - -You can run the same sequence again with a different storage format set for -the system or session (csv or parquet). 
For example, if the format is set to -csv, and you name the table donuts_csv, the resulting file would look like -this: - - administorsmbp7:tmp brumsby$ cd donuts_csv - administorsmbp7:donuts_csv brumsby$ ls - 0_0_0.csv - administorsmbp7:donuts_csv brumsby$ more 0_0_0.csv - id,type,name,ppu - 0001,donut,Cake,0.55 - http://git-wip-us.apache.org/repos/asf/drill/blob/d959a210/_docs/drill-docs/sql-ref/cmd-summary/002-explain.md ---------------------------------------------------------------------- diff --git a/_docs/drill-docs/sql-ref/cmd-summary/002-explain.md b/_docs/drill-docs/sql-ref/cmd-summary/002-explain.md deleted file mode 100644 index 4ca0870..0000000 --- a/_docs/drill-docs/sql-ref/cmd-summary/002-explain.md +++ /dev/null @@ -1,166 +0,0 @@ ---- -title: "EXPLAIN commands" -parent: "SQL Commands Summary" ---- -EXPLAIN is a useful tool for examining the steps that a query goes through -when it is executed. You can use the EXPLAIN output to gain a deeper -understanding of the parallel processing that Drill queries exploit. You can -also look at costing information, troubleshoot performance issues, and -diagnose routine errors that may occur when you run queries. - -Drill provides two variations on the EXPLAIN command, one that returns the -physical plan and one that returns the logical plan. A logical plan takes the -SQL query (as written by the user and accepted by the parser) and translates -it into a logical series of operations that correspond to SQL language -constructs (without defining the specific algorithms that will be implemented -to run the query). A physical plan translates the logical plan into a specific -series of steps that will be used when the query runs. For example, a logical -plan may indicate a join step in general and classify it as inner or outer, -but the corresponding physical plan will indicate the specific type of join -operator that will run, such as a merge join or a hash join. 
The physical plan -is operational and reveals the specific _access methods_ that will be used for -the query. - -An EXPLAIN command for a query that is run repeatedly under the exact same -conditions against the same data will return the same plan. However, if you -change a configuration option, for example, or update the tables or files that -you are selecting from, you are likely to see plan changes. - -# EXPLAIN Syntax - -The EXPLAIN command supports the following syntax: - - explain plan [ including all attributes ] [ with implementation | without implementation ] for <query> ; - -where `query` is any valid SELECT statement supported by Drill. - -#### INCLUDING ALL ATTRIBUTES - -This option returns costing information. You can use this option for both -physical and logical plans. - -#### WITH IMPLEMENTATION | WITHOUT IMPLEMENTATION - -These options return the physical and logical plan information, respectively. -The default is physical (WITH IMPLEMENTATION). - -# EXPLAIN for Physical Plans - -The EXPLAIN PLAN FOR <query> command returns the chosen physical execution -plan for a query statement without running the query. You can use this command -to see what kind of execution operators Drill implements. For example, you can -find out what kind of join algorithm is chosen when tables or files are -joined. You can also use this command to analyze errors and troubleshoot -queries that do not run. For example, if you run into a casting error, the -query plan text may help you isolate the problem. - -Use the following syntax: - - explain plan for <query> ; - explain plan with implementation for <query> ; - -The following set command increases the default text display (number of -characters). By default, most of the plan output is not displayed. - - 0: jdbc:drill:zk=local> !set maxwidth 10000 - -Do not use a semicolon to terminate set commands. 
- -For example, here is the top portion of the explain output for a -COUNT(DISTINCT) query on a JSON file: - - 0: jdbc:drill:zk=local> !set maxwidth 10000 - - 0: jdbc:drill:zk=local> explain plan for - select type t, count(distinct id) - from dfs.`/Users/brumsby/drill/donuts.json` - where type='donut' group by type; - - +------------+------------+ - | text | json | - +------------+------------+ - | 00-00 Screen - 00-01 Project(t=[$0], EXPR$1=[$1]) - 00-02 Project(t=[$0], EXPR$1=[$1]) - 00-03 HashAgg(group=[{0}], EXPR$1=[COUNT($1)]) - 00-04 HashAgg(group=[{0, 1}]) - 00-05 SelectionVectorRemover - 00-06 Filter(condition=[=(CAST($0):CHAR(5) CHARACTER SET "ISO-8859-1" COLLATE "ISO-8859-1$en_US$primary", 'donut')]) - 00-07 Project(type=[$1], id=[$2]) - 00-08 ProducerConsumer - 00-09 Scan(groupscan=[EasyGroupScan [selectionRoot=/Users/brumsby/drill/donuts.json, columns = null]]) - ... - -Read the text output from bottom to top to understand the sequence of -operators that will execute the query. Note that the physical plan starts with -a scan of the JSON file that is being queried. The selected columns are -projected and filtered, then the aggregate function is applied. - -The EXPLAIN text output is followed by detailed JSON output, which is reusable -for submitting the query via Drill APIs. - - | { - "head" : { - "version" : 1, - "generator" : { - "type" : "ExplainHandler", - "info" : "" - }, - "type" : "APACHE_DRILL_PHYSICAL", - "options" : [ ], - "queue" : 0, - "resultMode" : "EXEC" - }, - .... - -# Costing Information - -Add the INCLUDING ALL ATTRIBUTES option to the EXPLAIN command to see cost -estimates for the query plan. 
For example:

    0: jdbc:drill:zk=local> !set maxwidth 10000
    0: jdbc:drill:zk=local> explain plan including all attributes for
    select * from dfs.`/Users/brumsby/drill/donuts.json` where type='donut';

    +------------+------------+
    |    text    |    json    |
    +------------+------------+
    | 00-00 Screen: rowcount = 1.0, cumulative cost = {4.1 rows, 14.1 cpu, 0.0 io, 0.0 network}, id = 3110
    00-01 Project(*=[$0], type=[$1]): rowcount = 1.0, cumulative cost = {4.0 rows, 14.0 cpu, 0.0 io, 0.0 network}, id = 3109
    00-02 SelectionVectorRemover: rowcount = 1.0, cumulative cost = {3.0 rows, 6.0 cpu, 0.0 io, 0.0 network}, id = 3108
    00-03 Filter(condition=[=(CAST($1):CHAR(5) CHARACTER SET "ISO-8859-1" COLLATE "ISO-8859-1$en_US$primary", 'donut')]): rowcount = 1.0, cumulative cost = {2.0 rows, 5.0 cpu, 0.0 io, 0.0 network}, id = 3107
    00-04 ProducerConsumer: rowcount = 1.0, cumulative cost = {1.0 rows, 1.0 cpu, 0.0 io, 0.0 network}, id = 3106
    00-05 Scan(groupscan=[EasyGroupScan [selectionRoot=/Users/brumsby/drill/donuts.json, columns = null]]): rowcount = 1.0, cumulative cost = {0.0 rows, 0.0 cpu, 0.0 io, 0.0 network}, id = 3101

# EXPLAIN for Logical Plans

To return the logical plan for a query (again, without actually running the
query), use the EXPLAIN PLAN WITHOUT IMPLEMENTATION syntax:

    explain plan without implementation for <query> ;

For example:

    0: jdbc:drill:zk=local> explain plan without implementation for
    select a.id
    from dfs.`/Users/brumsby/drill/donuts.json` a, dfs.`/Users/brumsby/drill/moredonuts.json` b
    where a.id=b.id;

    +------------+------------+
    |    text    |    json    |
    +------------+------------+
    | DrillScreenRel
    DrillProjectRel(id=[$1])
    DrillJoinRel(condition=[=($1, $3)], joinType=[inner])
    DrillScanRel(table=[[dfs, /Users/brumsby/drill/donuts.json]], groupscan=[EasyGroupScan [selectionRoot=/Users/brumsby/drill/donuts.json, columns = null]])
    DrillScanRel(table=[[dfs, /Users/brumsby/drill/moredonuts.json]],
    groupscan=[EasyGroupScan [selectionRoot=/Users/brumsby/drill/moredonuts.json, columns = null]])

    | {
      "head" : {
        "version" : 1,
        "generator" : {
          "type" : "org.apache.drill.exec.planner.logical.DrillImplementor",
          "info" : ""
        },
    ...

http://git-wip-us.apache.org/repos/asf/drill/blob/d959a210/_docs/drill-docs/sql-ref/cmd-summary/003-select.md
----------------------------------------------------------------------
diff --git a/_docs/drill-docs/sql-ref/cmd-summary/003-select.md b/_docs/drill-docs/sql-ref/cmd-summary/003-select.md
deleted file mode 100644
index 57fc0cf..0000000
--- a/_docs/drill-docs/sql-ref/cmd-summary/003-select.md
+++ /dev/null
@@ -1,85 +0,0 @@
---
title: "SELECT Statements"
parent: "SQL Commands Summary"
---
Drill supports the following ANSI standard clauses in the SELECT statement:

  * WITH clause
  * SELECT list
  * FROM clause
  * WHERE clause
  * GROUP BY clause
  * HAVING clause
  * ORDER BY clause (with an optional LIMIT clause)

You can use the same SELECT syntax in the following commands:

  * CREATE TABLE AS (CTAS)
  * CREATE VIEW

INSERT INTO SELECT is not yet supported.

# Column Aliases

You can use named column aliases in the SELECT list to provide meaningful
names for regular columns and computed columns, such as the results of
aggregate functions. See the section on running queries for examples.

You cannot reference column aliases in the following clauses:

  * WHERE
  * GROUP BY
  * HAVING

Because Drill works with schema-less data sources, you cannot use positional
aliases (1, 2, etc.) to refer to SELECT list columns, except in the ORDER BY
clause.

# UNION ALL Set Operator

Drill supports the UNION ALL set operator to combine two result sets. The
distinct UNION operator is not yet supported.

The EXCEPT, EXCEPT ALL, INTERSECT, and INTERSECT ALL operators are not yet
supported.
# Joins

Drill supports ANSI standard joins in the FROM and WHERE clauses:

  * Inner joins
  * Left, full, and right outer joins

The following types of join syntax are supported:

Join type | Syntax
---|---
Join condition in WHERE clause | FROM table1, table2 WHERE table1.col1=table2.col1
USING join in FROM clause | FROM table1 JOIN table2 USING(col1, ...)
ON join in FROM clause | FROM table1 JOIN table2 ON table1.col1=table2.col1
NATURAL JOIN in FROM clause | FROM table1 NATURAL JOIN table2

Cross-joins are not yet supported. You must specify a join condition when
more than one table is listed in the FROM clause.

Non-equijoins are supported if the join also contains an equality condition
on the same two tables as part of a conjunction:

    table1.col1 = table2.col1 AND table1.c2 < table2.c2

This restriction applies to both inner and outer joins.

# Subqueries

You can use the following subquery operators in Drill queries. These
operators all return Boolean results.

  * ALL
  * ANY
  * EXISTS
  * IN
  * SOME

In general, correlated subqueries are supported. EXISTS and NOT EXISTS
subqueries that do not contain a correlation join are not yet supported.

http://git-wip-us.apache.org/repos/asf/drill/blob/d959a210/_docs/drill-docs/sql-ref/cmd-summary/004-show-files.md
----------------------------------------------------------------------
diff --git a/_docs/drill-docs/sql-ref/cmd-summary/004-show-files.md b/_docs/drill-docs/sql-ref/cmd-summary/004-show-files.md
deleted file mode 100644
index d0f5a83..0000000
--- a/_docs/drill-docs/sql-ref/cmd-summary/004-show-files.md
+++ /dev/null
@@ -1,65 +0,0 @@
---
title: "SHOW FILES Command"
parent: "SQL Commands Summary"
---
The SHOW FILES command provides a quick report of the file systems that are
visible to Drill for query purposes. This command is unique to Apache Drill.

# Syntax

The SHOW FILES command supports the following syntax.
    SHOW FILES [ FROM filesystem.directory_name | IN filesystem.directory_name ];

The FROM or IN clause is required if you do not specify a default file system
first. You can do this with the USE command. FROM and IN are synonyms.

The directory name is optional. (If the directory name is a Drill reserved
word, you must use backticks around the name.)

The command returns standard Linux `stat` information for each file or
directory, such as permissions, owner, and group values. This information is
not specific to Drill.

# Examples

The following example returns information about directories and files in the
local (`dfs`) file system.

    0: jdbc:drill:> use dfs;

    +------------+------------+
    |     ok     |  summary   |
    +------------+------------+
    | true       | Default schema changed to 'dfs' |
    +------------+------------+
    1 row selected (0.318 seconds)

    0: jdbc:drill:> show files;
    +------------+-------------+------------+------------+------------+------------+-------------+------------+------------------+
    |    name    | isDirectory |   isFile   |   length   |   owner    |   group    | permissions | accessTime | modificationTime |
    +------------+-------------+------------+------------+------------+------------+-------------+------------+------------------+
    | user       | true        | false      | 1          | mapr       | mapr       | rwxr-xr-x   | 2014-07-30 21:37:06.0 | 2014-07-31 22:15:53.193 |
    | backup.tgz | false       | true       | 36272      | root       | root       | rw-r--r--   | 2014-07-31 22:09:13.0 | 2014-07-31 22:09:13.211 |
    | JSON       | true        | false      | 1          | root       | root       | rwxr-xr-x   | 2014-07-31 15:22:42.0 | 2014-08-04 15:43:07.083 |
    | scripts    | true        | false      | 3          | root       | root       | rwxr-xr-x   | 2014-07-31 22:10:51.0 | 2014-08-04 18:23:09.236 |
    | temp       | true        | false      | 2          | root       | root       | rwxr-xr-x   | 2014-08-01 20:07:37.0 | 2014-08-01 20:09:42.595 |
    | hbase      | true        | false      | 10         | mapr       | mapr       | rwxr-xr-x   | 2014-07-30 21:36:08.0 | 2014-08-04 18:31:13.778 |
    | tables     | true        | false      | 0          | root       | root       | rwxrwxrwx   | 2014-07-31 22:14:35.0 | 2014-08-04 15:42:43.415 |
    | CSV        | true        | false      | 4          | root       | root       | rwxrwxrwx   | 2014-07-31 17:34:53.0 | 2014-08-04
    ...

The following example shows the files in a specific directory in the `dfs`
file system:

    0: jdbc:drill:> show files in dfs.CSV;

    +------------+-------------+------------+------------+------------+------------+-------------+------------+------------------+
    |    name    | isDirectory |   isFile   |   length   |   owner    |   group    | permissions | accessTime | modificationTime |
    +------------+-------------+------------+------------+------------+------------+-------------+------------+------------------+
    | customers.csv | false    | true       | 62011      | root       | root       | rw-r--r--   | 2014-08-04 18:30:39.0 | 2014-08-04 18:30:39.314 |
    | products.csv.small | false | true     | 34972      | root       | root       | rw-r--r--   | 2014-07-31 23:58:42.0 | 2014-07-31 23:59:16.849 |
    | products.csv | false     | true       | 34972      | root       | root       | rw-r--r--   | 2014-08-01 06:39:34.0 | 2014-08-04 15:58:09.325 |
    | products.csv.bad | false | true       | 62307      | root       | root       | rw-r--r--   | 2014-08-04 15:58:02.0 | 2014-08-04 15:58:02.612 |
    +------------+-------------+------------+------------+------------+------------+-------------+------------+------------------+
    4 rows selected (0.165 seconds)

http://git-wip-us.apache.org/repos/asf/drill/blob/d959a210/_docs/drill-docs/sql-ref/data-types/001-date.md
----------------------------------------------------------------------
diff --git a/_docs/drill-docs/sql-ref/data-types/001-date.md b/_docs/drill-docs/sql-ref/data-types/001-date.md
deleted file mode 100644
index 8bf4230..0000000
--- a/_docs/drill-docs/sql-ref/data-types/001-date.md
+++ /dev/null
@@ -1,148 +0,0 @@
---
title: "Supported Date/Time Data Type Formats"
parent: "Data Types"
---
You must use supported `date` and `time` formats when you `SELECT` date and
time literals or when you `CAST()` from `VARCHAR` to `date` and `time` data
types.
Apache Drill currently supports specific formats for the following `date`
and `time` data types:

  * Date
  * Timestamp
  * Time
  * Interval
  * Interval Year
  * Interval Day
  * Literal

The following query provides an example of how to `SELECT` a few of the
supported date and time formats as literals:

    select date '2008-2-23', timestamp '2008-1-23 14:24:23', time '10:20:30' from dfs.`/tmp/input.json`;

The following query provides an example where `VARCHAR` data in a file is
`CAST()` to supported `date` and `time` formats:

    select cast(col_A as date), cast(col_B as timestamp), cast(col_C as time) from dfs.`/tmp/dates.json`;

`Date`, `timestamp`, and `time` data types store values in UTC. Currently,
Apache Drill does not support `timestamp` with time zone.

## Date

Drill supports the `date` data type in the following format:

    YYYY-MM-DD (year-month-date)

The following table provides some examples for the `date` data type:

| Use | Example |
| --- | ------- |
| Literal | `` select date '2008-2-23' from dfs.`/tmp/input.json`; `` |
| `JSON` input | `{"date_col" : "2008-2-23"}` |
| `CAST` from `VARCHAR` | `` select CAST(date_col as date) as CAST_DATE from dfs.`/tmp/input.json`; `` |

## Timestamp

Drill supports the `timestamp` data type in the following format:

    yyyy-MM-dd HH:mm:ss.SSS (year-month-date hour:minute:sec.milliseconds)

The following table provides some examples for the `timestamp` data type:

<table>
  <tbody>
    <tr>
      <th>Use</th>
      <th>Example</th>
    </tr>
    <tr>
      <td valign="top">Literal</td>
      <td valign="top"><code>select timestamp '2008-2-23 10:20:30.345', timestamp '2008-2-23 10:20:30' from dfs.`/tmp/input.json`;</code></td>
    </tr>
    <tr>
      <td valign="top"><code>JSON</code> Input</td>
      <td valign="top"><code>{"timestamp_col": "2008-2-23 15:20:30.345"}<br />{"timestamp_col": "2008-2-23 10:20:30"}</code> The fractional millisecond component is optional.</td>
    </tr>
    <tr>
      <td valign="top"><code>CAST</code> from <code>VARCHAR</code></td>
      <td valign="top"><code>select cast(timestamp_col as timestamp) from dfs.`/tmp/input.json`;</code></td>
    </tr>
  </tbody>
</table>

## Time

Drill supports the `time` data type in the following format:

    HH:mm:ss.SSS (hour:minute:sec.milliseconds)

The following table provides some examples for the `time` data type:

<table>
  <tbody>
    <tr>
      <th>Use</th>
      <th>Example</th>
    </tr>
    <tr>
      <td valign="top">Literal</td>
      <td valign="top"><code>select time '15:20:30', time '10:20:30.123' from dfs.`/tmp/input.json`;</code></td>
    </tr>
    <tr>
      <td valign="top"><code>JSON</code> Input</td>
      <td valign="top"><code>{"time_col" : "10:20:30.999"}<br />{"time_col": "10:20:30"}</code></td>
    </tr>
    <tr>
      <td valign="top"><code>CAST</code> from <code>VARCHAR</code></td>
      <td valign="top"><code>select cast(time_col as time) from dfs.`/tmp/input.json`;</code></td>
    </tr>
  </tbody>
</table>

## Interval

Drill supports the `interval year` and `interval day` data types.

### Interval Year

The `interval year` data type stores time duration in years and months.
Drill supports the `interval year` data type in the following format:

    P [qty] Y [qty] M

The following table provides examples for the `interval year` data type:

<table>
  <tbody>
    <tr>
      <th>Use</th>
      <th>Example</th>
    </tr>
    <tr>
      <td valign="top">Literals</td>
      <td valign="top"><code>select interval '1-2' year to month from dfs.`/tmp/input.json`;<br />select interval '1' year from dfs.`/tmp/input.json`;<br />select interval '13' month from dfs.`/tmp/input.json`;</code></td>
    </tr>
    <tr>
      <td valign="top"><code>JSON</code> Input</td>
      <td valign="top"><code>{"col" : "P1Y2M"}<br />{"col" : "P-1Y2M"}<br />{"col" : "P-1Y-2M"}<br />{"col": "P10M"}<br />{"col": "P5Y"}</code></td>
    </tr>
    <tr>
      <td valign="top"><code>CAST</code> from <code>VARCHAR</code></td>
      <td valign="top"><code>select cast(col as interval year) from dfs.`/tmp/input.json`;</code></td>
    </tr>
  </tbody>
</table>

### Interval Day

The `interval day` data type stores time duration in days, hours, minutes,
and seconds. You do not need to specify all fields in a given interval.
Drill supports the `interval day` data type in the following format:

    P [qty] D T [qty] H [qty] M [qty] S

The following table provides examples for the `interval day` data type:

<table>
  <tbody>
    <tr>
      <th>Use</th>
      <th>Example</th>
    </tr>
    <tr>
      <td valign="top">Literal</td>
      <td valign="top"><code>select interval '1 10:20:30.123' day to second from dfs.`/tmp/input.json`;<br />select interval '1 10' day to hour from dfs.`/tmp/input.json`;<br />select interval '10' day from dfs.`/tmp/input.json`;<br />select interval '10' hour from dfs.`/tmp/input.json`;<br />select interval '10.999' second from dfs.`/tmp/input.json`;</code></td>
    </tr>
    <tr>
      <td valign="top"><code>JSON</code> Input</td>
      <td valign="top"><code>{"col" : "P1DT10H20M30S"}<br />{"col" : "P1DT10H20M30.123S"}<br />{"col" : "P1D"}<br />{"col" : "PT10H"}<br />{"col" : "PT10.10S"}<br />{"col" : "PT20S"}<br />{"col" : "PT10H10S"}</code></td>
    </tr>
    <tr>
      <td valign="top"><code>CAST</code> from <code>VARCHAR</code></td>
      <td valign="top"><code>select cast(col as interval day) from dfs.`/tmp/input.json`;</code></td>
    </tr>
  </tbody>
</table>

## Literal

The following table provides a list of `date/time` literals that Drill
supports, with examples of each:

<table>
  <tbody>
    <tr>
      <th>Format</th>
      <th>Interpretation</th>
      <th>Example</th>
    </tr>
    <tr>
      <td valign="top"><code>interval '1 10:20:30.123' day to second</code></td>
      <td valign="top">1 day, 10 hours, 20 minutes, 30 seconds, and 123 thousandths of a second</td>
      <td valign="top"><code>select interval '1 10:20:30.123' day to second from dfs.`/tmp/input.json`;</code></td>
    </tr>
    <tr>
      <td valign="top"><code>interval '1 10' day to hour</code></td>
      <td valign="top">1 day, 10 hours</td>
      <td valign="top"><code>select interval '1 10' day to hour from dfs.`/tmp/input.json`;</code></td>
    </tr>
    <tr>
      <td valign="top"><code>interval '10' day</code></td>
      <td valign="top">10 days</td>
      <td valign="top"><code>select interval '10' day from dfs.`/tmp/input.json`;</code></td>
    </tr>
    <tr>
      <td valign="top"><code>interval '10' hour</code></td>
      <td valign="top">10 hours</td>
      <td valign="top"><code>select interval '10' hour from dfs.`/tmp/input.json`;</code></td>
    </tr>
    <tr>
      <td valign="top"><code>interval '10.999' second</code></td>
      <td valign="top">10.999 seconds</td>
      <td valign="top"><code>select interval '10.999' second from dfs.`/tmp/input.json`;</code></td>
    </tr>
  </tbody>
</table>
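The JSON interval inputs shown above follow the ISO-8601-style duration syntax (`P [qty] Y [qty] M` and `P [qty] D T [qty] H [qty] M [qty] S`). As a sketch of how such strings decompose into fields, here is a small Python parser; it is illustrative only, not Drill code, and the helper name `parse_duration` is invented for this example:

```python
import re

# Matches duration strings like "P1Y2M", "P-1Y-2M", "P1DT10H20M30.123S".
# Each field is optional, mirroring the note above that you do not need to
# specify all fields in a given interval.
DURATION = re.compile(
    r"^P(?:(-?\d+)Y)?(?:(-?\d+)M)?(?:(-?\d+)D)?"
    r"(?:T(?:(-?\d+)H)?(?:(-?\d+)M)?(?:(-?\d+(?:\.\d+)?)S)?)?$")


def parse_duration(s):
    """Split a duration string into its component fields (illustrative)."""
    m = DURATION.match(s)
    if not m:
        raise ValueError("not a supported duration: %r" % s)
    years, months, days, hours, minutes, seconds = m.groups()
    return {
        "years": int(years or 0), "months": int(months or 0),
        "days": int(days or 0), "hours": int(hours or 0),
        "minutes": int(minutes or 0), "seconds": float(seconds or 0),
    }


print(parse_duration("P1Y2M"))             # years=1, months=2, rest zero
print(parse_duration("P1DT10H20M30.123S"))  # days=1, hours=10, minutes=20, seconds=30.123
```

Note that the `M` designator means months before the `T` separator and minutes after it, which is why the two uses map to different fields above.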