This is an automated email from the ASF dual-hosted git repository.
bridgetb pushed a commit to branch gh-pages
in repository https://gitbox.apache.org/repos/asf/drill.git
The following commit(s) were added to refs/heads/gh-pages by this push:
new a084e87 Doc for DRILL-7110, DRILL-7038
a084e87 is described below
commit a084e879f6a3e56c8e9aa027f85eff9afd5552e6
Author: Bridget Bevens <[email protected]>
AuthorDate: Mon Apr 15 19:19:20 2019 -0700
Doc for DRILL-7110, DRILL-7038
---
.../040-querying-directories.md | 56 ++++++++++++++++++----
_docs/sql-reference/sql-commands/010-set.md | 9 +++-
2 files changed, 55 insertions(+), 10 deletions(-)
diff --git a/_docs/query-data/query-a-file-system/040-querying-directories.md b/_docs/query-data/query-a-file-system/040-querying-directories.md
index c13b083..8cf3e05 100644
--- a/_docs/query-data/query-a-file-system/040-querying-directories.md
+++ b/_docs/query-data/query-a-file-system/040-querying-directories.md
@@ -1,6 +1,6 @@
---
title: "Querying Directories"
-date: 2019-04-05
+date: 2019-04-16
parent: "Querying a File System"
---
You can store multiple files in a directory and query them as if they were a
@@ -60,7 +60,7 @@ records in all of the files inside the `2013` directory:
## Querying Partitioned Directories
You can use special variables in Drill to refer to subdirectories in your
-workspace path:
+workspace path, for example:
* dir0
* dir1
@@ -70,14 +70,12 @@ Note that these variables are dynamically determined based
on the partitioning
of the file system. No up-front definitions are required to identify the
partitions
that exist.
-The following image provides a visual example of a partitioned directory and a query
+The following image represents a partitioned directory and shows a query
on the directory using variables:

-When you use directory variables in a query, note that the variables are relative to the root directory used in the FROM clause.
-
-For example, let's say you create a workspace within the dfs storage plugin named logs (dfs.logs) that points
+When you use directory variables in a query, note that the variables are relative to the root directory used in the FROM clause. For example, let's say you create a workspace within the dfs storage plugin named logs (dfs.logs) that points
to the /tmp directory in the file system. The /tmp directory contains a /logs
directory (/tmp/logs)
with the same subdirectories shown in the example image above. You can query
the data in the /logs directory using variables, as shown in the following
examples:
@@ -108,9 +106,49 @@ with the same subdirectories shown in the example image
above. You can query the
| 1 | \x00*\xE9l\xF2\x19\x00\x00N\x7F%\x00 | 1 | Amanda |
Jordan | [email protected] | Female | 1.197.201.2 |
6759521864920116 | Indonesia | 3/8/1971 | 49756.53 | Internal Auditor |
1E+02 |
| 1 | \x00^0\xD0\xE17\x00\x00N\x7F%\x00 | 2 | Albert |
Freeman | [email protected] | Male | 218.111.175.34 |
| Canada | 1/16/1968 | 150280.17 | Accountant IV | |
| 1 | \x00.\xF9"\xCB\x03\x00\x00N\x7F%\x00 | 3 | Evelyn |
Morgan | [email protected] | Female | 7.161.136.94 |
6767119071901597 | Russia | 2/1/1960 | 144972.51 | Structural Engineer |
|
-
+------+--------------------------------------+----+------------+-----------+-------------------------+--------+----------------+------------------+-----------+-----------+-----------+---------------------+----------+
-
-
+
+------+--------------------------------------+----+------------+-----------+-------------------------+--------+----------------+------------------+-----------+-----------+-----------+---------------------+----------+
+
+Starting in Drill 1.16, Drill uses a Values operator instead of a Scan operator
to read data when a query selects only partition columns (dir0, dir1, …dirN)
and also has a DISTINCT or GROUP BY operation. Instead of scanning all
directory columns, Drill either reads the specified column from the metadata
cache file (if one exists) or Drill selects directly from the directory
(partition location). The presence of the Values operator (instead of the Scan
operator) in the query plan indi [...]
+
+ select distinct dir0 from `/logs`;
+ ------
+ dir0
+ ------
+ 2012
+ 2013
+ 2014
+ ------
+
+ explain plan for select distinct dir0 from `/logs`;
+
------------------------------------------------------------------------------------------------------------------------------------------------------------------+
+ text json
+
------------------------------------------------------------------------------------------------------------------------------------------------------------------+
+ 00-00 Screen
+ 00-01 Project(dir0=[$0])
+ 00-02 StreamAgg(group=[{0}])
+ 00-03 Sort(sort0=[$0], dir0=[ASC])
+ 00-04 Values(tuples=[[{ '2012' }, { '2012' }, { '2013' }, { '2012' }, { '2014' }, { '2012' }]])
+
+ select dir0 from `/logs` group by dir0;
+ ------
+ | dir0 |
+ ------
+ | 2012 |
+ | 2013 |
+ | 2014 |
+ ------
+
+ explain plan for select dir0 from `/logs` group by dir0;
+
+
------------------------------------------------------------------------------------------------------------------------------------------------------------------+
+ | text | json |
+
------------------------------------------------------------------------------------------------------------------------------------------------------------------+
+ | 00-00 Screen
+ 00-01 Project(dir0=[$0])
+ 00-02 StreamAgg(group=[{0}])
+ 00-03 Sort(sort0=[$0], dir0=[ASC])
+ 00-04 Values(tuples=[[{ '2012' }, { '2012' }, { '2013' }, { '2012' }, { '2014' }, { '2012' }]])
+
You can use [query directory
functions]({{site.baseurl}}/docs/query-directory-functions/) to restrict a
query to one of a number of subdirectories and to prevent Drill from scanning
all data in directories.
diff --git a/_docs/sql-reference/sql-commands/010-set.md b/_docs/sql-reference/sql-commands/010-set.md
index f4ae32a..48f219e 100644
--- a/_docs/sql-reference/sql-commands/010-set.md
+++ b/_docs/sql-reference/sql-commands/010-set.md
@@ -1,6 +1,6 @@
---
title: "SET"
-date: 2019-01-07
+date: 2019-04-16
parent: "SQL Commands"
---
Starting in Drill 1.3, the SET command replaces the ALTER SESSION SET command.
The SET command changes a system setting for the duration of a session. Session
level settings override system level settings.
@@ -22,6 +22,13 @@ or float. Use the appropriate value type for each option
that you set.
## Usage Notes
+- Starting in Drill 1.16, Drill no longer writes the profiles for SET queries
to the persistent store. When you run the SET command, you will not see the
activity on the Profiles page of the Drill Web UI. If you want to see query
profiles for the SET command, set the `exec.query_profile.alter_session.skip`
option to false from the Drill Web UI or the command line, as shown:
+
+ //At the session level
+ set `exec.query_profile.alter_session.skip` = false;
+ //At the system level
+ alter system set `exec.query_profile.alter_session.skip` = false;
+
- By default, Drill returns a result set when you issue DDL statements, such
as SET. If the client tool from which you connect to Drill (via JDBC) does not
expect a result set when you issue DDL statements, set the
`exec.query.return_result_set_for_ddl` option to false, as shown, to prevent
the client from canceling queries:
SET `exec.query.return_result_set_for_ddl` = false
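
Note on the directory variables documented in 040-querying-directories.md above: the text says dir0, dir1, and so on are relative to the root directory used in the FROM clause. The sketch below is a rough model of that rule in Python, not Drill's implementation. The function name `directory_variables` and the sample path `/tmp/logs/2013/q1/jan.log` are hypothetical and chosen only for illustration.

```python
from pathlib import PurePosixPath


def directory_variables(from_root: str, file_path: str) -> dict:
    """Model (assumption, not Drill code): name each directory level
    between the FROM-clause root and the file as dir0, dir1, ..."""
    relative = PurePosixPath(file_path).relative_to(PurePosixPath(from_root))
    # The last path component is the file name; the rest are directory levels.
    levels = relative.parts[:-1]
    return {f"dir{i}": name for i, name in enumerate(levels)}


# Hypothetical file under the /tmp/logs layout described in the doc text.
sample = "/tmp/logs/2013/q1/jan.log"

# Root is /tmp/logs, as when querying `/logs` in a workspace that points to /tmp.
print(directory_variables("/tmp/logs", sample))

# Root is /tmp itself: every variable shifts by one level.
print(directory_variables("/tmp", sample))
```

Under this model, the same file yields dir0 = 2013 when the query root is /tmp/logs, but dir0 = logs when the root is /tmp, which is why the doc ties the variables to the FROM clause rather than to the file system.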