This is an automated email from the ASF dual-hosted git repository.
bridgetb pushed a commit to branch asf-site
in repository https://gitbox.apache.org/repos/asf/drill-site.git
The following commit(s) were added to refs/heads/asf-site by this push:
new 03f3efe doc updates
03f3efe is described below
commit 03f3efe7fc521039bb92ee60e7051b8ee6705f51
Author: Bridget Bevens <[email protected]>
AuthorDate: Thu May 30 20:11:12 2019 -0700
doc updates
---
docs/analyze-table/index.html | 22 +++++++++++-----------
docs/create-or-replace-schema/index.html | 19 ++++++++++---------
.../optimizing-parquet-metadata-reading/index.html | 4 ++--
docs/refresh-table-metadata/index.html | 4 ++--
docs/running-drill-on-docker/index.html | 18 +++++-------------
feed.xml | 4 ++--
6 files changed, 32 insertions(+), 39 deletions(-)
diff --git a/docs/analyze-table/index.html b/docs/analyze-table/index.html
index 21aa15a..7cb9cfb 100644
--- a/docs/analyze-table/index.html
+++ b/docs/analyze-table/index.html
@@ -1320,18 +1320,19 @@
</div>
- May 2, 2019
+ May 31, 2019
<link href="/css/docpage.css" rel="stylesheet" type="text/css">
<div class="int_text" align="left">
- <p>Drill 1.16 and later supports the ANALYZE TABLE statement. The
ANALYZE TABLE statement computes statistics on Parquet data stored in tables
and directories. ANALYZE TABLE writes statistics to a JSON file in the
<code>.stats.drill</code> directory, for example
<code>/user/table1/.stats.drill/0_0.json</code>. The optimizer in Drill uses
these statistics to estimate filter, aggregation, and join cardinalities and
create more efficient query plans. </p>
+ <p>Drill 1.16 and later supports the ANALYZE TABLE statement. The
ANALYZE TABLE statement computes statistics on Parquet data stored in tables
and directories. The optimizer in Drill uses statistics to estimate filter,
aggregation, and join cardinalities and create an optimal query plan.
+ANALYZE TABLE writes statistics to a JSON file in the
<code>.stats.drill</code> directory, for example
<code>/user/table1/.stats.drill/0_0.json</code>. </p>
-<p>You can run the ANALYZE TABLE statement to calculate statistics for tables,
columns, and directories with Parquet data; however, Drill will not use the
statistics for query planning unless you enable the
<code>planner.statistics.use</code> option, as shown:</p>
+<p>Drill will not use the statistics for query planning unless you enable the
<code>planner.statistics.use</code> option, as shown:</p>
<div class="highlight"><pre><code class="language-text" data-lang="text">SET `planner.statistics.use` = true;
</code></pre></div>
-<p>Alternatively, you can enable the option in the Drill Web UI at
<code>http://<drill-hostname-or-ip>:8047/options</code>.</p>
+<p>Alternatively, you can enable the option in the Drill Web UI at
<code>http://<drill-hostname-or-ip-address>:8047/options</code>.</p>
<h2 id="syntax">Syntax</h2>
@@ -1360,13 +1361,13 @@ An integer that specifies the percentage of data on which to compute statistics.
<h2 id="related-command">Related Command</h2>
-<p>If you drop a table on which you have run ANALYZE TABLE, the statistics are
automatically removed with the table: </p>
+<p>If you drop a table that you have already run ANALYZE TABLE against, the
statistics are automatically removed with the table: </p>
<div class="highlight"><pre><code class="language-text" data-lang="text">DROP TABLE [IF EXISTS] [workspace.]name
</code></pre></div>
-<p>If you want to remove statistics for a table (and keep the table), you must
remove the directory in which Drill stores the statistics: </p>
+<p>To remove statistics for a table you want to keep, you must remove the
directory in which Drill stores the statistics: </p>
<div class="highlight"><pre><code class="language-text" data-lang="text">DROP TABLE [IF EXISTS] [workspace.]name/.stats.drill
</code></pre></div>
-<p>If you have already issued the ANALYZE TABLE statement against specific
columns, a table, or directory, you must run the DROP TABLE statement with
<code>/.stats.drill</code> before you can successfully run the ANALYZE TABLE
statement against the data source again, for example:</p>
+<p>If you have already issued the ANALYZE TABLE statement against specific
columns, table, or directory, you must run the DROP TABLE statement with
<code>/.stats.drill</code> before you can successfully run the ANALYZE TABLE
statement against the data source again, for example:</p>
<div class="highlight"><pre><code class="language-text" data-lang="text">DROP TABLE `table_stats/Tpch0.01/parquet/customer/.stats.drill`;
</code></pre></div>
<p>Note that <code>/.stats.drill</code> is the directory to which the JSON
file with statistics is written. </p>
@@ -1466,11 +1467,10 @@ Sample
<p>Next, consider the range predicate <code>"WHERE a > 5 AND a <=
16"</code>. The range spans part of bucket [1, 7] and entire buckets [8,
9], [10, 11] and [12, 16]. The total estimate = (7-5)/7 * 16 + 16 + 16 + 16 =
53 (approximately). The actual count is 59.</p>
-<p><strong>Viewing Histogram Statistics for a Column</strong>
+<p><strong>Viewing Histogram Statistics for a Column</strong><br>
Histogram statistics are generated for each column, as shown: </p>
-
-<p>qhistogram":{"category":"numeric-equi-depth","numRowsPerBucket":150,"buckets":[0.0,2.0,4.0,7.0,9.0,12.0,15.199999999999978,17.0,19.0,22.0,24.0]</p>
-
+<div class="highlight"><pre><code class="language-text" data-lang="text">qhistogram":{"category":"numeric-equi-depth","numRowsPerBucket":150,"buckets":[0.0,2.0,4.0,7.0,9.0,12.0,15.199999999999978,17.0,19.0,22.0,24.0]
+</code></pre></div>
<p>In this example, there are 10 buckets. Each bucket contains 150 rows, which
is calculated as the number of rows (1500)/number of buckets (10). The list of
numbers for the “buckets” property indicates bucket boundaries, with the first
bucket starting at 0.0 and ending at 2.0. The end of the first bucket is the
start point for the second bucket, such that the second bucket starts at 2.0
and ends at 4.0, and so on for the remainder of the buckets. </p>
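The bucket description above can be checked with a short sketch. It assumes only the fields visible in the sample (`numRowsPerBucket`, `buckets`); the fragment is wrapped in braces here so that it parses as JSON, and nothing about the rest of the statistics file layout is implied. The helper name `bucket_ranges` is hypothetical. The last two lines simply reproduce the range-predicate arithmetic quoted earlier in this hunk as it is written in the text; they are not Drill's estimator.

```python
import json

# Histogram fragment from the sample above, wrapped in braces so it is valid JSON.
# Only the fields shown in the sample are used; the wider file layout is not assumed.
SAMPLE = (
    '{"qhistogram":{"category":"numeric-equi-depth","numRowsPerBucket":150,'
    '"buckets":[0.0,2.0,4.0,7.0,9.0,12.0,15.199999999999978,17.0,19.0,22.0,24.0]}}'
)


def bucket_ranges(boundaries):
    """Pair consecutive boundaries into (start, end) bucket ranges (hypothetical helper)."""
    return list(zip(boundaries, boundaries[1:]))


histogram = json.loads(SAMPLE)["qhistogram"]
ranges = bucket_ranges(histogram["buckets"])

# Equi-depth histogram: every bucket holds the same number of rows.
total_rows = histogram["numRowsPerBucket"] * len(ranges)

print(len(ranges))            # number of buckets
print(ranges[0], ranges[1])   # first two buckets share the boundary 2.0
print(total_rows)             # rows covered by the histogram

# Arithmetic from the text for "WHERE a > 5 AND a <= 16": part of bucket [1, 7]
# plus three full buckets of 16 rows each, reproduced exactly as written there.
estimate = (7 - 5) / 7 * 16 + 16 + 16 + 16
print(round(estimate))
```

With 11 boundaries the sketch yields 10 buckets and 1500 rows, matching the description, and the quoted range estimate rounds to 53.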
<h2 id="limitations">Limitations</h2>
diff --git a/docs/create-or-replace-schema/index.html b/docs/create-or-replace-schema/index.html
index 0fa00fb..3e921fc 100644
--- a/docs/create-or-replace-schema/index.html
+++ b/docs/create-or-replace-schema/index.html
@@ -1320,7 +1320,7 @@
</div>
- May 2, 2019
+ May 31, 2019
<link href="/css/docpage.css" rel="stylesheet" type="text/css">
@@ -1958,15 +1958,16 @@ drop schema for table `text_table`;
</code></pre></div>
<h2 id="troubleshooting">Troubleshooting</h2>
-<p><strong>Schema defined as incorrect data type produces
DATA_READ_ERROR</strong><br>
-Assume that you defined schema on the “name” column as integer, as shown:
- create or replace schema (name int) for table
dfs.tmp.<code>text_table</code>;
- +------+-----------------------------------------+
- | ok | summary |
- +------+-----------------------------------------+
- | true | Created schema for [dfs.tmp.text_table] |
- +------+-----------------------------------------+</p>
+<p><strong>Schema defined as incorrect data type produces
DATA_READ_ERROR</strong> </p>
+<p>Assume that you defined schema on the “name” column as integer, as shown:
</p>
+<div class="highlight"><pre><code class="language-text" data-lang="text">create or replace schema (name int) for table dfs.tmp.`text_table`;
++------+-----------------------------------------+
+| ok | summary |
++------+-----------------------------------------+
+| true | Created schema for [dfs.tmp.text_table] |
++------+-----------------------------------------+
+</code></pre></div>
<p>Because the column does not contain integers, a query issued against the
directory returns the DATA_READ_ERROR. The error message includes the line and
value causing the issue: </p>
<div class="highlight"><pre><code class="language-text" data-lang="text">select * from dfs.tmp.`text_table`;
diff --git a/docs/optimizing-parquet-metadata-reading/index.html b/docs/optimizing-parquet-metadata-reading/index.html
index b8a7683..8859b44 100644
--- a/docs/optimizing-parquet-metadata-reading/index.html
+++ b/docs/optimizing-parquet-metadata-reading/index.html
@@ -1318,14 +1318,14 @@
</div>
- Aug 10, 2017
+ May 31, 2019
<link href="/css/docpage.css" rel="stylesheet" type="text/css">
<div class="int_text" align="left">
<p>Parquet metadata caching is a feature that enables Drill to read a
single metadata cache file instead of retrieving metadata from multiple Parquet
files during the query-planning phase.
-Parquet metadata caching is available for Parquet data in Drill 1.2 and later.
To enable Parquet metadata caching, issue the REFRESH TABLE METADATA <path to
table> command. When you run this command Drill generates a metadata cache
file. </p>
+Parquet metadata caching is available for Parquet data in Drill 1.2 and later.
To enable Parquet metadata caching, issue the <a
href="/docs/refresh-table-metadata/">REFRESH TABLE METADATA</a> <path to table>
command. When you run this command Drill generates a metadata cache file. </p>
<div class="admonition note">
<p class="first admonition-title">Note</p>
diff --git a/docs/refresh-table-metadata/index.html b/docs/refresh-table-metadata/index.html
index 2031eb7..f597805 100644
--- a/docs/refresh-table-metadata/index.html
+++ b/docs/refresh-table-metadata/index.html
@@ -1320,7 +1320,7 @@
</div>
- May 24, 2019
+ May 31, 2019
<link href="/css/docpage.css" rel="stylesheet" type="text/css">
@@ -1433,7 +1433,7 @@ nation.parquet
</code></pre></div>
<p><strong>Note:</strong> You can access the sample
<code>nation.parquet</code> file in the <code>sample-data</code> directory of
your Drill installation.</p>
-<p>Change schemas to switch to <code>dfs.samples</code>: </p>
+<p>Change to the <code>dfs.samples</code> schema: </p>
<div class="highlight"><pre><code class="language-text" data-lang="text">use dfs.samples;
+-------+------------------------------------------+
| ok | summary |
diff --git a/docs/running-drill-on-docker/index.html b/docs/running-drill-on-docker/index.html
index f07b082..6d7bff8 100644
--- a/docs/running-drill-on-docker/index.html
+++ b/docs/running-drill-on-docker/index.html
@@ -1320,7 +1320,7 @@
</div>
- May 24, 2019
+ May 31, 2019
<link href="/css/docpage.css" rel="stylesheet" type="text/css">
@@ -1332,21 +1332,13 @@
<h2 id="prerequisite">Prerequisite</h2>
-<p>You must have the Docker client (version 18 or later) installed on your
machine. </p>
-
-<ul>
-<li><a href="https://www.docker.com/docker-mac">Docker for Mac</a><br></li>
-<li><a href="https://www.docker.com/docker-windows">Docker for
Windows</a><br></li>
-<li><a href="https://www.docker.com/docker-oracle-linux">Docker for Oracle
Linux</a><br></li>
-</ul>
+<p>You must have the Docker client (version 18 or later) <a
href="https://docs.docker.com/install/">installed on your machine</a>. </p>
<h2 id="running-drill-in-a-docker-container">Running Drill in a Docker
Container</h2>
-<p>You can start and run a Docker container in “detached” mode or “foreground”
mode. Foreground is the default mode. Foreground mode runs the Drill process in
the container and attaches the console to Drill’s standard input, output, and
standard error. Detached mode runs the container in the background.</p>
-
-<p>Whether you run the Docker container in detached or foreground mode, you
start Drill in a container by issuing the docker run command with some options.
</p>
+<p>You can start and run a Docker container in detached mode or foreground
mode. <a
href="/docs/running-drill-on-docker/#running-the-drill-docker-container-in-detached-mode">Detached
mode</a> runs the container in the background. Foreground is the default mode.
<a
href="/docs/running-drill-on-docker/#running-the-drill-docker-container-in-foreground-mode">Foreground
mode</a> runs the Drill process in the container and attaches the console to
Drill’s standard input, output, and standard er [...]
-<p>The following table describes the options: </p>
+<p>Whether you run the Docker container in detached or foreground mode, you
start Drill in a container by issuing the docker <code>run</code> command with
some options, as described in the following table: </p>
<table><thead>
<tr>
@@ -1399,7 +1391,7 @@
<p>Open a terminal window (Command Prompt or PowerShell, but not PowerShell
ISE) and then issue the following commands and options to connect to SQLLine
(the Drill shell): </p>
-<p><strong>Note:</strong> When you run the Drill Docker container in Detached
mode, you connect to SQLLine (the Drill shell) using drill-localhost. </p>
+<p><strong>Note:</strong> When you run the Drill Docker container in detached
mode, you connect to SQLLine (the Drill shell) using drill-localhost. </p>
<div class="highlight"><pre><code class="language-text" data-lang="text"> $ docker run -i --name drill-1.16.0 -p 8047:8047 --detach -t drill/apache-drill:1.16.0 /bin/bash
<displays container ID>
diff --git a/feed.xml b/feed.xml
index 1cfdec8..75f966a 100644
--- a/feed.xml
+++ b/feed.xml
@@ -6,8 +6,8 @@
</description>
<link>/</link>
<atom:link href="/feed.xml" rel="self" type="application/rss+xml"/>
- <pubDate>Fri, 24 May 2019 15:54:55 -0700</pubDate>
- <lastBuildDate>Fri, 24 May 2019 15:54:55 -0700</lastBuildDate>
+ <pubDate>Thu, 30 May 2019 20:08:33 -0700</pubDate>
+ <lastBuildDate>Thu, 30 May 2019 20:08:33 -0700</lastBuildDate>
<generator>Jekyll v2.5.2</generator>
<item>