drill-site git commit: dev edits updated in docs

bridgetb Tue, 26 May 2015 17:24:07 -0700

Repository: drill-site
Updated Branches:
  refs/heads/asf-site f3d728580 -> 1c6fa3f91



dev edits updated in docs


Project: http://git-wip-us.apache.org/repos/asf/drill-site/repo
Commit: http://git-wip-us.apache.org/repos/asf/drill-site/commit/1c6fa3f9
Tree: http://git-wip-us.apache.org/repos/asf/drill-site/tree/1c6fa3f9
Diff: http://git-wip-us.apache.org/repos/asf/drill-site/diff/1c6fa3f9

Branch: refs/heads/asf-site
Commit: 1c6fa3f918a462c81a956b4ea99dca62c2a6487e
Parents: f3d7285
Author: Bridget Bevens <[email protected]>
Authored: Tue May 26 17:23:02 2015 -0700
Committer: Bridget Bevens <[email protected]>
Committed: Tue May 26 17:23:02 2015 -0700

----------------------------------------------------------------------
 .../index.html                                  |   4 +-
 docs/configuring-drill-memory/index.html        |   3 +
 .../index.html                                  |   3 +
 .../index.html                                  |   2 +-
 docs/drill-in-10-minutes/index.html             |   2 +-
 docs/json-data-model/index.html                 | 106 +++++++----------
 .../index.html                                  |   4 +-
 docs/querying-directories/index.html            |  12 +-
 docs/querying-plain-text-files/index.html       | 114 ++++++++++++++++++-
 docs/supported-data-types/index.html            |   6 +-
 feed.xml                                        |   4 +-
 11 files changed, 173 insertions(+), 87 deletions(-)
----------------------------------------------------------------------


http://git-wip-us.apache.org/repos/asf/drill-site/blob/1c6fa3f9/docs/configuration-options-introduction/index.html
----------------------------------------------------------------------
diff --git a/docs/configuration-options-introduction/index.html b/docs/configuration-options-introduction/index.html
index 3055fe5..96fc390 100644
--- a/docs/configuration-options-introduction/index.html
+++ b/docs/configuration-options-introduction/index.html
@@ -1035,7 +1035,7 @@ Drill sources the local 
<code>&lt;drill_installation_directory&gt;/conf</code> d
 <tr>
 <td>exec.queue.enable</td>
 <td>FALSE</td>
-<td>Changes the state of query queues to control the number of queries that 
run simultaneously.</td>
+<td>Changes the state of query queues. False allows unlimited concurrent 
queries.</td>
 </tr>
 <tr>
 <td>exec.queue.large</td>
@@ -1050,7 +1050,7 @@ Drill sources the local 
<code>&lt;drill_installation_directory&gt;/conf</code> d
 <tr>
 <td>exec.queue.threshold</td>
 <td>30000000</td>
-<td>Sets the cost threshold, which depends on the complexity of the queries in 
queue, for determining whether query is large or small. Complex queries have 
higher thresholds. Range: 0-9223372036854775807</td>
+<td>Sets the cost threshold for determining whether query is large or small 
based on complexity. Complex queries have higher thresholds. By default, an 
estimated 30,000,000 rows will be processed by a query. Range: 
0-9223372036854775807</td>
 </tr>
 <tr>
 <td>exec.queue.timeout_millis</td>

http://git-wip-us.apache.org/repos/asf/drill-site/blob/1c6fa3f9/docs/configuring-drill-memory/index.html
----------------------------------------------------------------------
diff --git a/docs/configuring-drill-memory/index.html b/docs/configuring-drill-memory/index.html
index e8c624e..8f23e8d 100644
--- a/docs/configuring-drill-memory/index.html
+++ b/docs/configuring-drill-memory/index.html
@@ -1009,6 +1009,9 @@ export DRILL_JAVA_OPTS=&quot;-Xms1G -Xmx$DRILL_MAX_HEAP 
-XX:MaxDirectMemorySize=
 <li>Xms specifies the initial memory allocation pool.</li>
 </ul>
 
+<p>If performance is an issue, replace the -ea flag with -Dbounds=false, as 
shown in the following example:</p>
+<div class="highlight"><pre><code class="language-text" 
data-lang="text">export DRILL_JAVA_OPTS=&quot;-Xms1G -Xmx$DRILL_MAX_HEAP 
-XX:MaxDirectMemorySize=$DRILL_MAX_DIRECT_MEMORY -XX:MaxPermSize=512M 
-XX:ReservedCodeCacheSize=1G -Dbounds=false&quot;
+</code></pre></div>
     
       
         <div class="doc-nav">

http://git-wip-us.apache.org/repos/asf/drill-site/blob/1c6fa3f9/docs/configuring-resources-for-a-shared-drillbit/index.html
----------------------------------------------------------------------
diff --git a/docs/configuring-resources-for-a-shared-drillbit/index.html b/docs/configuring-resources-for-a-shared-drillbit/index.html
index d5288e5..ba1c804 100644
--- a/docs/configuring-resources-for-a-shared-drillbit/index.html
+++ b/docs/configuring-resources-for-a-shared-drillbit/index.html
@@ -981,8 +981,11 @@
 <ul>
 <li>exec.queue.large<br></li>
 <li>exec.queue.small<br></li>
+<li>exec.queue.threshold</li>
 </ul>
 
+<p>The exec.queue.threshold sets the cost threshold for determining whether 
query is large or small based on complexity. Complex queries have higher 
thresholds. The default, 30,000,000, represents the estimated rows that a query 
will process. To serialize incoming queries, set the small queue at 0 and the 
threshold at 0.</p>
+
 <p>For more information, see the section, <a 
href="/docs/performance-tuning-introduction/">&quot;Performance 
Tuning&quot;</a>.</p>
 
 <h2 id="configuring-parallelization">Configuring Parallelization</h2>
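
Reviewer note: the paragraph added above describes exec.queue.threshold as the cost cutoff between small and large queries. A minimal Python sketch of that idea follows. The function name pick_queue is hypothetical, and the strict "greater than" comparison is an assumption; the diff does not state how Drill handles a cost exactly equal to the threshold.

```python
# Default value of exec.queue.threshold, as listed in the options table in this commit.
DEFAULT_THRESHOLD = 30_000_000


def pick_queue(estimated_cost, threshold=DEFAULT_THRESHOLD):
    """Return the queue a query would join for a given estimated cost.

    Assumption (not confirmed by the diff): a cost strictly above the
    threshold counts as "large"; everything else counts as "small".
    """
    return "large" if estimated_cost > threshold else "small"


small_example = pick_queue(1_000)               # well under the default threshold
large_example = pick_queue(40_000_000)          # over the default threshold
zero_threshold = pick_queue(1, threshold=0)     # with threshold 0, any positive cost is large
```

Under this model, lowering the threshold to 0 sends every query with a positive estimated cost to the large queue.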

http://git-wip-us.apache.org/repos/asf/drill-site/blob/1c6fa3f9/docs/date-time-functions-and-arithmetic/index.html
----------------------------------------------------------------------
diff --git a/docs/date-time-functions-and-arithmetic/index.html b/docs/date-time-functions-and-arithmetic/index.html
index 0aa592f..f25387c 100644
--- a/docs/date-time-functions-and-arithmetic/index.html
+++ b/docs/date-time-functions-and-arithmetic/index.html
@@ -1056,7 +1056,7 @@
 +------------+
 1 row selected (0.064 seconds)
 </code></pre></div>
-<p>Find the interval between midnight today, May 21, 2015, and hire dates of 
employees 578 and 761 in the employees.json file included with the Drill 
installation.</p>
+<p>Find the interval between midnight today, May 21, 2015, and hire dates of 
employees 578 and 761 in the <code>employees.json</code> file included with the 
Drill installation.</p>
 <div class="highlight"><pre><code class="language-text" 
data-lang="text">SELECT AGE(CAST(hire_date AS TIMESTAMP)) FROM 
cp.`employee.json` where employee_id IN( &#39;578&#39;,&#39;761&#39;);
 +------------------+
 |      EXPR$0      |

http://git-wip-us.apache.org/repos/asf/drill-site/blob/1c6fa3f9/docs/drill-in-10-minutes/index.html
----------------------------------------------------------------------
diff --git a/docs/drill-in-10-minutes/index.html b/docs/drill-in-10-minutes/index.html
index 93681b3..7be2ab4 100644
--- a/docs/drill-in-10-minutes/index.html
+++ b/docs/drill-in-10-minutes/index.html
@@ -1015,7 +1015,7 @@ Java HotSpot(TM) 64-Bit Server VM (build 24.79-b02, mixed 
mode)
 
 <ol>
 <li><p>In a terminal windows, change to the directory where you want to 
install Drill.</p></li>
-<li><p>To download the latest version of Apache Drill, download Drill from the 
<a href="http://getdrill.org/drill/download/apache-drill-1.0.0.tar.gz">Drill 
web site</a>or run one of the following commands, depending on which you have 
installed on your system:</p></li>
+<li><p>To download the latest version of Apache Drill, download Drill from the 
<a href="http://getdrill.org/drill/download/apache-drill-1.0.0.tar.gz">Drill 
web site</a> or run one of the following commands, depending on which you have 
installed on your system:</p></li>
 </ol>
 
 <ul>

http://git-wip-us.apache.org/repos/asf/drill-site/blob/1c6fa3f9/docs/json-data-model/index.html
----------------------------------------------------------------------
diff --git a/docs/json-data-model/index.html b/docs/json-data-model/index.html
index fba84b1..1f99d2f 100644
--- a/docs/json-data-model/index.html
+++ b/docs/json-data-model/index.html
@@ -1134,7 +1134,9 @@ SELECT my column from dfs.`&lt;path_file_name&gt;`;
 
 <h2 id="example:-flatten-and-generate-key-values-for-complex-json">Example: 
Flatten and Generate Key Values for Complex JSON</h2>
 
-<p>This example uses the following data that represents unit sales of tickets 
to events that were sold over a period of for several days in different 
states:</p>
+<p>This example uses the following data that represents unit sales of tickets 
to events that were sold over a period of for several days in December:</p>
+
+<h3 id="ticket_sales.json-contents">ticket_sales.json Contents</h3>
 <div class="highlight"><pre><code class="language-text" data-lang="text">{
   &quot;type&quot;: &quot;ticket&quot;,
   &quot;venue&quot;: 123455,
@@ -1157,55 +1159,31 @@ SELECT my column from dfs.`&lt;path_file_name&gt;`;
 }
 </code></pre></div>
 <p>Take a look at the data in Drill:</p>
-<div class="highlight"><pre><code class="language-text" 
data-lang="text">SELECT * FROM dfs.`/Users/drilluser/ticket_sales.json`;
-+------------+------------+------------+------------+------------+
-|    type    |  channel   |   month    |    day     |   sales    |
-+------------+------------+------------+------------+------------+
-| ticket     | 123455     | 12         | 
[&quot;15&quot;,&quot;25&quot;,&quot;28&quot;,&quot;31&quot;] | 
{&quot;NY&quot;:&quot;532806&quot;,&quot;PA&quot;:&quot;112889&quot;,&quot;TX&quot;:&quot;898999&quot;,&quot;UT&quot;:&quot;10875&quot;}
 |
-| ticket     | 123456     | 12         | 
[&quot;10&quot;,&quot;15&quot;,&quot;19&quot;,&quot;31&quot;] | 
{&quot;NY&quot;:&quot;972880&quot;,&quot;PA&quot;:&quot;857475&quot;,&quot;CA&quot;:&quot;87350&quot;,&quot;OR&quot;:&quot;49999&quot;}
 |
-+------------+------------+------------+------------+------------+
-2 rows selected (0.041 seconds)
-</code></pre></div>
-<h3 id="flatten-arrays">Flatten Arrays</h3>
-
-<p>The FLATTEN function breaks the following _day arrays from the JSON example 
file shown earlier into separate rows.</p>
-<div class="highlight"><pre><code class="language-text" 
data-lang="text">&quot;_day&quot;: [ 15, 25, 28, 31 ] 
-&quot;_day&quot;: [ 10, 15, 19, 31 ]
-</code></pre></div>
-<p>Flatten the sales column of the ticket data onto separate rows, one row for 
each day in the array, for a better view of the data. FLATTEN copies the sales 
data related in the JSON object on each row.  Using the all (*) wildcard as the 
argument to flatten is not supported and returns an error.</p>
-<div class="highlight"><pre><code class="language-text" 
data-lang="text">SELECT flatten(tkt._day) AS `day`, tkt.sales FROM 
dfs.`/Users/drilluser/ticket_sales.json` tkt;
-
-+------------+------------+
-|    day     |   sales    |
-+------------+------------+
-| 15         | 
{&quot;NY&quot;:532806,&quot;PA&quot;:112889,&quot;TX&quot;:898999,&quot;UT&quot;:10875}
 |
-| 25         | 
{&quot;NY&quot;:532806,&quot;PA&quot;:112889,&quot;TX&quot;:898999,&quot;UT&quot;:10875}
 |
-| 28         | 
{&quot;NY&quot;:532806,&quot;PA&quot;:112889,&quot;TX&quot;:898999,&quot;UT&quot;:10875}
 |
-| 31         | 
{&quot;NY&quot;:532806,&quot;PA&quot;:112889,&quot;TX&quot;:898999,&quot;UT&quot;:10875}
 |
-| 10         | 
{&quot;NY&quot;:972880,&quot;PA&quot;:857475,&quot;CA&quot;:87350,&quot;OR&quot;:49999}
 |
-| 15         | 
{&quot;NY&quot;:972880,&quot;PA&quot;:857475,&quot;CA&quot;:87350,&quot;OR&quot;:49999}
 |
-| 19         | 
{&quot;NY&quot;:972880,&quot;PA&quot;:857475,&quot;CA&quot;:87350,&quot;OR&quot;:49999}
 |
-| 31         | 
{&quot;NY&quot;:972880,&quot;PA&quot;:857475,&quot;CA&quot;:87350,&quot;OR&quot;:49999}
 |
-+------------+------------+
-8 rows selected (0.072 seconds)
+<div class="highlight"><pre><code class="language-text" 
data-lang="text">+---------+---------+---------------------------------------------------------------+
+|  type   |  venue  |                             sales                        
     |
++---------+---------+---------------------------------------------------------------+
+| ticket  | 123455  | 
{&quot;12-10&quot;:532806,&quot;12-11&quot;:112889,&quot;12-19&quot;:898999,&quot;12-21&quot;:10875}
  |
+| ticket  | 123456  | 
{&quot;12-10&quot;:87350,&quot;12-19&quot;:49999,&quot;12-21&quot;:857475,&quot;12-15&quot;:972880}
   |
++---------+---------+---------------------------------------------------------------+
+2 rows selected (1.343 seconds)
 </code></pre></div>
 <h3 id="generate-key/value-pairs">Generate Key/Value Pairs</h3>
 
-<p>Use the KVGEN (Key Value Generator) function to generate key/value pairs 
from complex data. Generating key/value pairs is often helpful when working 
with data that contains arbitrary maps consisting of dynamic and unknown 
element names, such as the ticket sales data by state. For example purposes, 
take a look at how kvgen breaks the sales data into keys and values 
representing the states and number of tickets sold:</p>
-<div class="highlight"><pre><code class="language-text" 
data-lang="text">SELECT KVGEN(tkt.sales) AS state_sales FROM 
dfs.`/Users/drilluser/ticket_sales.json` tkt;
-+-------------+
-| state_sales |
-+-------------+
-| 
[{&quot;key&quot;:&quot;NY&quot;,&quot;value&quot;:532806},{&quot;key&quot;:&quot;PA&quot;,&quot;value&quot;:112889},{&quot;key&quot;:&quot;TX&quot;,&quot;value&quot;:898999},{&quot;key&quot;:&quot;UT&quot;,&quot;value&quot;:10875}]
 |
-| 
[{&quot;key&quot;:&quot;NY&quot;,&quot;value&quot;:972880},{&quot;key&quot;:&quot;PA&quot;,&quot;value&quot;:857475},{&quot;key&quot;:&quot;CA&quot;,&quot;value&quot;:87350},{&quot;key&quot;:&quot;OR&quot;,&quot;value&quot;:49999}]
 |
-+-------------+
-2 rows selected (0.039 seconds)
+<p>Continuing with the data from <a 
href="/docs/json-data-model/#example:-flatten-and-generate-key-values-for-complex-json">previous
 example</a>, use the KVGEN (Key Value Generator) function to generate 
key/value pairs from complex data. Generating key/value pairs is often helpful 
when working with data that contains arbitrary maps consisting of dynamic and 
unknown element names, such as the ticket sales data in this example. For 
example purposes, take a look at how kvgen breaks the sales data into keys and 
values representing the key dates and number of tickets sold:</p>
+<div class="highlight"><pre><code class="language-text" 
data-lang="text">SELECT KVGEN(tkt.sales) AS `key dates:tickets sold` FROM 
dfs.`/Users/drilluser/ticket_sales.json` tkt;
++---------------------------------------------------------------------------------------------------------------------------------------+
+|                                                        key dates:tickets 
sold                                                         |
++---------------------------------------------------------------------------------------------------------------------------------------+
+| 
[{&quot;key&quot;:&quot;12-10&quot;,&quot;value&quot;:&quot;532806&quot;},{&quot;key&quot;:&quot;12-11&quot;,&quot;value&quot;:&quot;112889&quot;},{&quot;key&quot;:&quot;12-19&quot;,&quot;value&quot;:&quot;898999&quot;},{&quot;key&quot;:&quot;12-21&quot;,&quot;value&quot;:&quot;10875&quot;}]
 |
+| 
[{&quot;key&quot;:&quot;12-10&quot;,&quot;value&quot;:&quot;87350&quot;},{&quot;key&quot;:&quot;12-19&quot;,&quot;value&quot;:&quot;49999&quot;},{&quot;key&quot;:&quot;12-21&quot;,&quot;value&quot;:&quot;857475&quot;},{&quot;key&quot;:&quot;12-15&quot;,&quot;value&quot;:&quot;972880&quot;}]
 |
++---------------------------------------------------------------------------------------------------------------------------------------+
+2 rows selected (0.106 seconds)
 </code></pre></div>
 <p>KVGEN allows queries against maps where the keys themselves represent data 
rather than a schema, as shown in the next example.</p>
 
 <h3 id="flatten-json-data">Flatten JSON Data</h3>
 
-<p>FLATTEN breaks the list of key-value pairs into separate rows on which you 
can apply analytic functions. FLATTEN takes a JSON array, such as the output 
from kvgen(sales), as an argument. Using the all (*) wildcard as the argument 
is not supported and returns an error.</p>
+<p>FLATTEN breaks the list of key-value pairs into separate rows on which you 
can apply analytic functions. FLATTEN takes a JSON array, such as the output 
from kvgen(sales), as an argument. Using the all (*) wildcard as the argument 
is not supported and returns an error. The following example continues using 
data from the <a 
href="/docs/json-data-model/#example:-flatten-and-generate-key-values-for-complex-json">previous
 example</a>:</p>
 <div class="highlight"><pre><code class="language-text" 
data-lang="text">SELECT FLATTEN(kvgen(sales)) Sales 
 FROM dfs.`/Users/drilluser/drill/ticket_sales.json`;
 
@@ -1225,39 +1203,39 @@ FROM dfs.`/Users/drilluser/drill/ticket_sales.json`;
 </code></pre></div>
 <h3 id="example:-aggregate-loosely-structured-data">Example: Aggregate Loosely 
Structured Data</h3>
 
-<p>Use flatten and kvgen together to aggregate the data. Continuing with the 
previous example, make sure all text mode is set to false to sum numbers. Drill 
returns an error if you attempt to sum data in all text mode. </p>
+<p>Use flatten and kvgen together to aggregate the data from the <a 
href="/docs/json-data-model/#example:-flatten-and-generate-key-values-for-complex-json">previous
 example</a>. Make sure all text mode is set to false to sum numbers. Drill 
returns an error if you attempt to sum data in all text mode. </p>
 <div class="highlight"><pre><code class="language-text" data-lang="text">ALTER 
SYSTEM SET `store.json.all_text_mode` = false;
 </code></pre></div>
 <p>Sum the ticket sales by combining the <code>SUM</code>, 
<code>FLATTEN</code>, and <code>KVGEN</code> functions in a single query.</p>
-<div class="highlight"><pre><code class="language-text" 
data-lang="text">SELECT SUM(tkt.tot_sales.`value`) AS TotalSales FROM (SELECT 
flatten(kvgen(sales)) tot_sales FROM dfs.`/Users/drilluser/ticket_sales.json`) 
tkt;
-
-+------------+
-| TotalSales |
-+------------+
-| 3523273    |
-+------------+
-1 row selected (0.081 seconds)
+<div class="highlight"><pre><code class="language-text" 
data-lang="text">SELECT SUM(tkt.tot_sales.`value`) AS TicketSold FROM (SELECT 
flatten(kvgen(sales)) tot_sales FROM dfs.`/Users/drilluser/ticket_sales.json`) 
tkt;
+
++--------------+
+| TicketsSold  |
++--------------+
+| 3523273.0    |
++--------------+
+1 row selected (0.244 seconds)
 </code></pre></div>
 <h3 id="example:-aggregate-and-sort-data">Example: Aggregate and Sort Data</h3>
 
-<p>Sum the ticket sales by state and group by state and sort in ascending 
order. </p>
-<div class="highlight"><pre><code class="language-text" 
data-lang="text">SELECT `right`(tkt.tot_sales.key,2) State, 
+<p>Sum the ticket sales by state and group by day and sort in ascending order. 
</p>
+<div class="highlight"><pre><code class="language-text" 
data-lang="text">SELECT `right`(tkt.tot_sales.key,2) `December Date`, 
 SUM(tkt.tot_sales.`value`) AS TotalSales 
-FROM (SELECT flatten(kvgen(sales)) tot_sales 
+FROM (SELECT FLATTEN(kvgen(sales)) tot_sales 
 FROM dfs.`/Users/drilluser/ticket_sales.json`) tkt 
 GROUP BY `right`(tkt.tot_sales.key,2) 
 ORDER BY TotalSales;
 
-+---------------+--------------+
-| December_Date |  TotalSales  |
-+---------------+--------------+
-| 11            | 112889       |
-| 10            | 620156       |
-| 21            | 868350       |
-| 19            | 948998       |
-| 15            | 972880       |
-+---------------+--------------+
-5 rows selected (0.203 seconds)
++----------------+-------------+
+| December Date  | TotalSales  |
++----------------+-------------+
+| 11             | 112889.0    |
+| 10             | 620156.0    |
+| 21             | 868350.0    |
+| 19             | 948998.0    |
+| 15             | 972880.0    |
++----------------+-------------+
+5 rows selected (0.252 seconds)
 </code></pre></div>
 <h3 id="example:-access-a-map-field-in-an-array">Example: Access a Map Field 
in an Array</h3>
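
Reviewer note: the revised examples above chain KVGEN, FLATTEN, and SUM over the ticket sales maps. The following Python sketch mimics those steps on the two sales maps shown in the new query output, as a way to cross-check the totals in the updated tables. The helper names kvgen and flatten are stand-ins written for this note, not Drill APIs, and this flatten is simplified: it yields only the array elements and does not copy the other columns of each row.

```python
from collections import defaultdict

# Sales maps copied from the query output shown in this diff.
ticket_sales = [
    {"type": "ticket", "venue": 123455,
     "sales": {"12-10": 532806, "12-11": 112889, "12-19": 898999, "12-21": 10875}},
    {"type": "ticket", "venue": 123456,
     "sales": {"12-10": 87350, "12-19": 49999, "12-21": 857475, "12-15": 972880}},
]


def kvgen(mapping):
    """Turn a map into a list of {"key": ..., "value": ...} pairs."""
    return [{"key": key, "value": value} for key, value in mapping.items()]


def flatten(arrays):
    """Yield every element of every array as its own row."""
    for array in arrays:
        for element in array:
            yield element


# Equivalent in spirit to: SELECT FLATTEN(KVGEN(sales)) FROM ticket_sales.json
pairs = list(flatten(kvgen(row["sales"]) for row in ticket_sales))

# Equivalent in spirit to: SUM(tkt.tot_sales.`value`)
total = sum(pair["value"] for pair in pairs)

# Equivalent in spirit to: GROUP BY `right`(key, 2) ORDER BY TotalSales
by_day = defaultdict(int)
for pair in pairs:
    by_day[pair["key"][-2:]] += pair["value"]
ranked = sorted(by_day.items(), key=lambda item: item[1])
```

The sketch uses integers, so it reproduces the magnitudes in the updated output (3523273 overall) without the trailing ".0" that Drill prints.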
 

http://git-wip-us.apache.org/repos/asf/drill-site/blob/1c6fa3f9/docs/plugin-configuration-introduction/index.html
----------------------------------------------------------------------
diff --git a/docs/plugin-configuration-introduction/index.html b/docs/plugin-configuration-introduction/index.html
index 8681db3..fb67e0c 100644
--- a/docs/plugin-configuration-introduction/index.html
+++ b/docs/plugin-configuration-introduction/index.html
@@ -1033,9 +1033,9 @@ name. Names are case-sensitive.</li>
   </tr>
   <tr>
     <td>&quot;workspaces&quot;. . . &quot;location&quot;</td>
-    <td>&quot;location&quot;: &quot;/&quot;<br>&quot;location&quot;: 
&quot;/tmp&quot;</td>
+    <td>&quot;location&quot;: 
&quot;/Users/johndoe/mydata&quot;<br>&quot;location&quot;: &quot;/tmp&quot;</td>
     <td>no</td>
-    <td>Path to a directory on the file system.</td>
+    <td>Full path to a directory on the file system.</td>
   </tr>
   <tr>
     <td>&quot;workspaces&quot;. . . &quot;writable&quot;</td>

http://git-wip-us.apache.org/repos/asf/drill-site/blob/1c6fa3f9/docs/querying-directories/index.html
----------------------------------------------------------------------
diff --git a/docs/querying-directories/index.html b/docs/querying-directories/index.html
index 411112b..5117204 100644
--- a/docs/querying-directories/index.html
+++ b/docs/querying-directories/index.html
@@ -982,8 +982,8 @@ also query directories of JSON files, for example.</p>
 same structure: <code>plays.csv</code> and <code>moreplays.csv</code>. The 
first file contains 7
 records and the second file contains 3 records. The following query returns
 the &quot;union&quot; of the two files, ordered by the first column:</p>
-<div class="highlight"><pre><code class="language-text" data-lang="text">0: 
jdbc:drill:zk=local&gt; select columns[0] as `Year`, columns[1] as Play 
-from dfs.`/Users/brumsby/drill/testdata` order by 1;
+<div class="highlight"><pre><code class="language-text" data-lang="text">0: 
jdbc:drill:zk=local&gt; SELECT COLUMNS[0] AS `Year`, COLUMNS[1] AS Play 
+FROM dfs.`/Users/brumsby/drill/testdata` order by 1;
 
 +------------+------------------------+
 |    Year    |          Play          |
@@ -1016,7 +1016,7 @@ directories contain JSON files.</p>
 <p>You can query all of these files, or a subset, by referencing the file 
system
 once in a Drill query. For example, the following query counts the number of
 records in all of the files inside the <code>2013</code> directory:</p>
-<div class="highlight"><pre><code class="language-text" data-lang="text">0: 
jdbc:drill:&gt; select count(*) from 
MFS.`/mapr/drilldemo/labs/clicks/logs/2013` ;
+<div class="highlight"><pre><code class="language-text" data-lang="text">0: 
jdbc:drill:&gt; SELECT COUNT(*) FROM 
MFS.`/mapr/drilldemo/labs/clicks/logs/2013` ;
 +------------+
 |   EXPR$0   |
 +------------+
@@ -1030,7 +1030,7 @@ is a workspace that points to the <code>logs</code> 
directory, which contains mu
 subdirectories: <code>2012</code>, <code>2013</code>, and <code>2014</code>. 
The following query constrains
 files inside the subdirectory named <code>2013</code>. The variable 
<code>dir0</code> refers to the
 first level down from logs, <code>dir1</code> to the next level, and so on.</p>
-<div class="highlight"><pre><code class="language-text" data-lang="text">0: 
jdbc:drill:&gt; use bob.logdata;
+<div class="highlight"><pre><code class="language-text" data-lang="text">0: 
jdbc:drill:&gt; USE bob.logdata;
 +------------+-----------------------------------------+
 |     ok     |              summary                    |
 +------------+-----------------------------------------+
@@ -1038,7 +1038,7 @@ first level down from logs, <code>dir1</code> to the next 
level, and so on.</p>
 +------------+-----------------------------------------+
 1 row selected (0.305 seconds)
 
-0: jdbc:drill:&gt; select * from logs where dir0=&#39;2013&#39; limit 10;
+0: jdbc:drill:&gt; SELECT * FROM logs WHERE dir0=&#39;2013&#39; LIMIT 10;
 
+------------+------------+------------+------------+------------+------------+------------+------------+------------+-------------+
 |    dir0    |    dir1    |  trans_id  |    date    |    time    |  cust_id   
|   device   |   state    |  camp_id   |  keywords   |
 
+------------+------------+------------+------------+------------+------------+------------+------------+------------+-------------+
@@ -1055,8 +1055,6 @@ first level down from logs, <code>dir1</code> to the next 
level, and so on.</p>
 
+------------+------------+------------+------------+------------+------------+------------+------------+------------+-------------+
 10 rows selected (0.583 seconds)
 </code></pre></div>
-<p>For more information about querying directories, see the section, <a 
href="/docs/query-directory-functions">&quot;Query Directory 
Functions&quot;</a>.</p>
-
     
       
         <div class="doc-nav">
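
Reviewer note: the page touched above explains that dir0 refers to the first directory level below the workspace root, dir1 to the next, and so on. The sketch below models that mapping in Python. The function name dir_columns, the subdirectory "8", and the file name "part.json" are hypothetical; only the logs path and the 2013 directory come from the page.

```python
from pathlib import PurePosixPath


def dir_columns(workspace_root, file_path):
    """Map each directory level below workspace_root to dir0, dir1, ...

    This is an illustrative model of the dirN variables, not Drill code.
    """
    relative = PurePosixPath(file_path).relative_to(workspace_root)
    # Every component except the last one (the file name) is a directory level.
    return {f"dir{level}": part for level, part in enumerate(relative.parts[:-1])}


columns = dir_columns(
    "/mapr/drilldemo/labs/clicks/logs",
    "/mapr/drilldemo/labs/clicks/logs/2013/8/part.json",  # hypothetical file
)
```

A filter such as dir0='2013' then corresponds to keeping only the files whose first level below logs is 2013.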

http://git-wip-us.apache.org/repos/asf/drill-site/blob/1c6fa3f9/docs/querying-plain-text-files/index.html
----------------------------------------------------------------------
diff --git a/docs/querying-plain-text-files/index.html b/docs/querying-plain-text-files/index.html
index bcd19a5..ed5ffa5 100644
--- a/docs/querying-plain-text-files/index.html
+++ b/docs/querying-plain-text-files/index.html
@@ -972,9 +972,8 @@
 
     <div class="int_text" align="left">
       
-        <p>You can use Drill to access both structured file types and plain 
text files
-(flat files). This section shows a few simple examples that work on flat
-files:</p>
+        <p>You can use Drill to access structured file types and plain text 
files
+(flat files), such as the following file types:</p>
 
 <ul>
 <li>CSV files (comma-separated values)</li>
@@ -982,8 +981,15 @@ files:</p>
 <li>PSV files (pipe-separated values)</li>
 </ul>
 
-<p>The examples here show CSV files, but queries against TSV and PSV files 
return
-equivalent results. However, make sure that your registered storage plugins
+<p>Follow these general guidelines for querying a plain text file:</p>
+
+<ul>
+<li>Use a storage plugin that defines the file format, such as comma-separated 
(CSV) or tab-separated values (TSV), of the data in the plain text file.</li>
+<li>In the SELECT statement, use the <code>COLUMNS[n]</code> syntax in lieu of 
column names, which do not exist in a plain text file. The first column is 
column <code>0</code>.</li>
+<li>In the FROM clause, use the path to the plain text file instead of using a 
table name. Enclose the path and file name in back ticks. </li>
+</ul>
+
+<p>Make sure that your registered storage plugins
 recognize the appropriate file types and extensions. For example, the
 following configuration expects PSV files (files with a pipe delimiter) to
 have a <code>tbl</code> extension, not a <code>psv</code> extension. Drill 
returns a &quot;file not
@@ -1083,6 +1089,104 @@ from dfs.`/Users/brumsby/drill/plays.csv` where 
columns[0]&gt;1599;
 <p>Note that the restriction with the use of aliases applies to queries against
 all data sources.</p>
 
+<h2 id="example-of-querying-a-tsv-file">Example of Querying a TSV File</h2>
+
+<p>This example uses a tab-separated value (TSV) file that you download from a
+Google internet site. The data in the file consists of phrases from books that
+Google scans and generates for its <a 
href="http://storage.googleapis.com/books/ngrams/books/datasetsv2.html">Google 
Books Ngram
+Viewer</a>. You
+use the data to find the relative frequencies of Ngrams. </p>
+
+<h3 id="about-the-data">About the Data</h3>
+
+<p>Each line in the TSV file has the following structure:</p>
+
+<p><code>ngram TAB year TAB match_count TAB volume_count NEWLINE</code></p>
+
+<p>For example, lines 1722089 and 1722090 in the file contain this data:</p>
+
+<table ><tbody><tr><th >ngram</th><th >year</th><th colspan="1" 
>match_count</th><th >volume_count</th></tr><tr><td ><p class="p1">Zoological 
Journal of the Linnean</p></td><td >2007</td><td colspan="1" >284</td><td 
>101</td></tr><tr><td colspan="1" ><p class="p1">Zoological Journal of the 
Linnean</p></td><td colspan="1" >2008</td><td colspan="1" >257</td><td 
colspan="1" >87</td></tr></tbody></table> 
+  
+
+<p>In 2007, &quot;Zoological Journal of the Linnean&quot; occurred 284 times 
overall in 101
+distinct books of the Google sample.</p>
+
+<h3 id="download-and-set-up-the-data">Download and Set Up the Data</h3>
+
+<p>After downloading the file, you use the <code>dfs</code> storage plugin, 
and then select
+data from the file as you would a table. In the SELECT statement, enclose the
+path and name of the file in back ticks.</p>
+
+<ol>
+<li><p>Download the compressed Google Ngram data from this location:  </p>
+
+<p><a 
href="http://storage.googleapis.com/books/ngrams/books/googlebooks-eng-all-5gram-20120701-zo.gz">http://storage.googleapis.com/books/ngrams/books/googlebooks-eng-all-5gram-20120701-zo.gz</a></p></li>
+<li><p>Unzip the file.<br>
+ A file named googlebooks-eng-all-5gram-20120701-zo appears.</p></li>
+<li><p>Change the file name to add a <code>.tsv</code> extension.<br>
+The Drill <code>dfs</code> storage plugin definition includes a TSV format 
that requires
+a file to have this extension. Later, you learn how to skip this step and 
query the GZ file directly.</p></li>
+</ol>
+
+<h3 id="query-the-data">Query the Data</h3>
+
+<p>Get data about &quot;Zoological Journal of the Linnean&quot; that appears 
more than 250
+times a year in the books that Google scans.</p>
+
+<ol>
+<li><p>Switch back to using the <code>dfs</code> storage plugin.</p>
+<div class="highlight"><pre><code class="language-text" data-lang="text">  USE 
dfs;
+</code></pre></div></li>
+<li><p>Issue a SELECT statement to get the first three columns in the file.  
</p>
+
+<ul>
+<li>In the FROM clause of the example, substitute your path to the TSV 
file.<br></li>
+<li>Use aliases to replace the column headers, such as EXPR$0, with 
user-friendly column headers, Ngram, Publication Date, and Frequency.</li>
+<li>In the WHERE clause, enclose the string literal &quot;Zoological Journal 
of the Linnean&quot; in single quotation marks.<br></li>
+<li><p>Limit the output to 10 rows.  </p>
+<div class="highlight"><pre><code class="language-text" 
data-lang="text">SELECT COLUMNS[0] AS Ngram,
+       COLUMNS[1] AS Publication_Date,
+       COLUMNS[2] AS Frequency
+FROM `/Users/drilluser/Downloads/googlebooks-eng-all-5gram-20120701-zo.tsv`
+WHERE ((columns[0] = &#39;Zoological Journal of the Linnean&#39;)
+AND (columns[2] &gt; 250)) LIMIT 10;
+</code></pre></div></li>
+</ul>
+
+<p>The output is:</p>
+<div class="highlight"><pre><code class="language-text" data-lang="text"> 
+------------------------------------+-------------------+------------+
+ |               Ngram                | Publication_Date  | Frequency  |
+ +------------------------------------+-------------------+------------+
+ | Zoological Journal of the Linnean  | 1993              | 297        |
+ | Zoological Journal of the Linnean  | 1997              | 255        |
+ | Zoological Journal of the Linnean  | 2003              | 254        |
+ | Zoological Journal of the Linnean  | 2007              | 284        |
+ | Zoological Journal of the Linnean  | 2008              | 257        |
+ +------------------------------------+-------------------+------------+
+ 5 rows selected (1.175 seconds)
+</code></pre></div></li>
+</ol>
+
+<p>The Drill default storage plugins support common file formats. </p>
+
+<h2 id="query-the-gz-file-directly">Query the GZ File Directly</h2>
+
+<p>This example covers how to query the GZ file containing the compressed TSV 
data. The GZ file name needs to be renamed to specify the type of delimited 
file, such as CSV or TSV. You add <code>.tsv</code> before the <code>.gz</code> 
extension in this example.</p>
+
+<ol>
+<li>Rename the GZ file <code>googlebooks-eng-all-5gram-20120701-zo.gz</code> 
to googlebooks-eng-all-5gram-20120701-zo.tsv.gz.</li>
+<li><p>Query the renamed GZ file directly to get data about &quot;Zoological 
Journal of the Linnean&quot; that appears more than 250 times a year in the 
books that Google scans. In the FROM clause, instead of using the full path to 
the file as you did in the last exercise, connect to the data using the storage 
plugin workspace name ngram.</p>
+<div class="highlight"><pre><code class="language-text" data-lang="text"> 
SELECT COLUMNS[0], 
+        COLUMNS[1], 
+        COLUMNS[2] 
+ FROM 
dfs.`/Users/drilluser/Downloads/googlebooks-eng-all-5gram-20120701-zo.tsv.gz` 
+ WHERE ((columns[0] = &#39;Zoological Journal of the Linnean&#39;) 
+ AND (columns[2] &gt; 250)) 
+ LIMIT 10;
+</code></pre></div>
+<p>The 5 rows of output appear.  </p></li>
+</ol>
+
     
       
         <div class="doc-nav">
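
Reviewer note: the new TSV and GZ sections select COLUMNS[0] through COLUMNS[2] from a gzip-compressed, tab-separated file and filter on the ngram and its match count. The Python sketch below runs the same filter over an in-memory gzip payload, which may help when sanity-checking the documented output. The two "Zoological Journal" lines are the ones quoted in the new page; the third line is hypothetical filler added only so the filter has something to reject. The function name query is also hypothetical.

```python
import csv
import gzip
import io

# Lines follow the layout: ngram TAB year TAB match_count TAB volume_count NEWLINE
sample = (
    "Zoological Journal of the Linnean\t2007\t284\t101\n"
    "Zoological Journal of the Linnean\t2008\t257\t87\n"
    "hypothetical filler ngram\t2007\t999\t1\n"  # not from the real data set
)
compressed = gzip.compress(sample.encode("utf-8"))


def query(gz_bytes, ngram, min_count, limit=10):
    """Return up to `limit` (ngram, year, match_count) rows that pass the filter."""
    results = []
    with gzip.open(io.BytesIO(gz_bytes), mode="rt", encoding="utf-8", newline="") as handle:
        reader = csv.reader(handle, delimiter="\t", quoting=csv.QUOTE_NONE)
        for columns in reader:
            # Fields arrive as text, so the count is converted before comparing.
            if columns[0] == ngram and int(columns[2]) > min_count:
                results.append((columns[0], columns[1], columns[2]))
                if len(results) == limit:
                    break
    return results


rows = query(compressed, "Zoological Journal of the Linnean", 250)
```

Reading the compressed bytes directly mirrors the point of the "Query the GZ File Directly" section: no separate unzip step is needed before filtering.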

http://git-wip-us.apache.org/repos/asf/drill-site/blob/1c6fa3f9/docs/supported-data-types/index.html
----------------------------------------------------------------------
diff --git a/docs/supported-data-types/index.html b/docs/supported-data-types/index.html
index 9e7d960..46bcf82 100644
--- a/docs/supported-data-types/index.html
+++ b/docs/supported-data-types/index.html
@@ -1084,11 +1084,11 @@
 
 <p><code>a[1]</code>  </p>
 
-<p>You can refer to the value for a key in a map using this syntax:</p>
+<p>You can refer to the value for a key in a map using dot notation:</p>
 
-<p><code>m[&#39;k&#39;]</code></p>
+<p><code>t.m.k</code></p>
 
-<p>The section <a href="/docs/querying-complex-data-introduction">“Query 
Complex Data”</a> shows how to use <a 
href="/docs/supported-data-types/#composite-types">composite types</a> to 
access nested arrays. <a 
href="/docs/handling-different-data-types/#handling-json-and-parquet-data">&quot;Handling
 Different Data Types&quot;</a> includes examples of JSON maps and arrays. 
Drill provides functions for handling array and map types:</p>
+<p>The section <a href="/docs/querying-complex-data-introduction">“Query 
Complex Data”</a> shows how to use composite types to access nested arrays. 
<a 
href="/docs/handling-different-data-types/#handling-json-and-parquet-data">&quot;Handling
 Different Data Types&quot;</a> includes examples of JSON maps and arrays. 
Drill provides functions for handling array and map types:</p>
 
 <ul>
 <li><a href="/docs/kvgen/">&quot;KVGEN&quot;</a></li>

http://git-wip-us.apache.org/repos/asf/drill-site/blob/1c6fa3f9/feed.xml
----------------------------------------------------------------------
diff --git a/feed.xml b/feed.xml
index 2a70748..65f07b2 100644
--- a/feed.xml
+++ b/feed.xml
@@ -6,8 +6,8 @@
 </description>
     <link>/</link>
     <atom:link href="/feed.xml" rel="self" type="application/rss+xml"/>
-    <pubDate>Fri, 22 May 2015 15:08:04 -0700</pubDate>
-    <lastBuildDate>Fri, 22 May 2015 15:08:04 -0700</lastBuildDate>
+    <pubDate>Tue, 26 May 2015 17:17:58 -0700</pubDate>
+    <lastBuildDate>Tue, 26 May 2015 17:17:58 -0700</lastBuildDate>
     <generator>Jekyll v2.5.2</generator>
     
       <item>
