http://git-wip-us.apache.org/repos/asf/drill/blob/d959a210/_docs/drill-docs/manage/004-partition-prune.md
----------------------------------------------------------------------
diff --git a/_docs/drill-docs/manage/004-partition-prune.md 
b/_docs/drill-docs/manage/004-partition-prune.md
deleted file mode 100644
index fa81034..0000000
--- a/_docs/drill-docs/manage/004-partition-prune.md
+++ /dev/null
@@ -1,75 +0,0 @@
----
-title: "Partition Pruning"
-parent: "Manage Drill"
----
-Partition pruning is a performance optimization that limits the number of
-files and partitions that Drill reads when querying file systems and Hive
-tables. Drill only reads a subset of the files that reside in a file system or
-a subset of the partitions in a Hive table when a query matches certain filter
-criteria.
-
-For Drill to apply partition pruning to Hive tables, you must have created the
-tables in Hive using the `PARTITIONED BY` clause:
-
-`CREATE TABLE <table_name> (<column_name>) PARTITIONED BY (<column_name>);`
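-
-For example, a hypothetical `logs` table partitioned by year might be created
-in Hive as follows (the table and column names here are illustrative only):
-
-    CREATE TABLE logs (cust_id INT, url STRING)
-    PARTITIONED BY (log_year STRING);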
-
-When you create Hive tables using the `PARTITIONED BY` clause, each partition of
-data is automatically split out into different directories as data is written
-to disk. For more information about Hive partitioning, refer to the [Apache
-Hive wiki](https://cwiki.apache.org/confluence/display/Hive/LanguageManual+DDL/#LanguageManualDDL-PartitionedTables).
-
-Typically, table data in a file system is organized by directories and
-subdirectories. Queries on table data may contain `WHERE` clause filters on
-specific directories.
-
-Drill’s query planner evaluates the filters as part of a Filter operator. If
-no partition filters are present, the underlying Scan operator reads all files
-in all directories and then sends the data to operators downstream, such as
-Filter.
-
-When partition filters are present, the query planner determines if it can
-push the filters down to the Scan such that the Scan only reads the
-directories that match the partition filters, thus reducing disk I/O.
-
-## Partition Pruning Example
-
-The `/Users/max/data/logs` directory in a file system contains subdirectories
-that span a few years.
-
-The following image shows the hierarchical structure of the `…/logs` directory
-and (sub) directories:
-
-![](../../img/54.png)
-
-The following query requests log file data for 2013 from the `…/logs`
-directory in the file system:
-
-    SELECT * FROM dfs.`/Users/max/data/logs` WHERE cust_id < 10 and dir0 = 2013 limit 2;
-
-If you run the `EXPLAIN PLAN` command for the query, you can see that the
-`…/logs` directory is filtered by the scan operator.
-
-    EXPLAIN PLAN FOR SELECT * FROM dfs.`/Users/max/data/logs` WHERE cust_id < 10 and dir0 = 2013 limit 2;
-
-The following image shows a portion of the physical plan when partition
-pruning is applied:
-
-![](../../img/21.png)
-
-## Filter Examples
-
-The following queries include examples of the types of filters eligible for
-partition pruning optimization:
-
-**Example 1: Partition filters ANDed together**
-
-    SELECT * FROM dfs.`/Users/max/data/logs` WHERE dir0 = '2014' AND dir1 = '1'
-
-**Example 2: Partition filter ANDed with regular column filter**
-
-    SELECT * FROM dfs.`/Users/max/data/logs` WHERE cust_id < 10 AND dir0 = 2013 limit 2;
-
-**Example 3: Combination of AND, OR involving partition filters**
-
-    SELECT * FROM dfs.`/Users/max/data/logs` WHERE (dir0 = '2013' AND dir1 = '1') OR (dir0 = '2014' AND dir1 = '2')
-

http://git-wip-us.apache.org/repos/asf/drill/blob/d959a210/_docs/drill-docs/manage/005-monitor-cancel.md
----------------------------------------------------------------------
diff --git a/_docs/drill-docs/manage/005-monitor-cancel.md 
b/_docs/drill-docs/manage/005-monitor-cancel.md
deleted file mode 100644
index 6888eea..0000000
--- a/_docs/drill-docs/manage/005-monitor-cancel.md
+++ /dev/null
@@ -1,30 +0,0 @@
----
-title: "Monitoring and Canceling Queries in the Drill Web UI"
-parent: "Manage Drill"
----
-You can monitor and cancel queries from the Drill Web UI. To access the Drill
-Web UI, the Drillbit process must be running on the Drill node that you use
-to access it.
-
-To monitor or cancel a query from the Drill Web UI, complete the following
-steps:
-
-  1. Navigate to the Drill Web UI at `<drill_node_ip_address>:8047`.  
-When you access the Drill Web UI, you see some general information about Drill
-running in your cluster, such as the nodes running the Drillbit process, the
-various ports Drill is using, and the amount of direct memory assigned to
-Drill.  
-![](../../img/7.png)
-
-  2. Select **Profiles** in the toolbar. A list of running and completed queries appears. Drill assigns a query ID to each query and lists the Foreman node. The Foreman is the Drillbit node that receives the query from the client or application. The Foreman drives the entire query.  
-![](../../img/51.png)
-
-  3. Click the **Query ID** for the query that you want to monitor or cancel. The Query and Planning window appears.  
-![](../../img/4.png)
-
-  4. Select **Edit Query**.
-  5. Click **Cancel query** to cancel the query. The following message appears:  
-![](../../img/46.png)
-
-  6. Optionally, you can re-run the query to see a query summary in this window.
-

http://git-wip-us.apache.org/repos/asf/drill/blob/d959a210/_docs/drill-docs/manage/conf/001-mem-alloc.md
----------------------------------------------------------------------
diff --git a/_docs/drill-docs/manage/conf/001-mem-alloc.md 
b/_docs/drill-docs/manage/conf/001-mem-alloc.md
deleted file mode 100644
index 4508935..0000000
--- a/_docs/drill-docs/manage/conf/001-mem-alloc.md
+++ /dev/null
@@ -1,31 +0,0 @@
----
-title: "Memory Allocation"
-parent: "Configuration Options"
----
-You can configure the amount of direct memory allocated to a Drillbit for
-query processing. The default limit is 8G, but Drill prefers 16G or more
-depending on the workload. The total amount of direct memory that a Drillbit
-allocates to query operations cannot exceed the limit set.
-
-Drill mainly uses Java direct memory and performs well when executing
-operations in memory instead of spilling operations to disk. Drill does not
-write to disk unless absolutely necessary, unlike MapReduce, where everything
-is written to disk during each phase of a job.
-
-The JVM’s heap memory does not limit the amount of direct memory available in
-a Drillbit. The on-heap memory for Drill is only about 4-8G, which should
-suffice because Drill avoids having data sit in heap memory.
-
-#### Modifying Drillbit Memory
-
-You can modify memory for each Drillbit node in your cluster. To modify the
-memory for a Drillbit, edit the `-XX:MaxDirectMemorySize` parameter in the
-Drillbit startup script located in
-`<drill_installation_directory>/conf/drill-env.sh`.
-
-**Note:** If this parameter is not set, the limit depends on the amount of available system memory.
-
-After you edit `<drill_installation_directory>/conf/drill-env.sh`, [restart
-the Drillbit](https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=44994063)
-on the node.
\ No newline at end of file

http://git-wip-us.apache.org/repos/asf/drill/blob/d959a210/_docs/drill-docs/manage/conf/002-startup-opt.md
----------------------------------------------------------------------
diff --git a/_docs/drill-docs/manage/conf/002-startup-opt.md 
b/_docs/drill-docs/manage/conf/002-startup-opt.md
deleted file mode 100644
index 923139f..0000000
--- a/_docs/drill-docs/manage/conf/002-startup-opt.md
+++ /dev/null
@@ -1,50 +0,0 @@
----
-title: "Start-Up Options"
-parent: "Configuration Options"
----
-Drill’s start-up options reside in files that use the HOCON configuration
-format, which is a hybrid between a properties file and a JSON file. Drill
-start-up options consist of a group of files with a nested relationship. At
-the core of the file hierarchy is `drill-default.conf`. This file is
-overridden by one or more `drill-module.conf` files, which are overridden by
-the `drill-override.conf` file that you define.
-
-You can see the following group of files throughout the source repository in
-Drill:
-
-       common/src/main/resources/drill-default.conf
-       common/src/main/resources/drill-module.conf
-       contrib/storage-hbase/src/main/resources/drill-module.conf
-       contrib/storage-hive/core/src/main/resources/drill-module.conf
-       contrib/storage-hive/hive-exec-shade/src/main/resources/drill-module.conf
-       exec/java-exec/src/main/resources/drill-module.conf
-       distribution/src/resources/drill-override.conf
-
-These files are listed inside the associated JAR files in the Drill
-distribution tarball.
-
-Each Drill module has a set of options that Drill incorporates. Drill’s
-modular design enables you to create new storage plugins, define new
-operators, or create UDFs. You can also include additional configuration
-options that you can override as necessary.
-
-When you add a JAR file to Drill, you must include a `drill-module.conf` file
-in the root directory of the JAR file that you add. The `drill-module.conf`
-file tells Drill to scan that JAR file or associated object and include it.
-
-#### Viewing Startup Options
-
-You can run the following query to see a list of Drill’s startup options:
-
-    SELECT * FROM sys.options WHERE type='BOOT'
-
-#### Configuring Start-Up Options
-
-You can configure start-up options for each Drillbit in the
-`drill-override.conf` file located in Drill’s `/conf` directory.
-
-You may want to configure the following start-up options that control certain
-behaviors in Drill:
-
-| Option | Default Value | Description |
-|--------|---------------|-------------|
-| drill.exec.sys.store.provider | ZooKeeper | Defines the persistent storage (PStore) provider. The PStore holds configuration and profile data. For more information about PStores, see [Persistent Configuration Storage](https://cwiki.apache.org/confluence/display/DRILL/Persistent+Configuration+Storage). |
-| drill.exec.buffer.size | | Defines the amount of memory available, in terms of record batches, to hold data on the downstream side of an operation. Drill pushes data downstream as quickly as possible to make data immediately available. This requires Drill to use memory to hold the data pending operations. When data on a downstream operation is required, that data is immediately available so Drill does not have to go over the network to process it. Providing more memory to this option increases the speed at which Drill completes a query. |
-| drill.exec.sort.external.directories, drill.exec.sort.external.fs | | These options control spooling. The drill.exec.sort.external.directories option tells Drill which directory to use when spooling. The drill.exec.sort.external.fs option tells Drill which file system to use when spooling beyond memory files. Drill uses a spool and sort operation for beyond-memory operations. The sorting operation is designed to spool to a Hadoop file system. The default Hadoop file system is a local file system in the /tmp directory. Spooling performance (both writing and reading back from it) is constrained by the file system. For MapR clusters, use MapReduce volumes or set up local volumes to use for spooling purposes. Volumes improve performance and stripe data across as many disks as possible. |
-| drill.exec.debug.error_on_leak | True | Determines how Drill behaves when memory leaks occur during a query. By default, this option is enabled so that queries fail when memory leaks occur. If you disable the option, Drill issues a warning when a memory leak occurs and completes the query. |
-| drill.exec.zk.connect | localhost:2181 | Provides Drill with the ZooKeeper quorum to use to connect to data sources. Change this setting to point to the ZooKeeper quorum that you want Drill to use. You must configure this option on each Drillbit node. |
-| drill.exec.cluster-id | my_drillbit_cluster | Identifies the cluster that corresponds with the ZooKeeper quorum indicated. It also provides Drill with the name of the cluster used during UDP multicast. You must change the default cluster-id if there are multiple clusters on the same subnet. If you do not change the ID, the clusters will try to connect to each other to create one cluster. |
-

http://git-wip-us.apache.org/repos/asf/drill/blob/d959a210/_docs/drill-docs/manage/conf/003-plan-exec.md
----------------------------------------------------------------------
diff --git a/_docs/drill-docs/manage/conf/003-plan-exec.md 
b/_docs/drill-docs/manage/conf/003-plan-exec.md
deleted file mode 100644
index c045e8f..0000000
--- a/_docs/drill-docs/manage/conf/003-plan-exec.md
+++ /dev/null
@@ -1,37 +0,0 @@
----
-title: "Planning and Execution Options"
-parent: "Configuration Options"
----
-You can set Drill query planning and execution options per cluster, at the
-system or session level. Options set at the session level only apply to
-queries that you run during the current Drill connection. Options set at the
-system level affect the entire system and persist between restarts. Session
-level settings override system level settings.
-
-#### Querying Planning and Execution Options
-
-You can run the following query to see a list of the system and session
-planning and execution options:
-
-    SELECT name FROM sys.options WHERE type in ('SYSTEM','SESSION');
-
-#### Configuring Planning and Execution Options
-
-Use the `ALTER SYSTEM` or `ALTER SESSION` commands to set options. Typically,
-you set the options at the session level unless you want the setting to
-persist across all sessions.
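-
-For example, the following statements (a sketch; the option names are real
-Drill options from the table below, but the values are illustrative only)
-set an option for the current session and for the entire system:
-
-    ALTER SESSION SET `planner.enable_hashjoin` = false;
-    ALTER SYSTEM SET `planner.broadcast_threshold` = 500000;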
-
-The following table contains planning and execution options that you can set
-at the system or session level:
-
-| Option name | Default value | Description |
-|-------------|---------------|-------------|
-| exec.errors.verbose | false | This option enables or disables the verbose message that Drill returns when a query fails. When enabled, Drill provides additional information about failed queries. |
-| exec.max_hash_table_size | 1073741824 | The default maximum size for hash tables. |
-| exec.min_hash_table_size | 65536 | The default starting size for hash tables. Increasing this size is useful for very large aggregations or joins when you have large amounts of memory for Drill to use. Drill can spend a lot of time resizing the hash table as it finds new data. If you have large data sets, you can increase this hash table size to increase performance. |
-| planner.add_producer_consumer | false | This option enables or disables a secondary reading thread that works out of band of the rest of the scanning fragment to prefetch data from disk. If you interact with a certain type of storage medium that is slow or does not prefetch much data, this option tells Drill to add a producer consumer reading thread to the operation. Drill can then assign one thread that focuses on a single reading fragment. If Drill is using memory, you can disable this option to get better performance. If Drill is using disk space, you should enable this option and set a reasonable queue size for the planner.producer_consumer_queue_size option. |
-| planner.broadcast_threshold | 1000000 | Threshold, in terms of a number of rows, that determines whether a broadcast join is chosen for a query. Regardless of the setting of the broadcast_join option (enabled or disabled), a broadcast join is not chosen unless the right side of the join is estimated to contain fewer rows than this threshold. The intent of this option is to avoid broadcasting too many rows for join purposes. Broadcasting involves sending data across nodes and is a network-intensive operation. (The "right side" of the join, which may itself be a join or simply a table, is determined by cost-based optimizations and heuristics during physical planning.) |
-| planner.enable_broadcast_join, planner.enable_hashagg, planner.enable_hashjoin, planner.enable_mergejoin, planner.enable_multiphase_agg, planner.enable_streamagg | true | These options enable or disable specific aggregation and join operators for queries. These operators are all enabled by default and in general should not be disabled. Hash aggregation and hash join are hash-based operations. Streaming aggregation and merge join are sort-based operations. Both hash-based and sort-based operations consume memory; however, currently, hash-based operations do not spill to disk as needed, but the sort-based operations do. If large hash operations do not fit in memory on your system, you may need to disable these operations. Queries will continue to run, using alternative plans. |
-| planner.producer_consumer_queue_size | 10 | Determines how much data to prefetch from disk (in record batches) out of band of query execution. The larger the queue size, the greater the amount of memory that the queue and overall query execution consumes. |
-| planner.slice_target | 100000 | The number of records manipulated within a fragment before Drill parallelizes them. |
-| planner.width.max_per_node | Depends on the number of cores on each node | In this context "width" refers to fanout or distribution potential: the ability to run a query in parallel across the cores on a node and the nodes on a cluster. A physical plan consists of intermediate operations, known as query "fragments," that run concurrently, yielding opportunities for parallelism above and below each exchange operator in the plan. An exchange operator represents a breakpoint in the execution flow where processing can be distributed. For example, a single-process scan of a file may flow into an exchange operator, followed by a multi-process aggregation fragment. The maximum width per node defines the maximum degree of parallelism for any fragment of a query, but the setting applies at the level of a single node in the cluster. The *default* maximum degree of parallelism per node is calculated as `number of active drillbits (typically one per node) * number of cores per node * 0.7`; the theoretical maximum is automatically scaled back (and rounded down) so that only 70% of the actual available capacity is taken into account. For example, on a single-node test system with 2 cores and hyper-threading enabled: `1 * 4 * 0.7 = 3`. When you modify the default setting, you can supply any meaningful number. The system does not automatically scale down your setting. |
-| planner.width.max_per_query | 1000 | The max_per_query value also sets the maximum degree of parallelism for any given stage of a query, but the setting applies to the query as executed by the whole cluster (multiple nodes). In effect, the actual maximum width per query is the *minimum of two values*: `min(number of nodes * width.max_per_node, width.max_per_query)`. For example, on a 4-node cluster where width.max_per_node is set to 6 and width.max_per_query is set to 30: `min(4 * 6, 30) = 24`. In this case, the effective maximum width per query is 24, not 30. |
-| store.format | | Output format for data that is written to tables with the CREATE TABLE AS (CTAS) command. |
-| store.json.all_text_mode | false | This option enables or disables text mode. When enabled, Drill reads everything in JSON as a text object instead of trying to interpret data types. This allows complicated JSON to be read using CASE and CAST. |
-| store.parquet.block-size | 536870912 | Target size for a parquet row group, which should be equal to or less than the configured HDFS block size. |
\ No newline at end of file

http://git-wip-us.apache.org/repos/asf/drill/blob/d959a210/_docs/drill-docs/manage/conf/004-persist-conf.md
----------------------------------------------------------------------
diff --git a/_docs/drill-docs/manage/conf/004-persist-conf.md 
b/_docs/drill-docs/manage/conf/004-persist-conf.md
deleted file mode 100644
index 99c22b5..0000000
--- a/_docs/drill-docs/manage/conf/004-persist-conf.md
+++ /dev/null
@@ -1,93 +0,0 @@
----
-title: "Persistent Configuration Storage"
-parent: "Configuration Options"
----
-Drill stores persistent configuration data in a persistent configuration store
-(PStore). This data is encoded in JSON or Protobuf format. Drill can use the
-local file system, ZooKeeper, HBase, or MapR-DB to store this data. The data
-stored in a PStore includes state information for storage plugins, query
-profiles, and ALTER SYSTEM settings. The default type of PStore configured
-depends on the Drill installation mode.
-
-The following table provides the persistent storage mode for each of the Drill
-modes:
-
-| Mode | Description |
-|------|-------------|
-| Embedded | Drill stores persistent data in the local file system. You cannot modify the PStore location for Drill in embedded mode. |
-| Distributed | Drill stores persistent data in ZooKeeper, by default. You can modify where ZooKeeper offloads data, or you can change the persistent storage mode to HBase or MapR-DB. |
-  
-**Note:** Switching between storage modes does not migrate configuration data.
-
-## ZooKeeper for Persistent Configuration Storage
-
-To make Drill installation and configuration simple, Drill uses ZooKeeper to
-store persistent configuration data. The ZooKeeper PStore provider stores all
-of the persistent configuration data in ZooKeeper except for query profile
-data.
-
-The ZooKeeper PStore provider offloads query profile data to the
-`${DRILL_LOG_DIR:-/var/log/drill}` directory on Drill nodes. If you want the
-query profile data stored in a specific location, you can configure where
-ZooKeeper offloads the data.
-
-To modify where the ZooKeeper PStore provider offloads query profile data,
-configure the `sys.store.provider.zk.blobroot` property in the `drill.exec`
-block in `<drill_installation_directory>/conf/drill-override.conf` on each
-Drill node and then restart the Drillbit service.
-
-**Example**
-
-       drill.exec: {
-        cluster-id: "my_cluster_com-drillbits",
-        zk.connect: "<zkhostname>:<port>",
-        sys.store.provider.zk.blobroot: "maprfs://<directory to store pstore data>/"
-       }
-
-Issue the following command to restart the Drillbit on all Drill nodes:
-
-    maprcli node services -name drill-bits -action restart -nodes <node IP addresses separated by a space>
-
-## HBase for Persistent Configuration Storage
-
-To change the persistent storage mode for Drill, add or modify the
-`sys.store.provider` block in
-`<drill_installation_directory>/conf/drill-override.conf`.
-
-**Example**
-
-       sys.store.provider: {
-           class: "org.apache.drill.exec.store.hbase.config.HBasePStoreProvider",
-           hbase: {
-             table : "drill_store",
-             config: {
-             "hbase.zookeeper.quorum": "<ip_address>,<ip_address>,<ip_address 
>,<ip_address>",
-             "hbase.zookeeper.property.clientPort": "2181"
-             }
-           }
-         },
-
-## MapR-DB for Persistent Configuration Storage
-
-The MapR-DB plugin will be released soon. You can [compile Drill from
-source](/confluence/display/DRILL/Compiling+Drill+from+Source) to try out this
-new feature.
-
-If you have MapR-DB in your cluster, you can use MapR-DB for persistent
-configuration storage. Using MapR-DB to store persistent configuration data
-can prevent memory strain on ZooKeeper in clusters running heavy workloads.
-
-To change the persistent storage mode to MapR-DB, add or modify the
-`sys.store.provider` block in
-`<drill_installation_directory>/conf/drill-override.conf` on each Drill node
-and then restart the Drillbit service.
-
-**Example**
-
-       sys.store.provider: {
-       class: "org.apache.drill.exec.store.hbase.config.HBasePStoreProvider",
-       hbase: {
-         table : "/tables/drill_store",
-           }
-       },
-
-Issue the following command to restart the Drillbit on all Drill nodes:
-
-    maprcli node services -name drill-bits -action restart -nodes <node IP addresses separated by a space>
-

http://git-wip-us.apache.org/repos/asf/drill/blob/d959a210/_docs/drill-docs/progress/001-2014-q1.md
----------------------------------------------------------------------
diff --git a/_docs/drill-docs/progress/001-2014-q1.md 
b/_docs/drill-docs/progress/001-2014-q1.md
deleted file mode 100644
index 1909753..0000000
--- a/_docs/drill-docs/progress/001-2014-q1.md
+++ /dev/null
@@ -1,204 +0,0 @@
----
-title: "2014 Q1 Drill Report"
-parent: "Progress Reports"
----
-
-Apache: Project Drill
-
-Description:
-
-Apache Drill is a distributed system for interactive analysis of large-scale
-datasets that is based on Google's Dremel. Its goal is to efficiently process
-nested data, scale to 10,000 servers or more, and process petabytes of data
-and trillions of records in seconds.
-
-Drill has been incubating since 2012-08-11.
-
-Three Issues to Address in Move to Graduation:
-
-1\. Continue to attract new developers and early users with a variety of
-skills and viewpoints
-
-2\. Continue to develop deeper community skills and knowledge by building
-additional releases
-
-3\. Demonstrate community robustness by rotating project tasks among multiple
-project members
-
-Issues to Call to Attention of PMC or ASF Board:
-
-None
-
-How community has developed since last report:
-
-Community awareness and participation were strengthened through a meeting of
-the Bay Area Apache Drill User Group in San Jose sponsored by Yahoo! This
-event expanded participation to include many people new to Drill,
-particularly those interested as potential users (analysts rather than
-developers).
-
-Speakers included Drill project mentor Ted Dunning from MapR, Data Scientist
-Will Ford from Alpine Data Labs, new Drill committer Julian Hyde from
-HortonWorks and Aman Sinha, MapR Drill engineer.
-
-Additional events include:
-
-• Two new Drill committers accepted appointment: Julian Hyde (HortonWorks)
-and Tim Chen (Microsoft).
-
-• Drill has a new project mentor, Sebastian Schelter.
-
-Mailing list discussions:
-
-Subscriptions to the Drill mailing lists have risen to 399 on the dev list
-and 308 on the user list, with 508 unique subscribers across both lists.
-
-There has been active and increasing participation in discussions on the
-developer mailing list, including new participants and developers.
-Participation on the user list is growing although still small; most
-activity takes place on the developer mailing list.
-
-Activity summary for the user mailing list:
-
-<http://mail-archives.apache.org/mod_mbox/incubator-drill-user/>
-
-February 2014 (through 02/26): 25
-
-January 2014: 12
-
-December 2013: 62
-
-Topics in discussion on the user mailing list included but were not limited to:
-
-  * Feb 2014: Connecting Drill to HBase, Support for Distinct/Count
-  * Jan 2014: Loading Data into Drill, Data Locality
-  * December 2013: Loading Data into Drill, Setting up Drill with HDFS and other storage engines
-
-Activity summary for the dev mailing list:
-
-<http://mail-archives.apache.org/mod_mbox/incubator-drill-dev/>
-
-February 2014 (through 02/26): 250 (jira; discussions; review requests)
-
-January 2014: 156 (jira; focused discussions)
-
-December 2013: 51 (jira; focused discussions)
-
-Topics in discussion on the dev mailing list included but were not limited
-to:
-
-• February 2014 (through 02/26): How to contribute to Drill; review requests
-for DRILL-357, DRILL-346, DRILL-366, and DRILL-364; status of Drill
-functions, including hash functions; support for the + and - operators in
-date and interval arithmetic
-
-• January: SQL options discussions, casting discussions, Multiplex Data
-Channel feedback
-
-• December: Guide for newcomers' contributions, aggregate functions code gen
-feedback
-
-Code
-
-For details of code commits, see <http://bit.ly/14YPXN9>
-
-There has been continued activity in code commits. 19 contributors have
-participated in GitHub code activity; there have been 116 forks.
-
-February code commits include but are not limited to: Support for
-Information_schema, Hive storage and metastore integration, Optiq JDBC
-thinning and refactoring, Math functions rework to use codegen, Column
-pruning for Parquet/JSON, Moving SQL parsing into the Drillbit server side,
-TravisCI setup
-
-January code commits include but are not limited to: Implicit and explicit
-casting support, Broadcast Sender exchange, TPC-H test queries added, and
-memory allocation refactored to use hierarchical memory allocation and
-freeing.
-
-Community Interactions
-
-The weekly Drill hangout continues, conducted remotely through Google
-Hangouts on Tuesday mornings at 9am Pacific Time, keeping core developers in
-contact in real time despite geographical separation.
-
-The community stays in touch through the @ApacheDrill Twitter ID and through
-postings on various blogs, including Apache Drill User
-<http://drill-user.org/>, which has had several updates, as well as through
-international presentations at conferences.
-
-The viability of the community is also apparent through active participation
-in the Bay Area Apache Drill User Group, which met in early November and has
-grown to 440 members.
-
-Sample presentations:
-
-• “How to Use Drill” by Ted Dunning and Will Ford, Bay Area Apache Drill
-Meet-up, 24 February
-
-• “How Drill Addresses Dynamic Typing” by Julian Hyde, Bay Area Apache Drill
-Meet-up, 24 February
-
-• “New Features and Infrastructure Improvements” by Aman Sinha, Bay Area
-Apache Drill Meet-up, 24 February
-
-Articles
-
-Examples of articles or reports on Apache Drill since the last report
-include:
-
-• Drill blog post by Ellen Friedman at Apache Drill User updating the
-community on how people will use Drill and inviting comments/questions from
-remote participants as part of the Drill User Group <http://bit.ly/1p1Qvgn>
-
-• Drill blog post by Ellen Friedman at Apache Drill User reporting on the
-appointment of new Drill committers and a new mentor <http://bit.ly/JIcwQe>
-
-Social Networking
-
-The @ApacheDrill Twitter entity is active and has grown substantially, by
-19%, to 744 followers.
-
-How project has developed since last report:
-
-1\. Significant progress is being made on the execution engine and SQL front
-end to support more functionality, as well as more integrations with storage
-engines.
-
-2\. Work on the ODBC driver has begun with a new group led by George Chow in
-Vancouver.
-
-3\. Significant code drops have been checked in from a number of contributors
-and committers.
-
-4\. Work toward the 2nd milestone is progressing substantially.
-
-Please check this [ ] when you have filled in the report for Drill.
-
-Signed-off-by:
-
-Ted Dunning: [](drill)
-
-Grant Ingersoll: [ ](drill)
-
-Isabel Drost: [ ](drill)
-
-Sebastian Schelter: [ ](drill)
-

http://git-wip-us.apache.org/repos/asf/drill/blob/d959a210/_docs/drill-docs/query/001-query-fs.md
----------------------------------------------------------------------
diff --git a/_docs/drill-docs/query/001-query-fs.md 
b/_docs/drill-docs/query/001-query-fs.md
deleted file mode 100644
index bd69c17..0000000
--- a/_docs/drill-docs/query/001-query-fs.md
+++ /dev/null
@@ -1,44 +0,0 @@
----
-title: "Querying a File System"
-parent: "Query"
----
-Files and directories are like standard SQL tables to Drill. You can specify a
-file system "database" as a prefix in queries when you refer to objects across
-databases. In Drill, a file system database consists of a storage plugin name
-followed by an optional workspace name, for example
-`<storage plugin>.<workspace>` or `hdfs.logs`.
-
-The following example shows a query on a file system database in a Hadoop
-distributed file system:
-
-``SELECT * FROM hdfs.logs.`AppServerLogs/2014/Jan/01/part0001.txt`;``
-
-The default `dfs` storage plugin instance registered with Drill has a
-`default` workspace. If you query data in the `default` workspace, you do not
-need to include the workspace in the query. Refer to
-[Workspaces](https://cwiki.apache.org/confluence/display/DRILL/Workspaces) for
-more information.
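-
-For example, a query on a file in the `default` workspace can reference the
-file path directly (the path shown here is illustrative only):
-
-``SELECT * FROM dfs.`/tmp/sample.json`;``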
-
-Drill supports the following file types:
-
-  * Plain text files, including:
-    * Comma-separated values (CSV, type: text)
-    * Tab-separated values (TSV, type: text)
-    * Pipe-separated values (PSV, type: text)
-  * Structured data files:
-    * JSON (type: json)
-    * Parquet (type: parquet)
-
-The extensions for these file types must match the configuration settings for
-your registered storage plugins. For example, PSV files may be defined with a
-`.tbl` extension, while CSV files are defined with a `.csv` extension.
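-
-When Drill reads a plain text file, it returns each line as an array named
-`columns`, which you index by position. A minimal sketch, assuming a
-hypothetical `/tmp/sample.csv` file under the `dfs` plugin:
-
-    SELECT columns[0], columns[1] FROM dfs.`/tmp/sample.csv`;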
-
-Click on any of the following links for more information about querying
-various file types:
-
-  * [Querying JSON Files](/confluence/display/DRILL/Querying+JSON+Files)
-  * [Querying Parquet Files](/confluence/display/DRILL/Querying+Parquet+Files)
-  * [Querying Plain Text Files](/confluence/display/DRILL/Querying+Plain+Text+Files)
-  * [Querying Directories](/confluence/display/DRILL/Querying+Directories)
\ No newline at end of file

http://git-wip-us.apache.org/repos/asf/drill/blob/d959a210/_docs/drill-docs/query/002-query-hbase.md
----------------------------------------------------------------------
diff --git a/_docs/drill-docs/query/002-query-hbase.md 
b/_docs/drill-docs/query/002-query-hbase.md
deleted file mode 100644
index 2fb0a10..0000000
--- a/_docs/drill-docs/query/002-query-hbase.md
+++ /dev/null
@@ -1,177 +0,0 @@
----
-title: "Querying HBase"
-parent: "Query"
----
-This is a simple exercise that provides steps for creating a “students” table
-and a “clicks” table in HBase that you can query with Drill.
-
-To create the HBase tables and query them with Drill, complete the following
-steps:
-
-  1. Issue the following command to start the HBase shell:
-  
-        hbase shell
-
-  2. Issue the following commands to create a ‘students’ table and a ‘clicks’ table with column families in HBase:  
-
-        echo "create 'students','account','address'" | hbase shell
-        echo "create 'clicks','clickinfo','iteminfo'" | hbase shell
-
-  3. Issue the following command with the provided data to create a `testdata.txt` file:  
-
-     `cat > testdata.txt`
-
-     **Sample Data**
-
-        put 'students','student1','account:name','Alice'
-        put 'students','student1','address:street','123 Ballmer Av'
-        put 'students','student1','address:zipcode','12345'
-        put 'students','student1','address:state','CA'
-        put 'students','student2','account:name','Bob'
-        put 'students','student2','address:street','1 Infinite Loop'
-        put 'students','student2','address:zipcode','12345'
-        put 'students','student2','address:state','CA'
-        put 'students','student3','account:name','Frank'
-        put 'students','student3','address:street','435 Walker Ct'
-        put 'students','student3','address:zipcode','12345'
-        put 'students','student3','address:state','CA'
-        put 'students','student4','account:name','Mary'
-        put 'students','student4','address:street','56 Southern Pkwy'
-        put 'students','student4','address:zipcode','12345'
-        put 'students','student4','address:state','CA'
-        put 'clicks','click1','clickinfo:studentid','student1'
-        put 'clicks','click1','clickinfo:url','http://www.google.com'
-        put 'clicks','click1','clickinfo:time','2014-01-01 12:01:01.0001'
-        put 'clicks','click1','iteminfo:itemtype','image'
-        put 'clicks','click1','iteminfo:quantity','1'
-        put 'clicks','click2','clickinfo:studentid','student1'
-        put 'clicks','click2','clickinfo:url','http://www.amazon.com'
-        put 'clicks','click2','clickinfo:time','2014-01-01 01:01:01.0001'
-        put 'clicks','click2','iteminfo:itemtype','image'
-        put 'clicks','click2','iteminfo:quantity','1'
-        put 'clicks','click3','clickinfo:studentid','student2'
-        put 'clicks','click3','clickinfo:url','http://www.google.com'
-        put 'clicks','click3','clickinfo:time','2014-01-01 01:02:01.0001'
-        put 'clicks','click3','iteminfo:itemtype','text'
-        put 'clicks','click3','iteminfo:quantity','2'
-        put 'clicks','click4','clickinfo:studentid','student2'
-        put 'clicks','click4','clickinfo:url','http://www.ask.com'
-        put 'clicks','click4','clickinfo:time','2013-02-01 12:01:01.0001'
-        put 'clicks','click4','iteminfo:itemtype','text'
-        put 'clicks','click4','iteminfo:quantity','5'
-        put 'clicks','click5','clickinfo:studentid','student2'
-        put 'clicks','click5','clickinfo:url','http://www.reuters.com'
-        put 'clicks','click5','clickinfo:time','2013-02-01 12:01:01.0001'
-        put 'clicks','click5','iteminfo:itemtype','text'
-        put 'clicks','click5','iteminfo:quantity','100'
-        put 'clicks','click6','clickinfo:studentid','student3'
-        put 'clicks','click6','clickinfo:url','http://www.google.com'
-        put 'clicks','click6','clickinfo:time','2013-02-01 12:01:01.0001'
-        put 'clicks','click6','iteminfo:itemtype','image'
-        put 'clicks','click6','iteminfo:quantity','1'
-        put 'clicks','click7','clickinfo:studentid','student3'
-        put 'clicks','click7','clickinfo:url','http://www.ask.com'
-        put 'clicks','click7','clickinfo:time','2013-02-01 12:45:01.0001'
-        put 'clicks','click7','iteminfo:itemtype','image'
-        put 'clicks','click7','iteminfo:quantity','10'
-        put 'clicks','click8','clickinfo:studentid','student4'
-        put 'clicks','click8','clickinfo:url','http://www.amazon.com'
-        put 'clicks','click8','clickinfo:time','2013-02-01 22:01:01.0001'
-        put 'clicks','click8','iteminfo:itemtype','image'
-        put 'clicks','click8','iteminfo:quantity','1'
-        put 'clicks','click9','clickinfo:studentid','student4'
-        put 'clicks','click9','clickinfo:url','http://www.amazon.com'
-        put 'clicks','click9','clickinfo:time','2013-02-01 22:01:01.0001'
-        put 'clicks','click9','iteminfo:itemtype','image'
-        put 'clicks','click9','iteminfo:quantity','10'
-
-  4. Issue the following command to verify that the data is in the `testdata.txt` file:  
-    
-     `cat testdata.txt | hbase shell`
-
-  5. Issue `exit` to leave the `hbase shell`.
-  6. Start Drill. Refer to [Starting/Stopping Drill](/confluence/pages/viewpage.action?pageId=44994063) for instructions.
-  7. Use Drill to issue the following SQL queries on the “students” and “clicks” tables:
-    a. Issue the following query to see the data in the “students” table:  
-
-        ``SELECT * FROM hbase.`students`;``
-
-        The query returns binary results:
-
-        Query finished, fetching results ...
-
-        +-------------+-------------+-------------+-------------+-------------+
-        | id          | name        | state       | street      | zipcode     |
-        +-------------+-------------+-------------+-------------+-------------+
-        | [B@1ee37126 | [B@661985a1 | [B@15944165 | [B@385158f4 | [B@3e08d131 |
-        | [B@64a7180e | [B@161c72c2 | [B@25b229e5 | [B@53dc8cb8 | [B@1d11c878 |
-        | [B@349aaf0b | [B@175a1628 | [B@1b64a812 | [B@6d5643ca | [B@147db06f |
-        | [B@3a7cbada | [B@52cf5c35 | [B@2baec60c | [B@5f4c543b | [B@2ec515d6 |
-        +-------------+-------------+-------------+-------------+-------------+
-
-       Since Drill does not require metadata, you must use the SQL `CAST`
-       function in some queries to get readable query results.
-
-    b. Issue the following query, which includes the `CAST` function, to see the data in the “`students`” table:
-
-       `SELECT CAST(students.row_key as VarChar(20)),
-CAST(students.account.name as VarChar(20)), CAST(students.address.state as
-VarChar(20)), CAST(students.address.street as VarChar(20)),
-CAST(students.address.zipcode as VarChar(20)) FROM hbase.students;`
-
-       **Note:** Use the following format when you query a column in an HBase table:   
-       `tablename.columnfamilyname.columnname`  
-       For more information about column families, refer to [5.6. Column
-Family](http://hbase.apache.org/book/columnfamily.html).
-
-       The query returns the data:
-
-        Query finished, fetching results ...
-
-        +-----------+-------+-------+------------------+---------+
-        | studentid | name  | state | street           | zipcode |
-        +-----------+-------+-------+------------------+---------+
-        | student1  | Alice | CA    | 123 Ballmer Av   | 12345   |
-        | student2  | Bob   | CA    | 1 Infinite Loop  | 12345   |
-        | student3  | Frank | CA    | 435 Walker Ct    | 12345   |
-        | student4  | Mary  | CA    | 56 Southern Pkwy | 12345   |
-        +-----------+-------+-------+------------------+---------+
-
-    c. Issue the following query on the “clicks” table to find out which students clicked on google.com:
-
-       ``SELECT CAST(clicks.clickinfo.studentid as VarChar(200)), CAST(clicks.clickinfo.url as VarChar(200)) FROM hbase.`clicks` WHERE clicks.clickinfo.url LIKE '%google%';``
-
-       The query returns the data:
-
-        Query finished, fetching results ...
-
-        +---------+-----------+-------------------------------+-----------------------+----------+----------+
-        | clickid | studentid | time                          | url                   | itemtype | quantity |
-        +---------+-----------+-------------------------------+-----------------------+----------+----------+
-        | click1  | student1  | 2014-01-01 12:01:01.000100000 | http://www.google.com | image    | 1        |
-        | click3  | student2  | 2014-01-01 01:02:01.000100000 | http://www.google.com | text     | 2        |
-        | click6  | student3  | 2013-02-01 12:01:01.000100000 | http://www.google.com | image    | 1        |
-        +---------+-----------+-------------------------------+-----------------------+----------+----------+
\ No newline at end of file

http://git-wip-us.apache.org/repos/asf/drill/blob/d959a210/_docs/drill-docs/query/003-query-hive.md
----------------------------------------------------------------------
diff --git a/_docs/drill-docs/query/003-query-hive.md 
b/_docs/drill-docs/query/003-query-hive.md
deleted file mode 100644
index 7af069d..0000000
--- a/_docs/drill-docs/query/003-query-hive.md
+++ /dev/null
@@ -1,67 +0,0 @@
----
-title: "Querying Hive"
-parent: "Query"
----
-This is a simple exercise that provides steps for creating a Hive table and
-inserting data that you can query using Drill. Before you perform the steps,
-download the
-[customers.csv](http://doc.mapr.com/download/attachments/22906623/customers.csv?api=v2)
-file.
-
-To create a Hive table and query it with Drill, complete the following steps:
-
-  1. Issue the following command to start the Hive shell:
-  
-        hive
-
-  2. Issue the following command from the Hive shell to import the `customers.csv` file and create a table:
-  
-        hive> create table customers(FirstName string,
-        LastName string,Company string,Address string,
-        City string,County string,State string,Zip string,
-        Phone string,Fax string,Email string,Web string)
-        row format delimited fields terminated by ',' stored as textfile;
-
-  3. Issue the following command to load the customer data into the customers table:  
-
-     `hive> load data local inpath '/<directory path>/customers.csv' overwrite into table customers;`
-
-  4. Issue `quit` or `exit` to leave the Hive shell.
-  5. Start Drill. Refer to [Starting/Stopping 
Drill](/confluence/pages/viewpage.action?pageId=44994063) for instructions.
-  6. Issue the following query to Drill to get the first and last names of the 
first ten customers in the Hive table:  
-
-     `0: jdbc:drill:schema=hiveremote> SELECT firstname,lastname FROM 
hiveremote.`customers` limit 10;`
-
-     The query returns the following results:
-     
-        +------------+------------+
-        | firstname  |  lastname  |
-        +------------+------------+
-        | Essie      | Vaill      |
-        | Cruz       | Roudabush  |
-        | Billie     | Tinnes     |
-        | Zackary    | Mockus     |
-        | Rosemarie  | Fifield    |
-        | Bernard    | Laboy      |
-        | Sue        | Haakinson  |
-        | Valerie    | Pou        |
-        | Lashawn    | Hasty      |
-        | Marianne   | Earman     |
-        +------------+------------+
-        10 rows selected (1.5 seconds)
-
-        0: jdbc:drill:schema=hiveremote>
-

http://git-wip-us.apache.org/repos/asf/drill/blob/d959a210/_docs/drill-docs/query/004-query-complex.md
----------------------------------------------------------------------
diff --git a/_docs/drill-docs/query/004-query-complex.md 
b/_docs/drill-docs/query/004-query-complex.md
deleted file mode 100644
index 1685d62..0000000
--- a/_docs/drill-docs/query/004-query-complex.md
+++ /dev/null
@@ -1,63 +0,0 @@
----
-title: "Querying Complex Data"
-parent: "Query"
----
-Apache Drill queries do not require prior knowledge of the actual data you are
-trying to access, regardless of its source system or its schema and data
-types. The sweet spot for Apache Drill is a SQL query workload against
-"complex data": data made up of various types of records and fields, rather
-than data in a recognizable relational form (discrete rows and columns). Drill
-is capable of discovering the form of the data when you submit the query.
-Nested data formats such as JSON (JavaScript Object Notation) files and
-Parquet files are not only _accessible_: Drill provides special operators and
-functions that you can use to _drill down _into these files and ask
-interesting analytic questions.
-
-These operators and functions include:
-
-  * References to nested data values
-  * Access to repeating values in arrays and arrays within arrays (array 
indexes)
-
-The SQL query developer needs to know the data well enough to write queries
-that identify values of interest in the target file. For example, the writer
-needs to know what a record consists of, and its data types, in order to
-reliably request the right "columns" in the select list. Although these data
-values do not manifest themselves as columns in the source file, Drill will
-return them in the result set as if they had the predictable form of columns
-in a table. Drill also optimizes queries by treating the data as "columnar"
-rather than reading and analyzing complete records. (Drill uses parallel
-execution and optimization capabilities similar to those of commercial
-columnar MPP databases.)
-
-Given a basic knowledge of the input file, the developer needs to know how to
-use the SQL extensions that Drill provides and how to use them to "reach into"
-the nested data. The following examples show how to write both simple queries
-against JSON files and interesting queries that unpack the nested data. The
-examples show how to use the Drill extensions in the context of standard SQL
-SELECT statements. For the most part, the extensions use standard JavaScript
-notation for referencing data elements in a hierarchy.
-
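-A minimal sketch of this notation, assuming a hypothetical `donuts.json` file
-under the `dfs` plugin in which each record contains a repeated `topping`
-element (the path and field names are illustrative only):
-
-    SELECT t.topping[0].type FROM dfs.`/tmp/donuts.json` t;
-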
-### Before You Begin
-
-The examples in this section operate on JSON data files. In order to write
-your own queries, you need to be aware of the basic data types in these files:
-
-  * string (all data inside double quotes), such as `"0001"` or `"Cake"`
-  * numeric types: integers, decimals, and floats, such as `0.55` or `10`
-  * null values
-  * boolean values: true, false
-
-Check that you have the following configuration setting for JSON files in the
-Drill Web UI (`dfs` storage plugin configuration):
-
-    "json" : {
-      "type" : "json"
-    }
-
-Click on any of the following links to see examples of complex queries:
-
-  * [Sample Data: Donuts](/confluence/display/DRILL/Sample+Data%3A+Donuts)
-  * [Query 1: Selecting Flat 
Data](/confluence/display/DRILL/Query+1%3A+Selecting+Flat+Data)
-  * [Query 2: Using Standard SQL Functions, Clauses, and 
Joins](/confluence/display/DRILL/Query+2%3A+Using+Standard+SQL+Functions%2C+Clauses%2C+and+Joins)
-  * [Query 3: Selecting Nested Data for a 
Column](/confluence/display/DRILL/Query+3%3A+Selecting+Nested+Data+for+a+Column)
-  * [Query 4: Selecting Multiple Columns Within Nested 
Data](/confluence/display/DRILL/Query+4%3A+Selecting+Multiple+Columns+Within+Nested+Data)
\ No newline at end of file

http://git-wip-us.apache.org/repos/asf/drill/blob/d959a210/_docs/drill-docs/query/005-query-info-skema.md
----------------------------------------------------------------------
diff --git a/_docs/drill-docs/query/005-query-info-skema.md 
b/_docs/drill-docs/query/005-query-info-skema.md
deleted file mode 100644
index 5de4d4e..0000000
--- a/_docs/drill-docs/query/005-query-info-skema.md
+++ /dev/null
@@ -1,109 +0,0 @@
----
-title: "Querying the INFORMATION SCHEMA"
-parent: "Query"
----
-When you are using Drill to connect to multiple data sources, you need a
-simple mechanism to discover what each data source contains. The information
-schema is an ANSI standard set of metadata tables that you can query to return
-information about all of your Drill data sources (or schemas). Data sources
-may be databases or file systems; they are all known as "schemas" in this
-context. You can query the following INFORMATION_SCHEMA tables:
-
-  * SCHEMATA
-  * CATALOGS
-  * TABLES
-  * COLUMNS 
-  * VIEWS
-
-## SCHEMATA
-
-The SCHEMATA table contains the CATALOG_NAME and SCHEMA_NAME columns. To allow
-maximum flexibility inside BI tools, the only catalog that Drill supports is
-`DRILL`.
-
-    0: jdbc:drill:zk=local> select CATALOG_NAME, SCHEMA_NAME as all_my_data_sources from INFORMATION_SCHEMA.SCHEMATA order by SCHEMA_NAME;
-    +--------------+---------------------+
-    | CATALOG_NAME | all_my_data_sources |
-    +--------------+---------------------+
-    | DRILL        | INFORMATION_SCHEMA  |
-    | DRILL        | cp.default          |
-    | DRILL        | dfs.default         |
-    | DRILL        | dfs.root            |
-    | DRILL        | dfs.tmp             |
-    | DRILL        | HiveTest.SalesDB    |
-    | DRILL        | maprfs.logs         |
-    | DRILL        | sys                 |
-    +--------------+---------------------+
-
-The INFORMATION_SCHEMA name and associated keywords are case-sensitive. You
-can also return a list of schemas by running the SHOW DATABASES command:
-
-    0: jdbc:drill:zk=local> show databases;
-    +-------------+
-    | SCHEMA_NAME |
-    +-------------+
-    | dfs.default |
-    | dfs.root    |
-    | dfs.tmp     |
-    ...
-
-## CATALOGS
-
-The CATALOGS table returns only one row, with the hardcoded DRILL catalog name
-and description.
-
-## TABLES
-
-The TABLES table returns the table name and type for each table or view in
-your databases. (Type means TABLE or VIEW.) Note that Drill does not return
-files available for querying in file-based data sources. Instead, use SHOW
-FILES to explore these data sources.
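-
-For example, a minimal sketch of exploring one of the file-based schemas
-listed earlier:
-
-    SHOW FILES IN dfs.tmp;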
-
-## COLUMNS
-
-The COLUMNS table returns the column name and other metadata (such as the data
-type) for each column in each table or view.
-
-## VIEWS
-
-The VIEWS table returns the name and definition for each view in your
-databases. Note that file schemas are the canonical repository for views in
-Drill. Depending on how you create a view, the view may only be displayed in
-Drill after it has been used.
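-
-For example, a minimal sketch that lists each view and its definition
-(VIEW_DEFINITION is the standard ANSI information schema column for this):
-
-    SELECT TABLE_NAME, VIEW_DEFINITION
-    FROM INFORMATION_SCHEMA.VIEWS;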
-
-## Useful Queries
-
-Run an ``INFORMATION_SCHEMA.`TABLES` `` query to view all of the tables and
-views within a database. TABLES is a reserved word in Drill and requires back
-ticks (`).
-
-For example, the following query identifies all of the tables and views that
-Drill can access:
-
-    SELECT TABLE_SCHEMA, TABLE_NAME, TABLE_TYPE
-    FROM INFORMATION_SCHEMA.`TABLES`
-    ORDER BY TABLE_NAME DESC;
-    ----------------------------------------------------------------
-    TABLE_SCHEMA             TABLE_NAME            TABLE_TYPE
-    ----------------------------------------------------------------
-    HiveTest.CustomersDB     Customers             TABLE
-    HiveTest.SalesDB         Orders                TABLE
-    HiveTest.SalesDB         OrderLines            TABLE
-    HiveTest.SalesDB         USOrders              VIEW
-    dfs.default              CustomerSocialProfile VIEW
-    ----------------------------------------------------------------
-
-**Note:** Currently, Drill only supports querying Drill views; Hive views are 
not yet supported.
-
-You can run a similar query to identify columns in tables and the data types
-of those columns:
-
-    SELECT COLUMN_NAME, DATA_TYPE 
-    FROM INFORMATION_SCHEMA.COLUMNS 
-    WHERE TABLE_NAME = 'Orders' AND TABLE_SCHEMA = 'HiveTest.SalesDB' AND COLUMN_NAME LIKE '%Total';
-    +-------------+------------+
-    | COLUMN_NAME | DATA_TYPE  |
-    +-------------+------------+
-    | OrderTotal  | Decimal    |
-    +-------------+------------+
-

http://git-wip-us.apache.org/repos/asf/drill/blob/d959a210/_docs/drill-docs/query/006-query-sys-tbl.md
----------------------------------------------------------------------
diff --git a/_docs/drill-docs/query/006-query-sys-tbl.md 
b/_docs/drill-docs/query/006-query-sys-tbl.md
deleted file mode 100644
index 1cfd3bd..0000000
--- a/_docs/drill-docs/query/006-query-sys-tbl.md
+++ /dev/null
@@ -1,176 +0,0 @@
----
-title: "Querying System Tables"
-parent: "Query"
----
-Drill has a sys database that contains system tables. You can query the system
-tables for information about Drill, including Drill ports, the Drill version
-running on the system, and available Drill options. View the databases in
-Drill to identify the sys database, and then use the sys database to view
-system tables that you can query.
-
-## View Drill Databases
-
-Issue the `SHOW DATABASES` command to view Drill databases.
-
-    0: jdbc:drill:zk=10.10.100.113:5181> show databases;
-    +--------------------+
-    |    SCHEMA_NAME     |
-    +--------------------+
-    | M7                 |
-    | hive.default       |
-    | dfs.default        |
-    | dfs.root           |
-    | dfs.views          |
-    | dfs.tmp            |
-    | dfs.tpcds          |
-    | sys                |
-    | cp.default         |
-    | hbase              |
-    | INFORMATION_SCHEMA |
-    +--------------------+
-    11 rows selected (0.162 seconds)
-
-Drill returns `sys` in the database results.
-
-## Use the Sys Database
-
-Issue the `USE` command to select the sys database for subsequent SQL
-requests.
-
-    0: jdbc:drill:zk=10.10.100.113:5181> use sys;
-    +-------+----------------------------------+
-    |  ok   |             summary              |
-    +-------+----------------------------------+
-    | true  | Default schema changed to 'sys'  |
-    +-------+----------------------------------+
-    1 row selected (0.101 seconds)
-
-## View Tables
-
-Issue the `SHOW TABLES` command to view the tables in the sys database.
-
-    0: jdbc:drill:zk=10.10.100.113:5181> show tables;
-    +--------------+------------+
-    | TABLE_SCHEMA | TABLE_NAME |
-    +--------------+------------+
-    | sys          | drillbits  |
-    | sys          | version    |
-    | sys          | options    |
-    +--------------+------------+
-    3 rows selected (0.934 seconds)
-    0: jdbc:drill:zk=10.10.100.113:5181>
-
-## Query System Tables
-
-Query the drillbits, version, and options tables in the sys database.
-
-###### Query the drillbits table.
-
-    0: jdbc:drill:zk=10.10.100.113:5181> select * from drillbits;
-    +-------------------+-----------+--------------+-----------+---------+
-    |       host        | user_port | control_port | data_port | current |
-    +-------------------+-----------+--------------+-----------+---------+
-    | qa-node115.qa.lab | 31010     | 31011        | 31012     | true    |
-    | qa-node114.qa.lab | 31010     | 31011        | 31012     | false   |
-    | qa-node116.qa.lab | 31010     | 31011        | 31012     | false   |
-    +-------------------+-----------+--------------+-----------+---------+
-    3 rows selected (0.146 seconds)
-
-  * host   
-The name of the node running the Drillbit service.
-
-  * user_port  
-The user port address, used between nodes in a cluster for connecting to
-external clients and for the Drill Web UI.  
-
-  * control_port  
-The control port address, used between nodes for multi-node installation of
-Apache Drill.
-
-  * data_port  
-The data port address, used between nodes for multi-node installation of
-Apache Drill.
-
-  * current  
-True means the Drillbit is connected to the session or client running the
-query. This Drillbit is the Foreman for the current session.  
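-
-For example, the following query (a sketch based on the columns described above; back ticks guard current because it can collide with a reserved word) returns the Foreman for the session:
-
-    SELECT host FROM drillbits WHERE `current` = true;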
-
-###### Query the version table.
-
-    0: jdbc:drill:zk=10.10.100.113:5181> select * from version;
-    +------------+----------------+-------------+-------------+------------+
-    | commit_id  | commit_message | commit_time | build_email | build_time |
-    +------------+----------------+-------------+-------------+------------+
-    | 108d29fce3d8465d619d45db5f6f433ca3d97619 | DRILL-1635: Additional fix for validation exceptions. | 14.11.2014 @ 02:32:47 UTC | Unknown | 14.11.2014 @ 03:56:07 UTC |
-    +------------+----------------+-------------+-------------+------------+
-    1 row selected (0.144 seconds)
-
-  * commit_id  
-The GitHub commit ID of the release you are running. For example, <https://github.com/apache/drill/commit/e3ab2c1760ad34bda80141e2c3108f7eda7c9104>
-
-  * commit_message  
-The message explaining the change.
-
-  * commit_time  
-The date and time of the change.
-
-  * build_email  
-The email address of the person who made the change, which is unknown in this
-example.
-
-  * build_time  
-The time that the release was built.
-
-###### Query the options table.
-
-Drill provides system, session, and boot options that you can query.
-
-The following example shows a query on the system options:
-
-    0: jdbc:drill:zk=10.10.100.113:5181> select * from options where type='SYSTEM' limit 10;
-    +-------------------------------------------+---------+--------+------------+------------+----------+-----------+
-    | name                                      | kind    | type   | num_val    | string_val | bool_val | float_val |
-    +-------------------------------------------+---------+--------+------------+------------+----------+-----------+
-    | exec.max_hash_table_size                  | LONG    | SYSTEM | 1073741824 | null       | null     | null      |
-    | planner.memory.max_query_memory_per_node  | LONG    | SYSTEM | 2048       | null       | null     | null      |
-    | planner.join.row_count_estimate_factor    | DOUBLE  | SYSTEM | null       | null       | null     | 1.0       |
-    | planner.affinity_factor                   | DOUBLE  | SYSTEM | null       | null       | null     | 1.2       |
-    | exec.errors.verbose                       | BOOLEAN | SYSTEM | null       | null       | false    | null      |
-    | planner.disable_exchanges                 | BOOLEAN | SYSTEM | null       | null       | false    | null      |
-    | exec.java_compiler_debug                  | BOOLEAN | SYSTEM | null       | null       | true     | null      |
-    | exec.min_hash_table_size                  | LONG    | SYSTEM | 65536      | null       | null     | null      |
-    | exec.java_compiler_janino_maxsize         | LONG    | SYSTEM | 262144     | null       | null     | null      |
-    | planner.enable_mergejoin                  | BOOLEAN | SYSTEM | null       | null       | true     | null      |
-    +-------------------------------------------+---------+--------+------------+------------+----------+-----------+
-    10 rows selected (0.334 seconds)
-
-  * name  
-The name of the option.
-
-  * kind  
-The data type of the option value.
-
-  * type  
-The type of options in the output: system, session, or boot.
-
-  * num_val  
-The default value, which is of the long or int data type; otherwise, null.
-
-  * string_val  
-The default value, which is a string; otherwise, null.
-
-  * bool_val  
-The default value, which is true or false; otherwise, null.
-
-  * float_val  
-The default value, which is of the double, float, or long double data type;
-otherwise, null.
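-
-To check a single option, filter on the name column. This sketch assumes the option names shown in the sample output above:
-
-    SELECT name, kind, bool_val FROM options WHERE name = 'planner.enable_mergejoin' AND type = 'SYSTEM';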
-
-For information about how to configure Drill system and session options, see
-[Planning and Execution Options](https://cwiki.apache.org/confluence/display/DRILL/Planning+and+Execution+Options).
-
-For information about how to configure Drill start-up options, see
-[Start-Up Options](https://cwiki.apache.org/confluence/display/DRILL/Start-Up+Options).
-

http://git-wip-us.apache.org/repos/asf/drill/blob/d959a210/_docs/drill-docs/query/007-interfaces.md
----------------------------------------------------------------------
diff --git a/_docs/drill-docs/query/007-interfaces.md 
b/_docs/drill-docs/query/007-interfaces.md
deleted file mode 100644
index 5dc69c4..0000000
--- a/_docs/drill-docs/query/007-interfaces.md
+++ /dev/null
@@ -1,16 +0,0 @@
----
-title: "Drill Interfaces"
-parent: "Query"
----
-You can connect to Apache Drill through the following interfaces:
-
-  * Drill shell (SQLLine)
-  * Drill Web UI
-  * ODBC*
-  * [JDBC](/confluence/display/DRILL/Using+JDBC+to+Access+Apache+Drill+from+SQuirreL)
-  * C++ API
-
-*Apache Drill does not have an open source ODBC driver. However, MapR provides 
an ODBC driver that you can use to connect to Apache Drill from BI tools. For 
more information, refer to the following documents:
-
-  * [Using JDBC to Access Apache Drill from SQuirreL](/confluence/display/DRILL/Using+JDBC+to+Access+Apache+Drill+from+SQuirreL)
-  * [Using ODBC to Access Apache Drill from BI Tools](/confluence/display/DRILL/Using+ODBC+to+Access+Apache+Drill+from+BI+Tools)
\ No newline at end of file

http://git-wip-us.apache.org/repos/asf/drill/blob/d959a210/_docs/drill-docs/query/interfaces/001-jdbc.md
----------------------------------------------------------------------
diff --git a/_docs/drill-docs/query/interfaces/001-jdbc.md 
b/_docs/drill-docs/query/interfaces/001-jdbc.md
deleted file mode 100644
index d2c5dd3..0000000
--- a/_docs/drill-docs/query/interfaces/001-jdbc.md
+++ /dev/null
@@ -1,138 +0,0 @@
----
-title: "Using JDBC to Access Apache Drill from SQuirreL"
-parent: "Drill Interfaces"
----
-You can connect to Drill through a JDBC client tool, such as SQuirreL, on
-Windows, Linux, and Mac OS X systems, to access all of your data sources
-registered with Drill. An embedded JDBC driver is included with Drill.
-Configure the JDBC driver in the SQuirreL client to connect to Drill from
-SQuirreL. This document provides instructions for connecting to Drill from
-SQuirreL on Windows.
-
-To use the Drill JDBC driver with SQuirreL on Windows, complete the following
-steps:
-
-  * Step 1: Getting the Drill JDBC Driver 
-  * Step 2: Installing and Starting SQuirreL
-  * Step 3: Adding the Drill JDBC Driver to SQuirreL
-  * Step 4: Running a Drill Query from SQuirreL
-
-For information about how to use SQuirreL, refer to the [SQuirreL Quick
-Start](http://squirrel-sql.sourceforge.net/user-manual/quick_start.html)
-guide.
-
-### Prerequisites
-
-  * SQuirreL requires JRE 7
-  * Drill installed in distributed mode on one or multiple nodes in a cluster. Refer to the [Install Drill](https://cwiki.apache.org/confluence/display/DRILL/Install+Drill) documentation for more information.
-  * The client must be able to resolve the actual hostname of the Drill node(s) to the correct IP address(es). Verify that a DNS entry was created on the client machine for the Drill node(s).  
-If a DNS entry does not exist, create the entry for the Drill node(s).
-
-    * For Windows, create the entry in the %WINDIR%\system32\drivers\etc\hosts file.
-
-    * For Linux and Mac, create the entry in /etc/hosts.  
-<drill-machine-IP> <drill-machine-hostname>  
-Example: `127.0.1.1 maprdemo`
-
-## Step 1: Getting the Drill JDBC Driver
-
-The Drill JDBC Driver `JAR` file must exist in a directory on your Windows
-machine in order to configure the driver in the SQuirreL client.
-
-You can copy the Drill JDBC `JAR` file from the following Drill installation
-directory on the node with Drill installed, to a directory on your Windows
-machine:
-
-    <drill_installation_directory>/jars/jdbc-driver/drill-jdbc-all-0.7.0-SNAPSHOT.jar
-
-Or, you can download the [apache-drill-0.7.0.tar.gz](http://www.apache.org/dyn/closer.cgi/drill/drill-0.7.0/apache-drill-0.7.0.tar.gz)
-file to a location on your Windows machine, and extract the contents of the
-file. You may need to use a decompression utility, such as
-[7-zip](http://www.7-zip.org/), to extract the archive. Once extracted, you can
-locate the driver in the following directory:
-
-    <windows_directory>\apache-drill-<version>\jars\jdbc-driver\drill-jdbc-all-0.7.0-SNAPSHOT.jar
-
-## Step 2: Installing and Starting SQuirreL
-
-To install and start SQuirreL, complete the following steps:
-
-  1. Download the SQuirreL JAR file for Windows from the following location:  
-<http://www.squirrelsql.org/#installation>
-
-  2. Double-click the SQuirreL `JAR` file. The SQuirreL installation wizard 
walks you through the installation process.
-  3. When installation completes, navigate to the SQuirreL installation folder 
and then double-click `squirrel-sql.bat` to start SQuirreL.
-
-## Step 3: Adding the Drill JDBC Driver to SQuirreL
-
-To add the Drill JDBC Driver to SQuirreL, define the driver and create a
-database alias. The alias is a specific instance of the driver configuration.
-SQuirreL uses the driver definition and alias to connect to Drill so you can
-access data sources that you have registered with Drill.
-
-### A. Define the Driver
-
-To define the Drill JDBC Driver, complete the following steps:
-
-  1. In the SQuirreL toolbar, select **Drivers > New Driver**. The Add Driver 
dialog box appears.
-  
-  ![](../../../img/40.png)
-     
-  2. Enter the following information:
-
-     | Option | Description |
-     |--------|-------------|
-     | Name | Name for the Drill JDBC Driver. |
-     | Example URL | `jdbc:drill:zk=<zookeeper_quorum>[;schema=<schema_to_use_as_default>]` **Example:** `jdbc:drill:zk=maprdemo:5181` **Note:** The default ZooKeeper port is 2181. In a MapR cluster, the ZooKeeper port is 5181. |
-     | Website URL | `jdbc:drill:zk=<zookeeper_quorum>[;schema=<schema_to_use_as_default>]` **Example:** `jdbc:drill:zk=maprdemo:5181` **Note:** The default ZooKeeper port is 2181. In a MapR cluster, the ZooKeeper port is 5181. |
-     | Extra Class Path | Click **Add** and navigate to the JDBC `JAR` file location in the Windows directory: `<windows_directory>\apache-drill-<version>\jars\jdbc-driver\drill-jdbc-all-0.7.0-SNAPSHOT.jar`. Select the `JAR` file, click **Open**, and then click **List Drivers**. |
-     | Class Name | Select `org.apache.drill.jdbc.Driver` from the drop-down menu. |
-  
-  3. Click **OK**. The SQuirreL client displays a message stating that the 
driver registration is successful, and you can see the driver in the Drivers 
panel.  
-
-     ![](../../../img/52.png)
-
-### B. Create an Alias
-
-To create an alias, complete the following steps:
-
-  1. Select the **Aliases** tab.
-  2. In the SQuirreL toolbar, select **Aliases > New Alias**. The Add Alias dialog box appears.
-    
-     ![](../../../img/19.png)
-
-  3. Enter the following information:
-  
-     | Option | Description |
-     |--------|-------------|
-     | Alias Name | A unique name for the Drill JDBC Driver alias. |
-     | Driver | Select the Drill JDBC Driver. |
-     | URL | Enter the connection URL with the name of the Drill directory stored in ZooKeeper and the cluster ID: `jdbc:drill:zk=<zookeeper_quorum>/<drill_directory_in_zookeeper>/<cluster_ID>;schema=<schema_to_use_as_default>` |
-     | User Name | admin |
-     | Password | admin |
-
-     The following examples show URLs for Drill installed on a single node:
-
-         jdbc:drill:zk=10.10.100.56:5181/drill/demo_mapr_com-drillbits;schema=hive
-         jdbc:drill:zk=10.10.100.24:2181/drill/drillbits1;schema=hive
-
-     The following example shows a URL for Drill installed in distributed mode with a connection to a ZooKeeper quorum:
-
-         jdbc:drill:zk=10.10.100.30:5181,10.10.100.31:5181,10.10.100.32:5181/drill/drillbits1;schema=hive
-
-     Note the following points about the URL:
-
-     * Including a default schema is optional.
-     * The ZooKeeper port is 2181. In a MapR cluster, the ZooKeeper port is 5181.
-     * The Drill directory stored in ZooKeeper is `/drill`.
-     * The Drill default cluster ID is `drillbits1`.
-
-  
-  4. Click **OK**. The Connect to: dialog box appears.  
-
-     ![](../../../img/30.png)
-
-  5. Click **Connect**. SQuirreL displays a message stating that the connection is successful.  
-![](../../../img/53.png)
-
-  6. Click **OK**. SQuirreL displays a series of tabs.
-
-## Step 4: Running a Drill Query from SQuirreL
-
-Once you have SQuirreL successfully connected to your cluster through the
-Drill JDBC Driver, you can issue queries from the SQuirreL client. You can run
-a test query on some sample data included in the Drill installation to try out
-SQuirreL with Drill.
-
-To query sample data with SQuirreL, complete the following steps:
-
-  1. Click the ![](http://doc.mapr.com/download/attachments/26986731/image2014-9-10%2014%3A43%3A14.png) tab.
-  2. Enter the following query in the query box:  
-``SELECT * FROM cp.`employee.json`;``  
-Example:  
- ![](../../../img/11.png)
-
-  3. Press **Ctrl+Enter** to run the query. The following query results display:  
- ![](../../../img/42.png)
-
-You have successfully run a Drill query from the SQuirreL client.
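-
-From the same tab you can issue any Drill query. For example, the following sketch (the column names are assumed from the sample employee.json file) trims the result set:
-
-    SELECT full_name, employee_id FROM cp.`employee.json` LIMIT 5;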
-

http://git-wip-us.apache.org/repos/asf/drill/blob/d959a210/_docs/drill-docs/query/interfaces/002-odbc.md
----------------------------------------------------------------------
diff --git a/_docs/drill-docs/query/interfaces/002-odbc.md 
b/_docs/drill-docs/query/interfaces/002-odbc.md
deleted file mode 100644
index 1bb82bb..0000000
--- a/_docs/drill-docs/query/interfaces/002-odbc.md
+++ /dev/null
@@ -1,23 +0,0 @@
----
-title: "Using ODBC to Access Apache Drill from BI Tools"
-parent: "Drill Interfaces"
----
-MapR provides ODBC drivers for Windows, Mac OS X, and Linux. It is recommended
-that you install the latest version of Apache Drill with the latest version of
-the Drill ODBC driver.
-
-For example, if you have Apache Drill 0.5 and a Drill ODBC driver installed on
-your machine, and then you upgrade to Apache Drill 0.6, do not assume that the
-Drill ODBC driver installed on your machine will work with the new version of
-Apache Drill. Install the latest available Drill ODBC driver to ensure that
-the two components work together.
-
-You can access the latest Drill ODBC drivers in the following location:
-
-<http://package.mapr.com/tools/MapR-ODBC/MapR_Drill/MapRDrill_odbc/>
-
-Refer to the following documents for driver installation and configuration
-information, as well as examples for connecting to BI tools:
-
-  * [Using the MapR ODBC Driver on 
Windows](/confluence/display/DRILL/Using+the+MapR+ODBC+Driver+on+Windows)
-  * [Using the MapR Drill ODBC Driver on Linux and Mac OS 
X](/confluence/display/DRILL/Using+the+MapR+Drill+ODBC+Driver+on+Linux+and+Mac+OS+X)
\ No newline at end of file

http://git-wip-us.apache.org/repos/asf/drill/blob/d959a210/_docs/drill-docs/query/query-complex/001-sample-donuts.md
----------------------------------------------------------------------
diff --git a/_docs/drill-docs/query/query-complex/001-sample-donuts.md 
b/_docs/drill-docs/query/query-complex/001-sample-donuts.md
deleted file mode 100644
index 37010ec..0000000
--- a/_docs/drill-docs/query/query-complex/001-sample-donuts.md
+++ /dev/null
@@ -1,40 +0,0 @@
----
-title: "Sample Data: Donuts"
-parent: "Query Complex Data"
----
-The complex data queries use sample `donuts.json` and `moredonuts.json` files.
-Here is the single complete "record" (`0001`) from the `donuts.json` file. In
-terms of Drill query processing, this record is equivalent to a single record
-in a table.
-
-    {
-      "id": "0001",
-      "type": "donut",
-      "name": "Cake",
-      "ppu": 0.55,
-      "batters":
-        {
-          "batter":
-            [
-               { "id": "1001", "type": "Regular" },
-               { "id": "1002", "type": "Chocolate" },
-               { "id": "1003", "type": "Blueberry" },
-               { "id": "1004", "type": "Devil's Food" }
-             ]
-        },
-      "topping":
-        [
-           { "id": "5001", "type": "None" },
-           { "id": "5002", "type": "Glazed" },
-           { "id": "5005", "type": "Sugar" },
-           { "id": "5007", "type": "Powdered Sugar" },
-           { "id": "5006", "type": "Chocolate with Sprinkles" },
-           { "id": "5003", "type": "Chocolate" },
-           { "id": "5004", "type": "Maple" }
-         ]
-    }
-
-The data is made up of maps, arrays, and nested arrays. Name-value pairs and
-embedded name-value pairs define the contents of each record. For example,
-`type: donut` is a map. Under `topping`, the pairs of `id` and `type` values
-belong to an array (inside the square brackets).
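-
-For instance, the embedded pairs under `batters` can be reached with dotted and indexed notation, as the later query sections show. A sketch against this record:
-
-    select tbl.batters.batter[1].type from dfs.`/Users/brumsby/drill/donuts.json` as tbl;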
\ No newline at end of file

http://git-wip-us.apache.org/repos/asf/drill/blob/d959a210/_docs/drill-docs/query/query-complex/002-query1-select.md
----------------------------------------------------------------------
diff --git a/_docs/drill-docs/query/query-complex/002-query1-select.md 
b/_docs/drill-docs/query/query-complex/002-query1-select.md
deleted file mode 100644
index a5fe5ea..0000000
--- a/_docs/drill-docs/query/query-complex/002-query1-select.md
+++ /dev/null
@@ -1,19 +0,0 @@
----
-title: "Query 1: Selecting Flat Data"
-parent: "Query Complex Data"
----
-A very simple query against the `donuts.json` file returns the values for the
-four "flat" columns (the columns that contain data at the top level only: no
-nested data):
-
-    0: jdbc:drill:zk=local> select id, type, name, ppu
-    from dfs.`/Users/brumsby/drill/donuts.json`;
-    +------------+------------+------------+------------+
-    |     id     |    type    |    name    |    ppu     |
-    +------------+------------+------------+------------+
-    | 0001       | donut      | Cake       | 0.55       |
-    +------------+------------+------------+------------+
-    1 row selected (0.248 seconds)
-
-Note that `dfs` is the schema name, the path to the file is enclosed by
-backticks, and the query must end with a semicolon.
\ No newline at end of file

http://git-wip-us.apache.org/repos/asf/drill/blob/d959a210/_docs/drill-docs/query/query-complex/003-query2-use-sql.md
----------------------------------------------------------------------
diff --git a/_docs/drill-docs/query/query-complex/003-query2-use-sql.md 
b/_docs/drill-docs/query/query-complex/003-query2-use-sql.md
deleted file mode 100644
index cf614ad..0000000
--- a/_docs/drill-docs/query/query-complex/003-query2-use-sql.md
+++ /dev/null
@@ -1,74 +0,0 @@
----
-title: "Query 2: Using Standard SQL Functions, Clauses, and Joins"
-parent: "Query Complex Data"
----
-You can use standard SQL clauses, such as WHERE and ORDER BY, to elaborate on
-this kind of simple query:
-
-    0: jdbc:drill:zk=local> select id, type from dfs.`/Users/brumsby/drill/donuts.json`
-    where id>0
-    order by id limit 1;
-    +------------+------------+
-    |     id     |    type    |
-    +------------+------------+
-    | 0001       | donut      |
-    +------------+------------+
-    1 row selected (0.318 seconds)
-
-You can also join files (or tables, or files and tables) by using standard
-syntax:
-
-    0: jdbc:drill:zk=local> select tbl1.id, tbl1.type from dfs.`/Users/brumsby/drill/donuts.json` as tbl1
-    join
-    dfs.`/Users/brumsby/drill/moredonuts.json` as tbl2
-    on tbl1.id=tbl2.id;
-    +------------+------------+
-    |     id     |    type    |
-    +------------+------------+
-    | 0001       | donut      |
-    +------------+------------+
-    1 row selected (0.395 seconds)
-
-Equivalent USING syntax and joins in the WHERE clause are also supported.
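-
-For example, the previous join can be written with USING (a sketch over the same two files, relying on the USING support noted above):
-
-    0: jdbc:drill:zk=local> select tbl1.id, tbl1.type from dfs.`/Users/brumsby/drill/donuts.json` as tbl1
-    join dfs.`/Users/brumsby/drill/moredonuts.json` as tbl2
-    using (id);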
-
-Standard aggregate functions work against JSON data. For example:
-
-    0: jdbc:drill:zk=local> select type, avg(ppu) as ppu_sum from dfs.`/Users/brumsby/drill/donuts.json` group by type;
-    +------------+------------+
-    |    type    |  ppu_sum   |
-    +------------+------------+
-    | donut      | 0.55       |
-    +------------+------------+
-    1 row selected (0.216 seconds)
-
-    0: jdbc:drill:zk=local> select type, sum(sales) as sum_by_type from dfs.`/Users/brumsby/drill/moredonuts.json` group by type;
-    +------------+-------------+
-    |    type    | sum_by_type |
-    +------------+-------------+
-    | donut      | 1194        |
-    +------------+-------------+
-    1 row selected (0.389 seconds)
\ No newline at end of file

http://git-wip-us.apache.org/repos/asf/drill/blob/d959a210/_docs/drill-docs/query/query-complex/004-query3-sel-nest.md
----------------------------------------------------------------------
diff --git a/_docs/drill-docs/query/query-complex/004-query3-sel-nest.md 
b/_docs/drill-docs/query/query-complex/004-query3-sel-nest.md
deleted file mode 100644
index 2d279d1..0000000
--- a/_docs/drill-docs/query/query-complex/004-query3-sel-nest.md
+++ /dev/null
@@ -1,50 +0,0 @@
----
-title: "Query 3: Selecting Nested Data for a Column"
-parent: "Query Complex Data"
----
-The following queries show how to access the nested data inside the parts of
-the record that are not flat (such as `topping`). To isolate and return nested
-data, use the `[n]` notation, where `n` is a number that points to a specific
-position in an array. Arrays use a 0-based index, so `topping[3]` points to
-the _fourth_ element in the array under `topping`, not the third.
-
-    0: jdbc:drill:zk=local> select topping[3] as top from dfs.`/Users/brumsby/drill/donuts.json`;
-    +------------+
-    |    top     |
-    +------------+
-    | {"id":"5007","type":"Powdered Sugar"} |
-    +------------+
-    1 row selected (0.137 seconds)
-
-Note that this query produces _one column for all of the data_ that is nested
-inside the `topping` segment of the file. The query as written does not unpack
-the `id` and `type` name/value pairs. Also note the use of an alias for the
-column name. (Without the alias, the default column name would be `EXPR$0`.)
-
-Some JSON files store arrays within arrays. If your data has this
-characteristic, you can probe into the inner array by using the following
-notation: `[n][n]`
-
-For example, assume that a segment of the JSON file looks like this:
-
-    ...
-    group:
-    [
-      [1,2,3],
-      [4,5,6],
-      [7,8,9]
-    ]
-    ...
-
-The following query would return `6` (the _third_ value of the _second_ inner
-array).
-
-`select group[1][2]`
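-
-A complete form of the query might look like the following sketch, where the file path is hypothetical and back ticks guard the group keyword:
-
-    -- nested.json is a hypothetical file holding the group array shown above
-    select tbl.`group`[1][2] from dfs.`/Users/brumsby/drill/nested.json` as tbl;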
\ No newline at end of file

http://git-wip-us.apache.org/repos/asf/drill/blob/d959a210/_docs/drill-docs/query/query-complex/005-query4-sel-multiple.md
----------------------------------------------------------------------
diff --git a/_docs/drill-docs/query/query-complex/005-query4-sel-multiple.md 
b/_docs/drill-docs/query/query-complex/005-query4-sel-multiple.md
deleted file mode 100644
index 832094e..0000000
--- a/_docs/drill-docs/query/query-complex/005-query4-sel-multiple.md
+++ /dev/null
@@ -1,24 +0,0 @@
----
-title: "Query 4: Selecting Multiple Columns Within Nested Data"
-parent: "Query Complex Data"
----
-The following query goes one step further to extract the JSON data, selecting
-specific `id` and `type` data values _as individual columns_ from inside the
-`topping` array. This query is similar to the previous query, but it returns
-the `id` and `type` values as separate columns.
-
-    0: jdbc:drill:zk=local> select tbl.topping[3].id as record, tbl.topping[3].type as first_topping
-    from dfs.`/Users/brumsby/drill/donuts.json` as tbl;
-    +------------+----------------+
-    |   record   | first_topping  |
-    +------------+----------------+
-    | 5007       | Powdered Sugar |
-    +------------+----------------+
-    1 row selected (0.133 seconds)
-
-This query also introduces a typical requirement for queries against nested
-data: the use of a table alias (named `tbl` in this example). Without the table
-alias, the query would return an error because the parser would assume that `id`
-is a column inside a table named `topping`. As in all standard SQL queries,
-`select tbl.col` means that `tbl` is the name of an existing table (at least for
-the duration of the query) and `col` is a column that exists in that table.
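-
-The same pattern extends to any number of nested values. For example, this sketch returns the first two toppings from the sample record side by side:
-
-    select tbl.topping[0].type as first, tbl.topping[1].type as second
-    from dfs.`/Users/brumsby/drill/donuts.json` as tbl;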
\ No newline at end of file
