Repository: drill-site
Updated Branches:
  refs/heads/asf-site 03410ebc9 -> f3a38dbdd
Website update Project: http://git-wip-us.apache.org/repos/asf/drill-site/repo Commit: http://git-wip-us.apache.org/repos/asf/drill-site/commit/f3a38dbd Tree: http://git-wip-us.apache.org/repos/asf/drill-site/tree/f3a38dbd Diff: http://git-wip-us.apache.org/repos/asf/drill-site/diff/f3a38dbd Branch: refs/heads/asf-site Commit: f3a38dbdd3e8921d2996fcaba1d6f32caea576e0 Parents: 03410eb Author: Kris Hahn <[email protected]> Authored: Wed Jan 6 16:42:27 2016 -0800 Committer: Kris Hahn <[email protected]> Committed: Wed Jan 6 16:42:27 2016 -0800 ---------------------------------------------------------------------- README.md | 22 +++++++++- docs/aol-search/index.html | 57 ++++--------------------- docs/hive-storage-plugin/index.html | 71 ++++++++++++++++++++++---------- feed.xml | 4 +- 4 files changed, 80 insertions(+), 74 deletions(-) ---------------------------------------------------------------------- http://git-wip-us.apache.org/repos/asf/drill-site/blob/f3a38dbd/README.md ---------------------------------------------------------------------- diff --git a/README.md b/README.md index 49df1fa..2d3f0b7 100644 --- a/README.md +++ b/README.md @@ -11,6 +11,8 @@ jekyll serve --config _config.yml,_config-prod.yml ``` Note that you can skip the first two commands (and only run `jekyll serve`) if you haven't changed the title or path of any of the documentation pages. +## One Time Setup for Last-Modified-Date + To automatically add the last-modified-on date, a one-time local setup is required: 1. In your cloned directory of Drill, in drill/.git/hooks, create a file named pre-commit (no extension) that contains this script: @@ -28,7 +30,7 @@ To automatically add the last-modified-on date, a one-time local setup is requir chmod +x pre-commit -In addition to the title: and parent:, you now need to add date: to the front matter of any file you create. 
For example: +On any page you create, in addition to the title: and parent:, you now need to add date: to the front matter of any file you create. For example: --- title: "Configuring Multitenant Resources" @@ -36,7 +38,23 @@ In addition to the title: and parent:, you now need to add date: to the front ma date: --- -Do not fill in or alter the date: field. Jekyll and git take care of that when you commit the file. +Do not fill in or alter the date: field. Jekyll and git take care of that when you commit the file. + +## One Time Setup for Redirecting gh-pages + +Locally install the jekyll-redirect-from gem: + + gem install jekyll-redirect-from + +On any page you want to redirect, add the redirect_to: and the URL to the front matter. For example: + + --- + title: "Configuring Multitenant Resources" + parent: "Configuring a Multitenant Cluster" + date: + redirect_to: + - http://<new_url> + --- # Compiling the Website http://git-wip-us.apache.org/repos/asf/drill-site/blob/f3a38dbd/docs/aol-search/index.html ---------------------------------------------------------------------- diff --git a/docs/aol-search/index.html b/docs/aol-search/index.html index 4981dfd..2200e16 100644 --- a/docs/aol-search/index.html +++ b/docs/aol-search/index.html @@ -1038,59 +1038,20 @@ </div> - + Jan 6, 2016 <link href="/css/docpage.css" rel="stylesheet" type="text/css"> <div class="int_text" align="left"> - <h2 id="quick-stats">Quick Stats</h2> - -<p>The <a href="http://en.wikipedia.org/wiki/AOL_search_data_leak">AOL Search dataset</a> is -a collection of real query log data that is based on real users.</p> - -<h2 id="the-data-source">The Data Source</h2> - -<p>The dataset consists of 20M Web queries from 650k users over a period of three -months, 440MB in total and available <a href="http://zola.di.unipi.it/smalltext/datasets.html">for -download</a>. 
The format used in -the dataset is:</p> -<div class="highlight"><pre><code class="language-text" data-lang="text">AnonID, Query, QueryTime, ItemRank, ClickURL -</code></pre></div> -<p>... with:</p> - -<ul> -<li>AnonID, an anonymous user ID number.</li> -<li>Query, the query issued by the user, case shifted with most punctuation removed.</li> -<li>QueryTime, the time at which the query was submitted for search.</li> -<li>ItemRank, if the user clicked on a search result, the rank of the item on which they clicked is listed.</li> -<li><a href="http://www.dietkart.com/">ClickURL</a>, if the user clicked on a search result, the domain portion of the URL in the clicked result is listed.</li> -</ul> - -<p>Each line in the data represents one of two types of events</p> - -<ul> -<li>A query that was NOT followed by the user clicking on a result item.</li> -<li>A click through on an item in the result list returned from a query.</li> -</ul> - -<p>In the first case (query only) there is data in only the first three columns, -in the second case (click through), there is data in all five columns. For -click through events, the query that preceded the click through is included. 
-Note that if a user clicked on more than one result in the list returned from -a single query, there will be TWO lines in the data to represent the two -events.</p> - -<h2 id="the-queries">The Queries</h2> - -<p>Interesting queries, for example</p> - -<ul> -<li>Users querying for topic X</li> -<li>Users that click on the first (second, third) ranked item</li> -<li>TOP 10 domains searched</li> -<li>TOP 10 domains clicked at</li> -</ul> + <p><!DOCTYPE html> +<meta charset="utf-8"> +<title>Redirecting...</title> +<link rel="canonical" href="http://gregsadetsky.com/aol-data"> +<meta http-equiv="refresh" content="0; url=http://gregsadetsky.com/aol-data"> +<h1>Redirecting...</h1> +<a href="http://gregsadetsky.com/aol-data">Click here if you are not redirected.</a> +<script>location="<a href="http://gregsadetsky.com/aol-data">http://gregsadetsky.com/aol-data</a>"</script></p> http://git-wip-us.apache.org/repos/asf/drill-site/blob/f3a38dbd/docs/hive-storage-plugin/index.html ---------------------------------------------------------------------- diff --git a/docs/hive-storage-plugin/index.html b/docs/hive-storage-plugin/index.html index 3f37bdb..2f83c52 100644 --- a/docs/hive-storage-plugin/index.html +++ b/docs/hive-storage-plugin/index.html @@ -1038,7 +1038,7 @@ </div> - + Jan 6, 2016 <link href="/css/docpage.css" rel="stylesheet" type="text/css"> @@ -1049,38 +1049,50 @@ using custom SerDes or InputFormat/OutputFormat, all nodes running Drillbits must have the SerDes or InputFormat/OutputFormat <code>JAR</code> files in the <code><drill_installation_directory>/jars/3rdparty</code> folder.</p> -<h2 id="hive-remote-metastore-configuration">Hive Remote Metastore Configuration</h2> +<p>You can run Hive queries in the following ways by configuring the Hive storage plugin as described in this document:</p> + +<ul> +<li><a href="/docs/hive-storage-plugin/#connect-drill-to-the-hive-remote-metastore-directly">Connect Drill to the Hive remote metastore</a><br></li> +<li><a 
href="/docs/hive-storage-plugin/#connect-to-the-hive-embedded-metastore">Connect to the Hive embedded metastore</a><br></li> +</ul> + +<p>You update the Hive storage plugin by selecting the <strong>Storage tab</strong> on the <a href="/docs/plugin-configuration-basics/#using-the-drill-web-console">Drill Web Console</a>. From the list of disabled storage plugins in the Drill Web Console, click <strong>Update</strong> next to <code>hive</code>. The default Hive storage plugin configuration appears as follows:</p> +<div class="highlight"><pre><code class="language-text" data-lang="text"> { + "type": "hive", + "enabled": false, + "configProps": { + "hive.metastore.uris": "", + "javax.jdo.option.ConnectionURL": "jdbc:derby:;databaseName=../sample-data/drill_hive_db;create=true", + "hive.metastore.warehouse.dir": "/tmp/drill_hive_wh", + "fs.default.name": "file:///", + "hive.metastore.sasl.enabled": "false" + } + } +</code></pre></div> +<h2 id="connect-drill-to-the-hive-remote-metastore">Connect Drill to the Hive Remote Metastore</h2> <p>The Hive metastore runs as a separate service outside -of Hive. Drill communicates with the Hive metastore through Thrift. The -metastore service communicates with the Hive database over JDBC. Point Drill -to the Hive metastore service address, and provide the connection parameters -in a Hive storage plugin configuration to configure a connection to Drill.</p> +of Hive. Drill can query the Hive metastore through Thrift. The +metastore service communicates with the Hive database over JDBC. </p> + +<p>Follow the steps in the next section to point Drill +to the Hive metastore service address. Provide the connection parameters +in a Hive storage plugin configuration to configure a connection to Drill. At this point, if you query data sources that Drill supports other than HBase (or MapR), you are finished configuring the Hive storage plugin. If you query HBase using Hive, you need to add ZooKeeper quorum and port properties. 
The HBaseStorageHandler requires these properties. Drill discovers HBase services using these properties. If you use the HBase storage plugin, the ZooKeeper quorum and port properties in the Hive storage plugin are the same as those in the HBase storage plugin, assuming you want to use the same HBase database. </p> <div class="admonition note"> <p class="first admonition-title">Note</p> <p class="last">Verify that the Hive metastore service is running before you register the Hive metastore. </p> </div> -<p>To register a remote Hive metastore with Drill:</p> +<h3 id="hive-remote-metastore-configuration">Hive Remote Metastore Configuration</h3> + +<p>To connect Drill to a remote Hive metastore:</p> <ol> -<li>Issue the following command to start the Hive metastore service on the system specified in the <code>hive.metastore.uris</code>: +<li>Issue the following command to start the Hive metastore service on the system specified in the <code>hive.metastore.uris</code>:<br> <code>hive --service metastore</code></li> <li>In the <a href="/docs/plugin-configuration-basics/#using-the-drill-web-console">Drill Web Console</a>, select the <strong>Storage</strong> tab.</li> -<li><p>In the list of disabled storage plugins in the Drill Web Console, click <strong>Update</strong> next to <code>hive</code>. 
The Hive storage plugin configuration appears:</p> -<div class="highlight"><pre><code class="language-text" data-lang="text">{ - "type": "hive", - "enabled": false, - "configProps": { - "hive.metastore.uris": "", - "javax.jdo.option.ConnectionURL": "jdbc:derby:;databaseName=../sample-data/drill_hive_db;create=true", - "hive.metastore.warehouse.dir": "/tmp/drill_hive_wh", - "fs.default.name": "file:///", - "hive.metastore.sasl.enabled": "false" - } -} -</code></pre></div></li> +<li>In the list of disabled storage plugins in the Drill Web Console, click <strong>Update</strong> next to <code>hive</code>.<br></li> <li><p>In the configuration window, add the <code>Thrift URI</code> and port to <code>hive.metastore.uris</code>. For example:</p> <div class="highlight"><pre><code class="language-text" data-lang="text"> ... "configProps": { @@ -1098,16 +1110,31 @@ in a Hive storage plugin configuration to configure a connection to Drill.</p> } } </code></pre></div></li> +<li><p>If you do not query HBase, skip this step. If you query HBase, in the configuration window, add the names of the ZooKeeper quorum hosts and the ZooKeeper port, for example 2181. </p> +<div class="highlight"><pre><code class="language-text" data-lang="text">{ + "type": "hive", + "enabled": false, + "configProps": { + . + . + . + "hbase.zookeeper.quorum": "zkhost1,zkhost2,zkhost3", + "hbase.zookeeper.property.clientPort": "2181" + } +} +</code></pre></div></li> <li><p>Click <strong>Enable</strong>. </p></li> </ol> -<h2 id="hive-embedded-metastore-configuration">Hive Embedded Metastore Configuration</h2> +<h2 id="connect-to-the-hive-embedded-metastore">Connect to the Hive embedded metastore</h2> <p>The Hive metastore configuration is embedded within the Drill process. Configure an embedded metastore only in a cluster that runs a single Drillbit and only for testing purposes. 
Do not embed the Hive metastore in production systems.</p> <p>Provide the metastore database configuration settings in the Drill Web Console. Before you configure an embedded Hive metastore, verify that the driver you use to connect to the Hive metastore is in the Drill classpath located in <code>/<drill installation directory>/lib/.</code> If the driver is not there, copy the driver to <code>/<drill installation directory>/lib</code> on the Drill node. For more information about storage types and configurations, refer to <a href="https://cwiki.apache.org/confluence/display/Hive/AdminManual+MetastoreAdmin">"Hive Metastore Administration"</a>.</p> +<h3 id="hive-embedded-metastore-configuration">Hive Embedded Metastore Configuration</h3> + <p>To configure an embedded Hive metastore, complete the following steps:</p> http://git-wip-us.apache.org/repos/asf/drill-site/blob/f3a38dbd/feed.xml ---------------------------------------------------------------------- diff --git a/feed.xml b/feed.xml index 6dda078..5082900 100644 --- a/feed.xml +++ b/feed.xml @@ -6,8 +6,8 @@ </description> <link>/</link> <atom:link href="/feed.xml" rel="self" type="application/rss+xml"/> - <pubDate>Tue, 05 Jan 2016 15:21:34 -0800</pubDate> - <lastBuildDate>Tue, 05 Jan 2016 15:21:34 -0800</lastBuildDate> + <pubDate>Wed, 06 Jan 2016 16:37:25 -0800</pubDate> + <lastBuildDate>Wed, 06 Jan 2016 16:37:25 -0800</lastBuildDate> <generator>Jekyll v2.5.3</generator> <item>
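The ZooKeeper step in the hive-storage-plugin diff above adds two string properties, `hbase.zookeeper.quorum` and `hbase.zookeeper.property.clientPort`, to `configProps`. A minimal sketch of producing that configuration programmatically follows, assuming the default plugin JSON shown in the diff. The helper name `with_hbase_zookeeper` is hypothetical and not part of Drill; the host names are the placeholder values from the diff. Generating the text with a JSON serializer and parsing it back is one way to catch misplaced quotes or colons before pasting the result into the Drill Web Console.

```python
import json

# Default Hive storage plugin configuration, copied from the diff above.
DEFAULT_HIVE_PLUGIN = {
    "type": "hive",
    "enabled": False,
    "configProps": {
        "hive.metastore.uris": "",
        "javax.jdo.option.ConnectionURL": (
            "jdbc:derby:;databaseName=../sample-data/drill_hive_db;create=true"
        ),
        "hive.metastore.warehouse.dir": "/tmp/drill_hive_wh",
        "fs.default.name": "file:///",
        "hive.metastore.sasl.enabled": "false",
    },
}


def with_hbase_zookeeper(plugin, quorum_hosts, client_port=2181):
    """Return a copy of the plugin config with the ZooKeeper properties added.

    Hypothetical helper for illustration only; it is not a Drill API.
    """
    updated = json.loads(json.dumps(plugin))  # deep copy via a JSON round trip
    props = updated["configProps"]
    # Both values are strings, matching the example in the diff.
    props["hbase.zookeeper.quorum"] = ",".join(quorum_hosts)
    props["hbase.zookeeper.property.clientPort"] = str(client_port)
    return updated


config = with_hbase_zookeeper(
    DEFAULT_HIVE_PLUGIN, ["zkhost1", "zkhost2", "zkhost3"]
)
text = json.dumps(config, indent=2)
json.loads(text)  # raises ValueError if the text is not valid JSON
print(text)
```

The printed text still has `"enabled": false` and an empty `hive.metastore.uris`; the Thrift URI and the **Enable** step from the procedure are left to the reader.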
