This is an automated email from the ASF dual-hosted git repository. bridgetb pushed a commit to branch asf-site in repository https://gitbox.apache.org/repos/asf/drill-site.git
The following commit(s) were added to refs/heads/asf-site by this push: new 5366c79 doc edits 5366c79 is described below commit 5366c79e7a8167405c91bc8d50b2b156ef618a84 Author: Bridget Bevens <bbev...@maprtech.com> AuthorDate: Mon Jul 16 18:28:21 2018 -0700 doc edits --- docs/kafka-storage-plugin/index.html | 44 +++++++++++++-------------------- docs/running-drill-on-docker/index.html | 10 ++++---- feed.xml | 4 +-- 3 files changed, 24 insertions(+), 34 deletions(-) diff --git a/docs/kafka-storage-plugin/index.html b/docs/kafka-storage-plugin/index.html index 30014da..90b10ad 100644 --- a/docs/kafka-storage-plugin/index.html +++ b/docs/kafka-storage-plugin/index.html @@ -1313,37 +1313,27 @@ </code></pre></div> <h2 id="filter-pushdown-support">Filter Pushdown Support</h2> -<p>Prior to Drill 1.14, Drill scanned all of the data in a topic before applying filters. Starting in Drill 1.14, the Drill kafka storage plugin supports filter pushdown for query conditions on the following Kafka metadata fields in messages: </p> +<p>Pushing down filters to the Kafka data source reduces the number of messages that Drill scans and significantly decreases query time. Prior to Drill 1.14, Drill scanned all of the data in a topic before applying filters. Starting in Drill 1.14, Drill transforms the filter conditions on metadata fields to limit the range of offsets scanned from the topic. </p> + +<p>The Drill Kafka storage plugin supports filter pushdown for query conditions on the following Kafka metadata fields in messages: </p> <ul> -<li>kafkaPartitionId<br></li> -<li>kafkaMsgOffset<br></li> -<li>kafkaMsgTimestamp<br></li> +<li><p><strong>kafkaPartitionId</strong><br> +Conditions on the kafkaPartitionId metadata field limit the number of partitions that Drill scans, which is useful for data exploration. 
+Drill can push down filters when a query contains the following conditions on the kafkaPartitionId metadata field:<br> +=, >, >=, <, <=</p></li> +<li><p><strong>kafkaMsgOffset</strong><br> +Drill can push down filters when a query contains the following conditions on the kafkaMsgOffset metadata field:<br> +=, >, >=, <, <= </p></li> +<li><p><strong>kafkaMsgTimestamp</strong><br> +The kafkaMsgTimestamp field maps to the timestamp stored for each Kafka message. Drill can push down filters when a query contains the following conditions on the kafkaMsgTimestamp metadata field:<br> +=, >, >= </p></li> </ul> -<p>Pushing down filters to the Kafka data source reduces the number of messages that Drill scans and significantly decrease query time. Drill transforms the filter conditions on the metadata fields to limit the range of offsets scanned from the topic. </p> - -<p>The following table describes filter pushdown support for Kafka metadata message fields: </p> - -<table><thead> -<tr> -<th>Metadata Field</th> -<th>Description</th> -</tr> -</thead><tbody> -<tr> -<td>kafkaPartitionId</td> -<td>Conditions on the kafkaPartitionId metadata field limit the number of partitions that Drill scans, which is useful for data exploration. Drill can push down filters when a query contains these conditions on the kafkaPartitionId metadata field: =, >, >=, <, <=</td> -</tr> -<tr> -<td>kafkaMsgOffset</td> -<td>Drill can push down filters when a query contains these conditions on the kafkaMsgOffset metadata field: =, >, >=, <, <=</td> -</tr> -<tr> -<td>kafkaMsgTimestamp</td> -<td>The kafkaMsgTimestamp field maps to the timestamp stored for each Kafka message. 
Drill can push down filters when a query contains these conditions on the kafkaMsgTimestamp metadata field: =, >, >= Kafka exposes the following <a href="https://kafka.apache.org/11/javadoc/org/apache/kafka/clients/consumer/MockConsumer.html">Consumer API</a> to obtain the earliest offset for a given timestamp value: <code>public java.util.Map<TopicPartition,OffsetAndTimestamp& [...] -</tr> -</tbody></table> +<p>Kafka exposes the following Consumer API to obtain the earliest offset for a given timestamp value: </p> +<div class="highlight"><pre><code class="language-text" data-lang="text"> public java.util.Map<TopicPartition,OffsetAndTimestamp> offsetsForTimes(java.util.Map<TopicPartition,java.lang.Long> timestampsToSearch) +</code></pre></div> +<p>This API is used to determine the startOffset for each partition in a topic. Note that the timestamps may not appear in increasing order when reading from a Kafka topic if you have defined the timestamp for a message. However, the API returns the first offset (from the beginning of a topic partition) where the timestamp is greater than or equal to the timestamp requested. Therefore, Drill does not support pushdown on < or <= because an earlier timestamp may exist beyond endOffsetcomp [...] <h2 id="enabling-and-configuring-the-kafka-storage-plugin">Enabling and Configuring the Kafka Storage Plugin</h2> diff --git a/docs/running-drill-on-docker/index.html b/docs/running-drill-on-docker/index.html index dab4714..061b949 100644 --- a/docs/running-drill-on-docker/index.html +++ b/docs/running-drill-on-docker/index.html @@ -1240,7 +1240,7 @@ </div> - Mar 18, 2018 + Jul 17, 2018 <link href="/css/docpage.css" rel="stylesheet" type="text/css"> @@ -1284,19 +1284,19 @@ </tr> <tr> <td><code>--name</code></td> -<td>Identifies the container. If you do not use this option to identify a name for the container, the daemon generates a random string name for you. 
When you use this option to identify a container name, you can use the name to reference the container within a Docker network in foreground or detached mode.</td> +<td>Identifies the container. If you do not use this option to identify a name for the container, the daemon generates a container ID for you. When you use this option to identify a container name, you can use the name to reference the container within a Docker network in foreground or detached mode.</td> </tr> <tr> -<td>-p</td> +<td><code>-p</code></td> <td>The TCP port for the Drill Web UI. If needed, you can change this port using the <code>drill.exec.http.port</code> <a href="/docs/start-up-options/">start-up option</a>.</td> </tr> <tr> <td><code>drill/apache-drill:<version></code></td> -<td>The GitHub repository and tag. In the following example, <code>drill/apache-drill</code> is the repository and <code>1.14.0</code> is the tag: <code>drill/apache-drill:1.14.0</code> The tag correlates with the version of Drill. When a new version of Drill is available, you can use the new version as the tag.</td> +<td>The Docker Hub repository and tag. In the following example, <code>drill/apache-drill</code> is the repository and <code>1.14.0</code> is the tag: <code>drill/apache-drill:1.14.0</code> The tag correlates with the version of Drill. 
When a new version of Drill is available, you can use the new version as the tag.</td> </tr> <tr> <td><code>bin/bash</code></td> -<td>Runs the Drill image downloaded from the <code>apache-drill</code> repository.</td> +<td>Connects to the Drill container using a bash shell.</td> </tr> </tbody></table> diff --git a/feed.xml b/feed.xml index a1a2750..f0332b2 100644 --- a/feed.xml +++ b/feed.xml @@ -6,8 +6,8 @@ </description> <link>/</link> <atom:link href="/feed.xml" rel="self" type="application/rss+xml"/> - <pubDate>Mon, 16 Jul 2018 17:52:29 -0700</pubDate> - <lastBuildDate>Mon, 16 Jul 2018 17:52:29 -0700</lastBuildDate> + <pubDate>Mon, 16 Jul 2018 18:26:36 -0700</pubDate> + <lastBuildDate>Mon, 16 Jul 2018 18:26:36 -0700</lastBuildDate> <generator>Jekyll v2.5.2</generator> <item>
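The `offsetsForTimes` contract described in the kafka-storage-plugin edits above (return the first offset, scanning from the beginning of the partition, whose timestamp is greater than or equal to the requested timestamp, even though later offsets may carry earlier timestamps) can be illustrated with a small sketch. This is a conceptual illustration only, not Drill's or Kafka's implementation; the class name, the single hard-coded partition, and the sample timestamps are all invented for the example:

```java
import java.util.OptionalLong;

public class OffsetsForTimesSketch {
    // Hypothetical per-message timestamps for one partition, indexed by offset.
    // They are deliberately NOT monotonically increasing, which can happen when
    // producers set message timestamps explicitly.
    static final long[] TIMESTAMPS = {100L, 250L, 180L, 300L, 260L};

    // Mimics the offsetsForTimes contract for a single partition: return the
    // earliest offset (from offset 0) whose timestamp is >= the target, or
    // empty if no such message exists.
    static OptionalLong earliestOffsetFor(long targetTs) {
        for (int offset = 0; offset < TIMESTAMPS.length; offset++) {
            if (TIMESTAMPS[offset] >= targetTs) {
                return OptionalLong.of(offset);
            }
        }
        return OptionalLong.empty();
    }

    public static void main(String[] args) {
        // Offset 1 (ts=250) is the first with ts >= 200, yet offset 2 carries
        // the EARLIER timestamp 180 -- this is exactly why pushdown for < or <=
        // on kafkaMsgTimestamp is unsafe: a qualifying message can still appear
        // past the computed end offset.
        System.out.println(earliestOffsetFor(200L).getAsLong()); // 1
        System.out.println(earliestOffsetFor(1000L).isPresent()); // false
    }
}
```

The asymmetry visible here is what restricts kafkaMsgTimestamp pushdown to =, >, and >= in the plugin documentation above.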