Modified: drill/site/trunk/content/drill/docs/planning-and-execution-options/index.html URL: http://svn.apache.org/viewvc/drill/site/trunk/content/drill/docs/planning-and-execution-options/index.html?rev=1673293&r1=1673292&r2=1673293&view=diff ============================================================================== --- drill/site/trunk/content/drill/docs/planning-and-execution-options/index.html (original) +++ drill/site/trunk/content/drill/docs/planning-and-execution-options/index.html Mon Apr 13 21:52:51 2015 @@ -85,19 +85,59 @@ planning and execution options:</p> you set the options at the session level unless you want the setting to persist across all sessions.</p> -<p>The following table contains planning and execution options that you can set -at the system or session level:</p> +<p>The summary of system options lists default values. The following descriptions provide more detail on some of these options:</p> -<table ><tbody><tr><th >Option name</th><th >Default value</th><th >Description</th></tr><tr><td valign="top" colspan="1" >exec.errors.verbose</td><td valign="top" colspan="1" ><p>false</p></td><td valign="top" colspan="1" ><p>This option enables or disables the verbose message that Drill returns when a query fails. When enabled, Drill provides additional information about failed queries.</p></td></tr><tr><td valign="top" colspan="1" ><span>exec.max_hash_table_size</span></td><td valign="top" colspan="1" >1073741824</td><td valign="top" colspan="1" ><span>The default maximum size for hash tables.</span></td></tr><tr><td valign="top" colspan="1" >exec.min_hash_table_size</td><td valign="top" colspan="1" >65536</td><td valign="top" colspan="1" >The default starting size for hash tables. Increasing this size is useful for very large aggregations or joins when you have large amounts of memory for Drill to use. Drill can spend a lot of time resizing the hash table as it finds new data. 
If you have large data sets, you can increase this hash table size to increase performance.</td></tr><tr><td valign="top" colspan="1" >planner.add_producer_consumer</td><td valign="top" colspan="1" ><p>false</p><p> </p></td><td valign="top" colspan="1" ><p>This option enables or disables a secondary reading thread that works out of band of the rest of the scanning fragment to prefetch data from disk. <span style="line-height: 1.4285715;background-color: transparent;">If you interact with a certain type of storage medium that is slow or does not prefetch much data, this option tells Drill to add a producer consumer reading thread to the operation. Drill can then assign one thread that focuses on a single reading fragment. </span></p><p>If Drill is using memory, you can disable this option to get better performance. If Drill is using disk space, you should enable this option and set a reasonable queue size for the planner.producer_consumer_queue_size option.</p></td></tr><tr><td valign="top" colspan="1" >planner.broadcast_threshold</td><td valign="top" colspan="1" >1000000</td><td valign="top" colspan="1" ><span style="color: rgb(34,34,34);">Threshold, in terms of a number of rows, that determines whether a broadcast join is chosen for a query. Regardless of the setting of the broadcast_join option (enabled or disabled), a broadcast join is not chosen unless the right side of the join is estimated to contain fewer rows than this threshold. The intent of this option is to avoid broadcasting too many rows for join purposes. Broadcasting involves sending data across nodes and is a network-intensive operation. 
(The "right side" of the join, which may itself be a join or simply a table, is determined by cost-based optimizations and heuristics during physical planning.)</span></td></tr><tr><td valign="top" colspan="1" ><p>planner.enable_broadcast_join<br />planner.enable_hashagg<br />planner.enable_hashjoin<br />planner.enable_mergejoin<br />planner.enable_multiphase_agg<br />planner.enable_streamagg</p></td><td valign="top" colspan="1" >true</td><td valign="top" colspan="1" ><p>These options enable or disable specific aggregation and join operators for queries. These operators are all enabled by default and in general should not be disabled.</p><p>Hash aggregation and hash join are hash-based operations. Streaming aggregation and merge join are sort-based operations. Both hash-based and sort-based operations consume memory; however, currently, hash-based operations do not spill to disk as needed, but the sort-based operations do. If large hash operations do not fit in memory on your system, you may need to disable these operations. Queries will continue to run, using alternative plans.</p></td></tr><tr><td valign="top" colspan="1" >planner.producer_consumer_queue_size</td><td valign="top" colspan="1" >10</td><td valign="top" colspan="1" >Determines how much data to prefetch from disk (in record batches) out of band of query execution. 
The larger the queue size, the greater the amount of memory that the queue and overall query execution consumes.</td></tr><tr><td valign="top" colspan="1" >planner.slice_target</td><td valign="top" colspan="1" >100000</td><td valign="top" colspan="1" >The number of records manipulated within a fragment before Drill parallelizes them.</td></tr><tr><td valign="top" colspan="1" ><p>planner.width.max_per_node</p><p> </p></td><td valign="top" colspan="1" ><p>The default depends on the number of cores on each node.</p></td><td valign="top" colspan="1" ><p>In this context "width" refers to fanout or distribution potential: the ability to run a query in parallel across the cores on a node and the nodes on a cluster.</p><p><span>A physical plan consists of intermediate operations, known as query "fragments," that run concurrently, yielding opportunities for parallelism above and below each exchange operator in the plan. An exchange operator represents a breakpoint in the execution flow where processing can be distributed. 
For example, a single-process scan of a file may flow into an exchange operator, followed by a multi-process aggregation fragment.</span><span> </span></p><p>The maximum width per node defines the maximum degree of parallelism for any fragment of a query, but the setting applies at the level of a single node in the cluster.</p><p>The <em>default</em> maximum degree of parallelism per node is calculated as follows, with the theoretical maximum automatically scaled back (and rounded down) so that only 70% of the actual available capacity is taken into account:</p> -<script type="syntaxhighlighter" class="theme: Default; brush: java; gutter: false"><![CDATA[number of active drillbits (typically one per node) -* number of cores per node -* 0.7]]></script> -<p>For example, on a single-node test system with 2 cores and hyper-threading enabled:</p><script type="syntaxhighlighter" class="theme: Default; brush: java; gutter: false"><![CDATA[1 * 4 * 0.7 = 3]]></script> -<p>When you modify the default setting, you can supply any meaningful number. The system does not automatically scale down your setting.</p></td></tr><tr><td valign="top" colspan="1" >planner.width.max_per_query</td><td valign="top" colspan="1" >1000</td><td valign="top" colspan="1" ><p>The max_per_query value also sets the maximum degree of parallelism for any given stage of a query, but the setting applies to the query as executed by the whole cluster (multiple nodes). 
In effect, the actual maximum width per query is the <em>minimum of two values</em>:</p> -<script type="syntaxhighlighter" class="theme: Default; brush: java; gutter: false"><![CDATA[min((number of nodes * width.max_per_node), width.max_per_query)]]></script> -<p>For example, on a 4-node cluster where <span><code>width.max_per_node</code> is set to 6 and </span><span><code>width.max_per_query</code> is set to 30:</span></p> -<script type="syntaxhighlighter" class="theme: Default; brush: java; gutter: false"><![CDATA[min((4 * 6), 30) = 24]]></script> -<p>In this case, the effective maximum width per query is 24, not 30.</p></td></tr><tr><td valign="top" colspan="1" >store.format</td><td valign="top" colspan="1" > </td><td valign="top" colspan="1" >Output format for data that is written to tables with the CREATE TABLE AS (CTAS) command.</td></tr><tr><td valign="top" colspan="1" >store.json.all_text_mode</td><td valign="top" colspan="1" ><p>false</p></td><td valign="top" colspan="1" ><p>This option enables or disables text mode. When enabled, Drill reads everything in JSON as a text object instead of trying to interpret data types. This allows complicated JSON to be read using CASE and CAST.</p></td></tr><tr><td valign="top" >store.parquet.block-size</td><td valign="top" ><p>536870912</p></td><td valign="top" >T<span style="color: rgb(34,34,34);">arget size for a parquet row group, which should be equal to or less than the configured HDFS block size. </span></td></tr></tbody></table> +<ul> +<li>exec.min_hash_table_size</li> +</ul> + +<p>The default starting size for hash tables. Increasing this size is useful for very large aggregations or joins when you have large amounts of memory for Drill to use. Drill can spend a lot of time resizing the hash table as it finds new data. 
If you have large data sets, you can increase this hash table size to increase performance.</p> + +<ul> +<li>planner.add_producer_consumer</li> +</ul> + +<p>This option enables or disables a secondary reading thread that works out of band of the rest of the scanning fragment to prefetch data from disk. If you interact with a certain type of storage medium that is slow or does not prefetch much data, this option tells Drill to add a producer consumer reading thread to the operation. Drill can then assign one thread that focuses on a single reading fragment. If Drill is using memory, you can disable this option to get better performance. If Drill is using disk space, you should enable this option and set a reasonable queue size for the planner.producer_consumer_queue_size option.</p> + +<ul> +<li>planner.broadcast_threshold</li> +</ul> + +<p>Threshold, in terms of a number of rows, that determines whether a broadcast join is chosen for a query. Regardless of the setting of the broadcast_join option (enabled or disabled), a broadcast join is not chosen unless the right side of the join is estimated to contain fewer rows than this threshold. The intent of this option is to avoid broadcasting too many rows for join purposes. Broadcasting involves sending data across nodes and is a network-intensive operation. (The "right side" of the join, which may itself be a join or simply a table, is determined by cost-based optimizations and heuristics during physical planning.)</p> + +<ul> +<li>planner.enable_broadcast_join, planner.enable_hashagg, planner.enable_hashjoin, planner.enable_mergejoin, planner.enable_multiphase_agg, planner.enable_streamagg</li> +</ul> + +<p>These options enable or disable specific aggregation and join operators for queries. These operators are all enabled by default and in general should not be disabled.</p><p>Hash aggregation and hash join are hash-based operations. Streaming aggregation and merge join are sort-based operations. 
Both hash-based and sort-based operations consume memory; however, currently, hash-based operations do not spill to disk as needed, but the sort-based operations do. If large hash operations do not fit in memory on your system, you may need to disable these operations. Queries will continue to run, using alternative plans.</p> + +<ul> +<li>planner.producer_consumer_queue_size</li> +</ul> + +<p>Determines how much data to prefetch from disk (in record batches) out of band of query execution. The larger the queue size, the greater the amount of memory that the queue and overall query execution consumes.</p> + +<ul> +<li>planner.width.max_per_node</li> +</ul> + +<p>In this context <em>width</em> refers to fanout or distribution potential: the ability to run a query in parallel across the cores on a node and the nodes on a cluster. A physical plan consists of intermediate operations, known as query "fragments," that run concurrently, yielding opportunities for parallelism above and below each exchange operator in the plan. An exchange operator represents a breakpoint in the execution flow where processing can be distributed. For example, a single-process scan of a file may flow into an exchange operator, followed by a multi-process aggregation fragment.</p> + +<p>The maximum width per node defines the maximum degree of parallelism for any fragment of a query, but the setting applies at the level of a single node in the cluster. The <em>default</em> maximum degree of parallelism per node is calculated as follows, with the theoretical maximum automatically scaled back (and rounded down) so that only 70% of the actual available capacity is taken into account: number of active drillbits (typically one per node) * number of cores per node * 0.7</p> + +<p>For example, on a single-node test system with 2 cores and hyper-threading enabled: 1 * 4 * 0.7 = 3</p> + +<p>When you modify the default setting, you can supply any meaningful number. 
The system does not automatically scale down your setting.</p> + +<ul> +<li>planner.width.max_per_query</li> +</ul> + +<p>The max_per_query value also sets the maximum degree of parallelism for any given stage of a query, but the setting applies to the query as executed by the whole cluster (multiple nodes). In effect, the actual maximum width per query is the <em>minimum of two values</em>: min((number of nodes * width.max_per_node), width.max_per_query)</p> + +<p>For example, on a 4-node cluster where <code>width.max_per_node</code> is set to 6 and <code>width.max_per_query</code> is set to 30: min((4 * 6), 30) = 24</p> + +<p>In this case, the effective maximum width per query is 24, not 30.</p> </div>
Modified: drill/site/trunk/content/drill/docs/start-up-options/index.html URL: http://svn.apache.org/viewvc/drill/site/trunk/content/drill/docs/start-up-options/index.html?rev=1673293&r1=1673292&r2=1673293&view=diff ============================================================================== --- drill/site/trunk/content/drill/docs/start-up-options/index.html (original) +++ drill/site/trunk/content/drill/docs/start-up-options/index.html Mon Apr 13 21:52:51 2015 @@ -106,10 +106,30 @@ file tells Drill to scan that JAR file o <p>You can configure start-up options for each Drillbit in the <code>drill-override.conf</code> file located in Drill's <code>/conf</code> directory.</p> -<p>You may want to configure the following start-up options that control certain -behaviors in Drill:</p> +<p>The summary of start-up options, also known as boot options, lists default values. The following descriptions provide more detail on key options that are frequently reconfigured:</p> -<p><table ><tbody><tr><th >Option</th><th >Default Value</th><th >Description</th></tr><tr><td valign="top" >drill.exec.sys.store.provider</td><td valign="top" >ZooKeeper</td><td valign="top" >Defines the persistent storage (PStore) provider. The PStore holds configuration and profile data. For more information about PStores, see <a href="/docs/persistent-configuration-storage" rel="nofollow">Persistent Configuration Storage</a>.</td></tr><tr><td valign="top" >drill.exec.buffer.size</td><td valign="top" > </td><td valign="top" >Defines the amount of memory available, in terms of record batches, to hold data on the downstream side of an operation. Drill pushes data downstream as quickly as possible to make data immediately available. This requires Drill to use memory to hold the data pending operations. When data on a downstream operation is required, that data is immediately available so Drill does not have to go over the network to process it. 
Providing more memory to this option increases the speed at which Drill completes a query.</td></tr><tr><td valign="top" >drill.exec.sort.external.directories<br />drill.exec.sort.external.fs</td><td valign="top" > </td><td valign="top" >These options control spooling. The drill.exec.sort.external.directories option tells Drill which directory to use when spooling. The drill.exec.sort.external.fs option tells Drill which file system to use when spooling beyond memory files. <span style="line-height: 1.4285715;background-color: transparent;"> </span>Drill uses a spool and sort operation for beyond memory operations. The sorting operation is designed to spool to a Hadoop file system. The default Hadoop file system is a local file system in the /tmp directory. Spooling performance (both writing and reading back from it) is constrained by the file system. <span style="line-height: 1.4285715;background-color: transparent;"> </span>For MapR clusters, use MapReduce volumes or set up local volumes to use for spooling purposes. Volumes improve performance and stripe data across as many disks as possible.</td></tr><tr><td valign="top" colspan="1" >drill.exec.debug.error_on_leak</td><td valign="top" colspan="1" >True</td><td valign="top" colspan="1" >Determines how Drill behaves when memory leaks occur during a query. By default, this option is enabled so that queries fail when memory leaks occur. If you disable the option, Drill issues a warning when a memory leak occurs and completes the query.</td></tr><tr><td valign="top" colspan="1" >drill.exec.zk.connect</td><td valign="top" colspan="1" >localhost:2181</td><td valign="top" colspan="1" >Provides Drill with the ZooKeeper quorum to use to connect to data sources. Change this setting to point to the ZooKeeper quorum that you want Drill to use. 
You must configure this option on each Drillbit node.</td></tr><tr><td valign="top" colspan="1" >drill.exec.cluster-id</td><td valign="top" colspan="1" >my_drillbit_cluster</td><td valign="top" colspan="1" >Identifies the cluster that corresponds with the ZooKeeper quorum indicated. It also provides Drill with the name of the cluster used during UDP multicast. You must change the default cluster-id if there are multiple clusters on the same subnet. If you do not change the ID, the clusters will try to connect to each other to create one cluster.</td></tr></tbody></table></div></p> +<ul> +<li>drill.exec.sys.store.provider.class</li> +</ul> + +<p>Defines the persistent storage (PStore) provider. The <a href="/docs/persistent-configuration-storage">PStore</a> holds configuration and profile data.</p> + +<ul> +<li>drill.exec.buffer.size</li> +</ul> + +<p>Defines the amount of memory available, in terms of record batches, to hold data on the downstream side of an operation. Drill pushes data downstream as quickly as possible to make data immediately available. This requires Drill to use memory to hold the data pending operations. When data on a downstream operation is required, that data is immediately available so Drill does not have to go over the network to process it. Providing more memory to this option increases the speed at which Drill completes a query.</p> + +<ul> +<li>drill.exec.sort.external.spill.directories</li> +</ul> + +<p>Tells Drill which directory to use when spooling. Drill uses a spool and sort operation for beyond memory operations. The sorting operation is designed to spool to a Hadoop file system. The default Hadoop file system is a local file system in the /tmp directory. Spooling performance (both writing and reading back from it) is constrained by the file system. For MapR clusters, use MapReduce volumes or set up local volumes to use for spooling purposes. 
Volumes improve performance and stripe data across as many disks as possible.</p> + +<ul> +<li>drill.exec.zk.connect</li> +</ul> + +<p>Provides Drill with the ZooKeeper quorum to use to connect to data sources. Change this setting to point to the ZooKeeper quorum that you want Drill to use. You must configure this option on each Drillbit node.</p> </div> Modified: drill/site/trunk/content/drill/docs/string-manipulation/index.html URL: http://svn.apache.org/viewvc/drill/site/trunk/content/drill/docs/string-manipulation/index.html?rev=1673293&r1=1673292&r2=1673293&view=diff ============================================================================== --- drill/site/trunk/content/drill/docs/string-manipulation/index.html (original) +++ drill/site/trunk/content/drill/docs/string-manipulation/index.html Mon Apr 13 21:52:51 2015 @@ -76,67 +76,67 @@ </tr> </thead><tbody> <tr> -<td><a href="/docs/string-manipulation#byte_substr">BYTE_SUBSTR(string, start [, length])</a></td> +<td><a href="/docs/string-manipulation#byte_substr">BYTE_SUBSTR</a></td> <td>byte array or text</td> </tr> <tr> -<td><a href="/docs/string-manipulation#char_length">CHAR_LENGTH(string) or character_length(string)</a></td> +<td><a href="/docs/string-manipulation#char_length">CHAR_LENGTH</a></td> <td>int</td> </tr> <tr> -<td><a href="/docs/string-manipulation#concat">CONCAT(str "any" [, str "any" [, ...] 
])</a></td> +<td><a href="/docs/string-manipulation#concat">CONCAT</a></td> <td>text</td> </tr> <tr> -<td><a href="/docs/string-manipulation#initcap">INITCAP(string)</a></td> +<td><a href="/docs/string-manipulation#initcap">INITCAP</a></td> <td>text</td> </tr> <tr> -<td><a href="/docs/string-manipulation#length">LENGTH(string [, encoding name ])</a></td> +<td><a href="/docs/string-manipulation#length">LENGTH</a></td> <td>int</td> </tr> <tr> -<td><a href="/docs/string-manipulation#lower">LOWER(string)</a></td> +<td><a href="/docs/string-manipulation#lower">LOWER</a></td> <td>text</td> </tr> <tr> -<td><a href="/docs/string-manipulation#lpad">LPAD(string, length [, fill])</a></td> +<td><a href="/docs/string-manipulation#lpad">LPAD</a></td> <td>text</td> </tr> <tr> -<td><a href="/docs/string-manipulation#ltrim">LTRIM(string [, characters])</a></td> +<td><a href="/docs/string-manipulation#ltrim">LTRIM</a></td> <td>text</td> </tr> <tr> -<td><a href="/docs/string-manipulation#position">POSITION(substring in string)</a></td> +<td><a href="/docs/string-manipulation#position">POSITION</a></td> <td>int</td> </tr> <tr> -<td><a href="/docs/string-manipulation#regexp_replace">REGEXP_REPLACE(string, pattern, replacement</a></td> +<td><a href="/docs/string-manipulation#regexp_replace">REGEXP_REPLACE</a></td> <td>text</td> </tr> <tr> -<td><a href="/docs/string-manipulation#rpad">RPAD(string, length [, fill ])</a></td> +<td><a href="/docs/string-manipulation#rpad">RPAD</a></td> <td>text</td> </tr> <tr> -<td><a href="/docs/string-manipulation#rtrim">RTRIM(string [, characters])</a></td> +<td><a href="/docs/string-manipulation#rtrim">RTRIM</a></td> <td>text</td> </tr> <tr> -<td><a href="/docs/string-manipulation#strpos">STRPOS(string, substring)</a></td> +<td><a href="/docs/string-manipulation#strpos">STRPOS</a></td> <td>int</td> </tr> <tr> -<td><a href="/docs/string-manipulation#substr">SUBSTR(string, from [, count])</a></td> +<td><a 
href="/docs/string-manipulation#substr">SUBSTR</a></td> <td>text</td> </tr> <tr> -<td><a href="/docs/string-manipulation#trim">TRIM([position_option] [characters] from string)</a></td> +<td><a href="/docs/string-manipulation#trim">TRIM</a></td> <td>text</td> </tr> <tr> -<td><a href="/docs/string-manipulation#upper">UPPER(string)</a></td> +<td><a href="/docs/string-manipulation#upper">UPPER</a></td> <td>text</td> </tr> </tbody></table> @@ -182,8 +182,12 @@ SELECT CONVERT_FROM(BYTE_SUBSTR(row_key, <p>Returns the number of characters in a string.</p> <h3 id="syntax">Syntax</h3> -<div class="highlight"><pre><code class="language-text" data-lang="text">( CHAR_LENGTH | CHARACTER_LENGTH ) (string); +<div class="highlight"><pre><code class="language-text" data-lang="text">CHAR_LENGTH(string); </code></pre></div> +<h3 id="usage-notes">Usage Notes</h3> + +<p>You can use the alias CHARACTER_LENGTH.</p> + <h3 id="example">Example</h3> <div class="highlight"><pre><code class="language-text" data-lang="text">SELECT CHAR_LENGTH('Drill rocks') FROM sys.version; @@ -296,12 +300,12 @@ SELECT LENGTH(row_key, 'UTF8') F </code></pre></div> <h2 id="ltrim">LTRIM</h2> -<p>Removes the longest string having only characters specified in the second argument string from the beginning of the string.</p> +<p>Removes any characters from the beginning of string1 that match the characters in string2. 
</p> <h3 id="syntax">Syntax</h3> -<div class="highlight"><pre><code class="language-text" data-lang="text">LTRIM(string, string); +<div class="highlight"><pre><code class="language-text" data-lang="text">LTRIM(string1, string2); </code></pre></div> -<h3 id="example">Example</h3> +<h3 id="examples">Examples</h3> <div class="highlight"><pre><code class="language-text" data-lang="text">SELECT LTRIM('Apache Drill', 'Apache ') FROM sys.version; +------------+ @@ -310,6 +314,15 @@ SELECT LENGTH(row_key, 'UTF8') F | Drill | +------------+ 1 row selected (0.131 seconds) + +SELECT LTRIM('A powerful tool Apache Drill', 'Apache ') FROM sys.version; + ++------------+ +| EXPR$0 | ++------------+ +| owerful tool Apache Drill | ++------------+ +1 row selected (0.07 seconds) </code></pre></div> <h2 id="position">POSITION</h2> @@ -330,7 +343,7 @@ SELECT LENGTH(row_key, 'UTF8') F </code></pre></div> <h2 id="regexp_replace">REGEXP_REPLACE</h2> -<p>Substitutes new text for substrings that match POSIX regular expression patterns.</p> +<p>Substitutes new text for substrings that match <a href="http://www.regular-expressions.info/posix.html">POSIX regular expression patterns</a>.</p> <h3 id="syntax">Syntax</h3> <div class="highlight"><pre><code class="language-text" data-lang="text">REGEXP_REPLACE(source_char, pattern, replacement); @@ -343,38 +356,26 @@ SELECT LENGTH(row_key, 'UTF8') F <h3 id="examples">Examples</h3> -<p>Flatten and replace a's with b's in this JSON data.</p> -<div class="highlight"><pre><code class="language-text" data-lang="text">{"id":1,"strs":["abc","acd"]} -{"id":2,"strs":["ade","aef"]} - -SELECT id, REGEXP_REPLACE(FLATTEN(strs), 'a','b') FROM tmp.`regex-flatten.json`; - -+------------+------------+ -| id | EXPR$1 | -+------------+------------+ -| 1 | bbc | -| 1 | bcd | -| 2 | bde | -| 2 | bef | -+------------+------------+ -4 rows selected (0.186 seconds) -</code></pre></div> -<p>Use the regular expression a. 
in the same query to replace all a's and the subsequent character.</p> -<div class="highlight"><pre><code class="language-text" data-lang="text">SELECT ID, REGEXP_REPLACE(FLATTEN(strs), 'a.','b') FROM tmp.`regex-flatten.json`; - -+------------+------------+ -| id | EXPR$1 | -+------------+------------+ -| 1 | bc | -| 1 | bd | -| 2 | be | -| 2 | bf | -+------------+------------+ -4 rows selected (0.132 seconds) +<p>Replace a's with b's in this string.</p> +<div class="highlight"><pre><code class="language-text" data-lang="text">SELECT REGEXP_REPLACE('abc, acd, ade, aef', 'a', 'b') FROM sys.version; ++------------+ +| EXPR$0 | ++------------+ +| bbc, bcd, bde, bef | ++------------+ +</code></pre></div> +<p>Use the regular expression <em>a</em> followed by a period (.) in the same query to replace all a's and the subsequent character.</p> +<div class="highlight"><pre><code class="language-text" data-lang="text">SELECT REGEXP_REPLACE('abc, acd, ade, aef', 'a.','b') FROM sys.version; ++------------+ +| EXPR$0 | ++------------+ +| bc, bd, be, bf | ++------------+ +1 row selected (0.099 seconds) </code></pre></div> <h2 id="rpad">RPAD</h2> -<p>Pads the string to the length specified by appending the fill or a space. Truncates the string if longer than the specified length.</p> +<p>Pads the string to the length specified. Appends the fill text you specify; if you provide no fill text, or not enough to achieve the length, Drill uses spaces for the fill. 
Truncates the string if longer than the specified length.</p> <h3 id="syntax">Syntax</h3> <div class="highlight"><pre><code class="language-text" data-lang="text">RPAD (string, length [, fill text]); @@ -390,12 +391,12 @@ SELECT id, REGEXP_REPLACE(FLATTEN(strs), </code></pre></div> <h2 id="rtrim">RTRIM</h2> -<p>Removes the longest string having only characters specified in the second argument string from the end of the string.</p> +<p>Removes any characters from the end of string1 that match the characters in string2. </p> <h3 id="syntax">Syntax</h3> -<div class="highlight"><pre><code class="language-text" data-lang="text">RTRIM(string, string); +<div class="highlight"><pre><code class="language-text" data-lang="text">RTRIM(string1, string2); </code></pre></div> -<h3 id="example">Example</h3> +<h3 id="examples">Examples</h3> <div class="highlight"><pre><code class="language-text" data-lang="text">SELECT RTRIM('Apache Drill', 'Drill ') FROM sys.version; +------------+ @@ -404,6 +405,14 @@ SELECT id, REGEXP_REPLACE(FLATTEN(strs), | Apache | +------------+ 1 row selected (0.135 seconds) + +SELECT RTRIM('1.0 Apache Tomcat 1.0', 'Drill 1.0') from sys.version; ++------------+ +| EXPR$0 | ++------------+ +| 1.0 Apache Tomcat | ++------------+ +1 row selected (0.088 seconds) </code></pre></div> <h2 id="strpos">STRPOS</h2> @@ -429,7 +438,11 @@ SELECT id, REGEXP_REPLACE(FLATTEN(strs), <h3 id="syntax">Syntax</h3> -<p>(SUBSTR | SUBSTRING)(string, x, y)</p> +<p>SUBSTR(string, x, y)</p> + +<h3 id="usage-notes">Usage Notes</h3> + +<p>You can use the alias SUBSTRING for this function.</p> <h3 id="example">Example</h3> <div class="highlight"><pre><code class="language-text" data-lang="text">SELECT SUBSTR('Apache Drill', 8) FROM sys.version; @@ -452,10 +465,10 @@ SELECT SUBSTR('Apache Drill', 3, </code></pre></div> <h2 id="trim">TRIM</h2> -<p>Removes the longest string having only the characters from the beginning, end, or both ends of the string.</p> +<p>Removes any characters from 
the beginning, end, or both sides of string2 that match the characters in string1. </p> <h3 id="syntax">Syntax</h3> -<div class="highlight"><pre><code class="language-text" data-lang="text">TRIM ([leading | trailing | both] [characters] from string) +<div class="highlight"><pre><code class="language-text" data-lang="text">TRIM ([leading | trailing | both] [string1] from string2) </code></pre></div> <h3 id="example">Example</h3> <div class="highlight"><pre><code class="language-text" data-lang="text">SELECT TRIM(trailing 'l' from 'Drill') FROM sys.version; @@ -465,6 +478,22 @@ SELECT SUBSTR('Apache Drill', 3, | Dri | +------------+ 1 row selected (0.172 seconds) + +SELECT TRIM(both 'l' from 'long live Drill') FROM sys.version; ++------------+ +| EXPR$0 | ++------------+ +| ong live Dri | ++------------+ +1 row selected (0.087 seconds) + +SELECT TRIM(leading 'l' from 'long live Drill') FROM sys.version; ++------------+ +| EXPR$0 | ++------------+ +| ong live Drill | ++------------+ +1 row selected (0.077 seconds) </code></pre></div> <h2 id="upper">UPPER</h2> Modified: drill/site/trunk/content/drill/feed.xml URL: http://svn.apache.org/viewvc/drill/site/trunk/content/drill/feed.xml?rev=1673293&r1=1673292&r2=1673293&view=diff ============================================================================== --- drill/site/trunk/content/drill/feed.xml (original) +++ drill/site/trunk/content/drill/feed.xml Mon Apr 13 21:52:51 2015 @@ -6,8 +6,8 @@ </description> <link>/</link> <atom:link href="/feed.xml" rel="self" type="application/rss+xml"/> - <pubDate>Mon, 13 Apr 2015 13:19:06 -0700</pubDate> - <lastBuildDate>Mon, 13 Apr 2015 13:19:06 -0700</lastBuildDate> + <pubDate>Mon, 13 Apr 2015 14:31:36 -0700</pubDate> + <lastBuildDate>Mon, 13 Apr 2015 14:31:36 -0700</lastBuildDate> <generator>Jekyll v2.5.2</generator> <item>