Modified: drill/site/trunk/content/drill/docs/planning-and-execution-options/index.html URL: http://svn.apache.org/viewvc/drill/site/trunk/content/drill/docs/planning-and-execution-options/index.html?rev=1673293&r1=1673292&r2=1673293&view=diff ============================================================================== --- drill/site/trunk/content/drill/docs/planning-and-execution-options/index.html (original) +++ drill/site/trunk/content/drill/docs/planning-and-execution-options/index.html Mon Apr 13 21:52:51 2015 @@ -85,19 +85,59 @@ planning and execution options:</p> you set the options at the session level unless you want the setting to persist across all sessions.</p> -<p>The following table contains planning and execution options that you can set -at the system or session level:</p> +<p>The summary of system options lists default values. The following descriptions provide more detail on some of these options:</p> -<table ><tbody><tr><th >Option name</th><th >Default value</th><th >Description</th></tr><tr><td valign="top" colspan="1" >exec.errors.verbose</td><td valign="top" colspan="1" ><p>false</p></td><td valign="top" colspan="1" ><p>This option enables or disables the verbose message that Drill returns when a query fails. When enabled, Drill provides additional information about failed queries.</p></td></tr><tr><td valign="top" colspan="1" ><span>exec.max_hash_table_size</span></td><td valign="top" colspan="1" >1073741824</td><td valign="top" colspan="1" ><span>The default maximum size for hash tables.</span></td></tr><tr><td valign="top" colspan="1" >exec.min_hash_table_size</td><td valign="top" colspan="1" >65536</td><td valign="top" colspan="1" >The default starting size for hash tables. Increasing this size is useful for very large aggregations or joins when you have large amounts of memory for Drill to use. Drill can spend a lot of time resizing the hash table as it finds new data. 
If you have large data sets, you can increase this hash table size to increase performance.</td></tr><tr><td valign="top" colspan="1" >planner.add_producer_consumer</td><td valign="top" colspan="1" ><p>false</p><p> </p></td><td valign="top" colspan="1" ><p>This option enables or disables a secondary reading thread that works out of band of the rest of the scanning fragment to prefetch data from disk. <span style="line-height: 1.4285715;background-color: transparent;">If you interact with a certain type of storage medium that is slow or does not prefetch much data, this option tells Drill to add a producer consumer reading thread to the operation. Drill can then assign one thread that focuses on a single reading fragment. </span></p><p>If Drill is using memory, you can disable this option to get better performance. If Drill is using disk space, you should enable this option and set a reasonable queue size for the planner.producer_consumer_queue_size option.</p></td></tr><tr><td valign="top" colspan="1" >planner.broadcast_threshold</td><td valign="top" colspan="1" >1000000</td><td valign="top" colspan="1" ><span style="color: rgb(34,34,34);">Threshold, in terms of a number of rows, that determines whether a broadcast join is chosen for a query. Regardless of the setting of the broadcast_join option (enabled or disabled), a broadcast join is not chosen unless the right side of the join is estimated to contain fewer rows than this threshold. The intent of this option is to avoid broadcasting too many rows for join purposes. Broadcasting involves sending data across nodes and is a network-intensive operation. 
(The "right side" of the join, which may itself be a join or simply a table, is determined by cost-based optimizations and heuristics during physical planning.)</span></td></tr><tr><td valign="top" colspan="1" ><p>planner.enable_broadcast_join<br />planner.enable_hashagg<br />planner.enable_hashjoin<br />planner.enable_mergejoin<br />planner.enable_multiphase_agg<br />planner.enable_streamagg</p></td><td valign="top" colspan="1" >true</td><td valign="top" colspan="1" ><p>These options enable or disable specific aggregation and join operators for queries. These operators are all enabled by default and in general should not be disabled.</p><p>Hash aggregation and hash join are hash-based operations. Streaming aggregation and merge join are sort-based operations. Both hash-based and sort-based operations consume memory; however, currently, hash-based operations do not spill to disk as needed, but the sort-based operations do. If large hash operations do not fit in memory on your system, you may need to disable these operations. Queries will continue to run, using alternative plans.</p></td></tr><tr><td valign="top" colspan="1" >planner.producer_consumer_queue_size</td><td valign="top" colspan="1" >10</td><td valign="top" colspan="1" >Determines how much data to prefetch from disk (in record batches) out of band of query execution. 
The larger the queue size, the greater the amount of memory that the queue and overall query execution consumes.</td></tr><tr><td valign="top" colspan="1" >planner.slice_target</td><td valign="top" colspan="1" >100000</td><td valign="top" colspan="1" >The number of records manipulated within a fragment before Drill parallelizes them.</td></tr><tr><td valign="top" colspan="1" ><p>planner.width.max_per_node</p><p> </p></td><td valign="top" colspan="1" ><p>The default depends on the number of cores on each node.</p></td><td valign="top" colspan="1" ><p>In this context "width" refers to fanout or distribution potential: the ability to run a query in parallel across the cores on a node and the nodes on a cluster.</p><p><span>A physical plan consists of intermediate operations, known as query "fragments," that run concurrently, yielding opportunities for parallelism above and below each exchange operator in the plan. An exchange operator represents a breakpoint in the execution flow where processing can be distributed. 
For example, a single-process scan of a file may flow into an exchange operator, followed by a multi-process aggregation fragment.</span><span> </span></p><p>The maximum width per node defines the maximum degree of parallelism for any fragment of a query, but the setting applies at the level of a single node in the cluster.</p><p>The <em>default</em> maximum degree of parallelism per node is calculated as follows, with the theoretical maximum automatically scaled back (and rounded down) so that only 70% of the actual available capacity is taken into account:</p> -<script type="syntaxhighlighter" class="theme: Default; brush: java; gutter: false"><![CDATA[number of active drillbits (typically one per node) -* number of cores per node -* 0.7]]></script> -<p>For example, on a single-node test system with 2 cores and hyper-threading enabled:</p><script type="syntaxhighlighter" class="theme: Default; brush: java; gutter: false"><![CDATA[1 * 4 * 0.7 = 3]]></script> -<p>When you modify the default setting, you can supply any meaningful number. The system does not automatically scale down your setting.</p></td></tr><tr><td valign="top" colspan="1" >planner.width.max_per_query</td><td valign="top" colspan="1" >1000</td><td valign="top" colspan="1" ><p>The max_per_query value also sets the maximum degree of parallelism for any given stage of a query, but the setting applies to the query as executed by the whole cluster (multiple nodes). 
In effect, the actual maximum width per query is the <em>minimum of two values</em>:</p> -<script type="syntaxhighlighter" class="theme: Default; brush: java; gutter: false"><![CDATA[min((number of nodes * width.max_per_node), width.max_per_query)]]></script> -<p>For example, on a 4-node cluster where <span><code>width.max_per_node</code> is set to 6 and </span><span><code>width.max_per_query</code> is set to 30:</span></p> -<script type="syntaxhighlighter" class="theme: Default; brush: java; gutter: false"><![CDATA[min((4 * 6), 30) = 24]]></script> -<p>In this case, the effective maximum width per query is 24, not 30.</p></td></tr><tr><td valign="top" colspan="1" >store.format</td><td valign="top" colspan="1" > </td><td valign="top" colspan="1" >Output format for data that is written to tables with the CREATE TABLE AS (CTAS) command.</td></tr><tr><td valign="top" colspan="1" >store.json.all_text_mode</td><td valign="top" colspan="1" ><p>false</p></td><td valign="top" colspan="1" ><p>This option enables or disables text mode. When enabled, Drill reads everything in JSON as a text object instead of trying to interpret data types. This allows complicated JSON to be read using CASE and CAST.</p></td></tr><tr><td valign="top" >store.parquet.block-size</td><td valign="top" ><p>536870912</p></td><td valign="top" >T<span style="color: rgb(34,34,34);">arget size for a parquet row group, which should be equal to or less than the configured HDFS block size. </span></td></tr></tbody></table> +<ul> +<li>exec.min_hash_table_size</li> +</ul> + +<p>The default starting size for hash tables. Increasing this size is useful for very large aggregations or joins when you have large amounts of memory for Drill to use. Drill can spend a lot of time resizing the hash table as it finds new data. 
If you have large data sets, you can increase this hash table size to increase performance.</p> + +<ul> +<li>planner.add_producer_consumer</li> +</ul> + +<p>This option enables or disables a secondary reading thread that works out of band of the rest of the scanning fragment to prefetch data from disk. If you interact with a certain type of storage medium that is slow or does not prefetch much data, this option tells Drill to add a producer consumer reading thread to the operation. Drill can then assign one thread that focuses on a single reading fragment. If Drill is using memory, you can disable this option to get better performance. If Drill is using disk space, you should enable this option and set a reasonable queue size for the planner.producer_consumer_queue_size option.</p> + +<ul> +<li>planner.broadcast_threshold</li> +</ul> + +<p>Threshold, in terms of a number of rows, that determines whether a broadcast join is chosen for a query. Regardless of the setting of the broadcast_join option (enabled or disabled), a broadcast join is not chosen unless the right side of the join is estimated to contain fewer rows than this threshold. The intent of this option is to avoid broadcasting too many rows for join purposes. Broadcasting involves sending data across nodes and is a network-intensive operation. (The "right side" of the join, which may itself be a join or simply a table, is determined by cost-based optimizations and heuristics during physical planning.)</p> + +<ul> +<li>planner.enable_broadcast_join, planner.enable_hashagg, planner.enable_hashjoin, planner.enable_mergejoin, planner.enable_multiphase_agg, planner.enable_streamagg</li> +</ul> + +<p>These options enable or disable specific aggregation and join operators for queries. These operators are all enabled by default and in general should not be disabled.</p><p>Hash aggregation and hash join are hash-based operations. Streaming aggregation and merge join are sort-based operations. 
Both hash-based and sort-based operations consume memory; however, currently, hash-based operations do not spill to disk as needed, but the sort-based operations do. If large hash operations do not fit in memory on your system, you may need to disable these operations. Queries will continue to run, using alternative plans.</p> + +<ul> +<li>planner.producer_consumer_queue_size</li> +</ul> + +<p>Determines how much data to prefetch from disk (in record batches) out of band of query execution. The larger the queue size, the greater the amount of memory that the queue and overall query execution consumes.</p> + +<ul> +<li>planner.width.max_per_node</li> +</ul> + +<p>In this context <em>width</em> refers to fanout or distribution potential: the ability to run a query in parallel across the cores on a node and the nodes on a cluster. A physical plan consists of intermediate operations, known as query "fragments," that run concurrently, yielding opportunities for parallelism above and below each exchange operator in the plan. An exchange operator represents a breakpoint in the execution flow where processing can be distributed. For example, a single-process scan of a file may flow into an exchange operator, followed by a multi-process aggregation fragment.</p> + +<p>The maximum width per node defines the maximum degree of parallelism for any fragment of a query, but the setting applies at the level of a single node in the cluster. The <em>default</em> maximum degree of parallelism per node is calculated as follows, with the theoretical maximum automatically scaled back (and rounded down) so that only 70% of the actual available capacity is taken into account: number of active drillbits (typically one per node) * number of cores per node * 0.7</p> + +<p>For example, on a single-node test system with 2 cores and hyper-threading enabled: 1 * 4 * 0.7 = 3</p> + +<p>When you modify the default setting, you can supply any meaningful number. 
The system does not automatically scale down your setting.</p> + +<ul> +<li>planner.width.max_per_query</li> +</ul> + +<p>The max_per_query value also sets the maximum degree of parallelism for any given stage of a query, but the setting applies to the query as executed by the whole cluster (multiple nodes). In effect, the actual maximum width per query is the <em>minimum of two values</em>: min((number of nodes * width.max_per_node), width.max_per_query)</p> + +<p>For example, on a 4-node cluster where <code>width.max_per_node</code> is set to 6 and <code>width.max_per_query</code> is set to 30: min((4 * 6), 30) = 24</p> + +<p>In this case, the effective maximum width per query is 24, not 30.</p> </div>
Modified: drill/site/trunk/content/drill/docs/start-up-options/index.html URL: http://svn.apache.org/viewvc/drill/site/trunk/content/drill/docs/start-up-options/index.html?rev=1673293&r1=1673292&r2=1673293&view=diff ============================================================================== --- drill/site/trunk/content/drill/docs/start-up-options/index.html (original) +++ drill/site/trunk/content/drill/docs/start-up-options/index.html Mon Apr 13 21:52:51 2015 @@ -106,10 +106,30 @@ file tells Drill to scan that JAR file o <p>You can configure start-up options for each Drillbit in the <code>drill-override.conf</code> file located in Drill's <code>/conf</code> directory.</p> -<p>You may want to configure the following start-up options that control certain -behaviors in Drill:</p> +<p>The summary of start-up options, also known as boot options, lists default values. The following descriptions provide more detail on key options that are frequently reconfigured:</p> -<p><table ><tbody><tr><th >Option</th><th >Default Value</th><th >Description</th></tr><tr><td valign="top" >drill.exec.sys.store.provider</td><td valign="top" >ZooKeeper</td><td valign="top" >Defines the persistent storage (PStore) provider. The PStore holds configuration and profile data. For more information about PStores, see <a href="/docs/persistent-configuration-storage" rel="nofollow">Persistent Configuration Storage</a>.</td></tr><tr><td valign="top" >drill.exec.buffer.size</td><td valign="top" > </td><td valign="top" >Defines the amount of memory available, in terms of record batches, to hold data on the downstream side of an operation. Drill pushes data downstream as quickly as possible to make data immediately available. This requires Drill to use memory to hold the data pending operations. When data on a downstream operation is required, that data is immediately available so Drill does not have to go over the network to process it. 
Providing more memory to this option increases the speed at which Drill completes a query.</td></tr><tr><td valign="top" >drill.exec.sort.external.directories<br />drill.exec.sort.external.fs</td><td valign="top" > </td><td valign="top" >These options control spooling. The drill.exec.sort.external.directories option tells Drill which directory to use when spooling. The drill.exec.sort.external.fs option tells Drill which file system to use when spooling beyond memory files. <span style="line-height: 1.4285715;background-color: transparent;"> </span>Drill uses a spool and sort operation for beyond memory operations. The sorting operation is designed to spool to a Hadoop file system. The default Hadoop file system is a local file system in the /tmp directory. Spooling performance (both writing and reading back from it) is constrained by the file system. <span style="line-height: 1.4285715;background-color: transparent;"> </span>For MapR clusters, use MapReduce volumes or set up local volumes to use for spooling purposes. Volumes improve performance and stripe data across as many disks as possible.</td></tr><tr><td valign="top" colspan="1" >drill.exec.debug.error_on_leak</td><td valign="top" colspan="1" >True</td><td valign="top" colspan="1" >Determines how Drill behaves when memory leaks occur during a query. By default, this option is enabled so that queries fail when memory leaks occur. If you disable the option, Drill issues a warning when a memory leak occurs and completes the query.</td></tr><tr><td valign="top" colspan="1" >drill.exec.zk.connect</td><td valign="top" colspan="1" >localhost:2181</td><td valign="top" colspan="1" >Provides Drill with the ZooKeeper quorum to use to connect to data sources. Change this setting to point to the ZooKeeper quorum that you want Drill to use. 
You must configure this option on each Drillbit node.</td></tr><tr><td valign="top" colspan="1" >drill.exec.cluster-id</td><td valign="top" colspan="1" >my_drillbit_cluster</td><td valign="top" colspan="1" >Identifies the cluster that corresponds with the ZooKeeper quorum indicated. It also provides Drill with the name of the cluster used during UDP multicast. You must change the default cluster-id if there are multiple clusters on the same subnet. If you do not change the ID, the clusters will try to connect to each other to create one cluster.</td></tr></tbody></table></div></p> +<ul> +<li>drill.exec.sys.store.provider.class</li> +</ul> + +<p>Defines the persistent storage (PStore) provider. The <a href="/docs/persistent-configuration-storage">PStore</a> holds configuration and profile data.</p> + +<ul> +<li>drill.exec.buffer.size</li> +</ul> + +<p>Defines the amount of memory available, in terms of record batches, to hold data on the downstream side of an operation. Drill pushes data downstream as quickly as possible to make data immediately available. This requires Drill to use memory to hold the data pending operations. When data on a downstream operation is required, that data is immediately available so Drill does not have to go over the network to process it. Providing more memory to this option increases the speed at which Drill completes a query.</p> + +<ul> +<li>drill.exec.sort.external.spill.directories</li> +</ul> + +<p>Tells Drill which directory to use when spooling. Drill uses a spool and sort operation for beyond memory operations. The sorting operation is designed to spool to a Hadoop file system. The default Hadoop file system is a local file system in the /tmp directory. Spooling performance (both writing and reading back from it) is constrained by the file system. For MapR clusters, use MapReduce volumes or set up local volumes to use for spooling purposes. 
Volumes improve performance and stripe data across as many disks as possible.</p> + +<ul> +<li>drill.exec.zk.connect</li> +</ul> + +<p>Provides Drill with the ZooKeeper quorum to use to connect to data sources. Change this setting to point to the ZooKeeper quorum that you want Drill to use. You must configure this option on each Drillbit node.</p> </div> Modified: drill/site/trunk/content/drill/docs/string-manipulation/index.html URL: http://svn.apache.org/viewvc/drill/site/trunk/content/drill/docs/string-manipulation/index.html?rev=1673293&r1=1673292&r2=1673293&view=diff ============================================================================== --- drill/site/trunk/content/drill/docs/string-manipulation/index.html (original) +++ drill/site/trunk/content/drill/docs/string-manipulation/index.html Mon Apr 13 21:52:51 2015 @@ -76,67 +76,67 @@ </tr> </thead><tbody> <tr> -<td><a href="/docs/string-manipulation#byte_substr">BYTE_SUBSTR(string, start [, length])</a></td> +<td><a href="/docs/string-manipulation#byte_substr">BYTE_SUBSTR</a></td> <td>byte array or text</td> </tr> <tr> -<td><a href="/docs/string-manipulation#char_length">CHAR_LENGTH(string) or character_length(string)</a></td> +<td><a href="/docs/string-manipulation#char_length">CHAR_LENGTH</a></td> <td>int</td> </tr> <tr> -<td><a href="/docs/string-manipulation#concat">CONCAT(str "any" [, str "any" [, ...] 
])</a></td> +<td><a href="/docs/string-manipulation#concat">CONCAT</a></td> <td>text</td> </tr> <tr> -<td><a href="/docs/string-manipulation#initcap">INITCAP(string)</a></td> +<td><a href="/docs/string-manipulation#initcap">INITCAP</a></td> <td>text</td> </tr> <tr> -<td><a href="/docs/string-manipulation#length">LENGTH(string [, encoding name ])</a></td> +<td><a href="/docs/string-manipulation#length">LENGTH</a></td> <td>int</td> </tr> <tr> -<td><a href="/docs/string-manipulation#lower">LOWER(string)</a></td> +<td><a href="/docs/string-manipulation#lower">LOWER</a></td> <td>text</td> </tr> <tr> -<td><a href="/docs/string-manipulation#lpad">LPAD(string, length [, fill])</a></td> +<td><a href="/docs/string-manipulation#lpad">LPAD</a></td> <td>text</td> </tr> <tr> -<td><a href="/docs/string-manipulation#ltrim">LTRIM(string [, characters])</a></td> +<td><a href="/docs/string-manipulation#ltrim">LTRIM</a></td> <td>text</td> </tr> <tr> -<td><a href="/docs/string-manipulation#position">POSITION(substring in string)</a></td> +<td><a href="/docs/string-manipulation#position">POSITION</a></td> <td>int</td> </tr> <tr> -<td><a href="/docs/string-manipulation#regexp_replace">REGEXP_REPLACE(string, pattern, replacement</a></td> +<td><a href="/docs/string-manipulation#regexp_replace">REGEXP_REPLACE</a></td> <td>text</td> </tr> <tr> -<td><a href="/docs/string-manipulation#rpad">RPAD(string, length [, fill ])</a></td> +<td><a href="/docs/string-manipulation#rpad">RPAD</a></td> <td>text</td> </tr> <tr> -<td><a href="/docs/string-manipulation#rtrim">RTRIM(string [, characters])</a></td> +<td><a href="/docs/string-manipulation#rtrim">RTRIM</a></td> <td>text</td> </tr> <tr> -<td><a href="/docs/string-manipulation#strpos">STRPOS(string, substring)</a></td> +<td><a href="/docs/string-manipulation#strpos">STRPOS</a></td> <td>int</td> </tr> <tr> -<td><a href="/docs/string-manipulation#substr">SUBSTR(string, from [, count])</a></td> +<td><a 
href="/docs/string-manipulation#substr">SUBSTR</a></td> <td>text</td> </tr> <tr> -<td><a href="/docs/string-manipulation#trim">TRIM([position_option] [characters] from string)</a></td> +<td><a href="/docs/string-manipulation#trim">TRIM</a></td> <td>text</td> </tr> <tr> -<td><a href="/docs/string-manipulation#upper">UPPER(string)</a></td> +<td><a href="/docs/string-manipulation#upper">UPPER</a></td> <td>text</td> </tr> </tbody></table> @@ -182,8 +182,12 @@ SELECT CONVERT_FROM(BYTE_SUBSTR(row_key, <p>Returns the number of characters in a string.</p> <h3 id="syntax">Syntax</h3> -<div class="highlight"><pre><code class="language-text" data-lang="text">( CHAR_LENGTH | CHARACTER_LENGTH ) (string); +<div class="highlight"><pre><code class="language-text" data-lang="text">CHAR_LENGTH(string); </code></pre></div> +<h3 id="usage-notes">Usage Notes</h3> + +<p>You can use the alias CHARACTER_LENGTH.</p> + <h3 id="example">Example</h3> <div class="highlight"><pre><code class="language-text" data-lang="text">SELECT CHAR_LENGTH('Drill rocks') FROM sys.version; @@ -296,12 +300,12 @@ SELECT LENGTH(row_key, 'UTF8') F </code></pre></div> <h2 id="ltrim">LTRIM</h2> -<p>Removes the longest string having only characters specified in the second argument string from the beginning of the string.</p> +<p>Removes any characters from the beginning of string1 that match the characters in string2. 
</p> <h3 id="syntax">Syntax</h3> -<div class="highlight"><pre><code class="language-text" data-lang="text">LTRIM(string, string); +<div class="highlight"><pre><code class="language-text" data-lang="text">LTRIM(string1, string2); </code></pre></div> -<h3 id="example">Example</h3> +<h3 id="examples">Examples</h3> <div class="highlight"><pre><code class="language-text" data-lang="text">SELECT LTRIM('Apache Drill', 'Apache ') FROM sys.version; +------------+ @@ -310,6 +314,15 @@ SELECT LENGTH(row_key, 'UTF8') F | Drill | +------------+ 1 row selected (0.131 seconds) + +SELECT LTRIM('A powerful tool Apache Drill', 'Apache ') FROM sys.version; + ++------------+ +| EXPR$0 | ++------------+ +| owerful tool Apache Drill | ++------------+ +1 row selected (0.07 seconds) </code></pre></div> <h2 id="position">POSITION</h2> @@ -330,7 +343,7 @@ SELECT LENGTH(row_key, 'UTF8') F </code></pre></div> <h2 id="regexp_replace">REGEXP_REPLACE</h2> -<p>Substitutes new text for substrings that match POSIX regular expression patterns.</p> +<p>Substitutes new text for substrings that match <a href="http://www.regular-expressions.info/posix.html">POSIX regular expression patterns</a>.</p> <h3 id="syntax">Syntax</h3> <div class="highlight"><pre><code class="language-text" data-lang="text">REGEXP_REPLACE(source_char, pattern, replacement); @@ -343,38 +356,26 @@ SELECT LENGTH(row_key, 'UTF8') F <h3 id="examples">Examples</h3> -<p>Flatten and replace a's with b's in this JSON data.</p> -<div class="highlight"><pre><code class="language-text" data-lang="text">{"id":1,"strs":["abc","acd"]} -{"id":2,"strs":["ade","aef"]} - -SELECT id, REGEXP_REPLACE(FLATTEN(strs), 'a','b') FROM tmp.`regex-flatten.json`; - -+------------+------------+ -| id | EXPR$1 | -+------------+------------+ -| 1 | bbc | -| 1 | bcd | -| 2 | bde | -| 2 | bef | -+------------+------------+ -4 rows selected (0.186 seconds) -</code></pre></div> -<p>Use the regular expression a. 
in the same query to replace all a's and the subsequent character.</p> -<div class="highlight"><pre><code class="language-text" data-lang="text">SELECT ID, REGEXP_REPLACE(FLATTEN(strs), 'a.','b') FROM tmp.`regex-flatten.json`; - -+------------+------------+ -| id | EXPR$1 | -+------------+------------+ -| 1 | bc | -| 1 | bd | -| 2 | be | -| 2 | bf | -+------------+------------+ -4 rows selected (0.132 seconds) +<p>Replace a's with b's in this string.</p> +<div class="highlight"><pre><code class="language-text" data-lang="text">SELECT REGEXP_REPLACE('abc, acd, ade, aef', 'a', 'b') FROM sys.version; ++------------+ +| EXPR$0 | ++------------+ +| bbc, bcd, bde, bef | ++------------+ +</code></pre></div> +<p>Use the regular expression <em>a</em> followed by a period (.) in the same query to replace all a's and the subsequent character.</p> +<div class="highlight"><pre><code class="language-text" data-lang="text">SELECT REGEXP_REPLACE('abc, acd, ade, aef', 'a.','b') FROM sys.version; ++------------+ +| EXPR$0 | ++------------+ +| bc, bd, be, bf | ++------------+ +1 row selected (0.099 seconds) </code></pre></div> <h2 id="rpad">RPAD</h2> -<p>Pads the string to the length specified by appending the fill or a space. Truncates the string if longer than the specified length.</p> +<p>Pads the string to the length specified. Appends the fill text you specify; if you provide no fill text, or not enough to achieve the length, Drill uses spaces for the fill. 
Truncates the string if longer than the specified length.</p> <h3 id="syntax">Syntax</h3> <div class="highlight"><pre><code class="language-text" data-lang="text">RPAD (string, length [, fill text]); @@ -390,12 +391,12 @@ SELECT id, REGEXP_REPLACE(FLATTEN(strs), </code></pre></div> <h2 id="rtrim">RTRIM</h2> -<p>Removes the longest string having only characters specified in the second argument string from the end of the string.</p> +<p>Removes any characters from the end of string1 that match the characters in string2. </p> <h3 id="syntax">Syntax</h3> -<div class="highlight"><pre><code class="language-text" data-lang="text">RTRIM(string, string); +<div class="highlight"><pre><code class="language-text" data-lang="text">RTRIM(string1, string2); </code></pre></div> -<h3 id="example">Example</h3> +<h3 id="examples">Examples</h3> <div class="highlight"><pre><code class="language-text" data-lang="text">SELECT RTRIM('Apache Drill', 'Drill ') FROM sys.version; +------------+ @@ -404,6 +405,14 @@ SELECT id, REGEXP_REPLACE(FLATTEN(strs), | Apache | +------------+ 1 row selected (0.135 seconds) + +SELECT RTRIM('1.0 Apache Tomcat 1.0', 'Drill 1.0') from sys.version; ++------------+ +| EXPR$0 | ++------------+ +| 1.0 Apache Tomcat | ++------------+ +1 row selected (0.088 seconds) </code></pre></div> <h2 id="strpos">STRPOS</h2> @@ -429,7 +438,11 @@ SELECT id, REGEXP_REPLACE(FLATTEN(strs), <h3 id="syntax">Syntax</h3> -<p>(SUBSTR | SUBSTRING)(string, x, y)</p> +<p>SUBSTR(string, x, y)</p> + +<h3 id="usage-notes">Usage Notes</h3> + +<p>You can use the alias SUBSTRING for this function.</p> <h3 id="example">Example</h3> <div class="highlight"><pre><code class="language-text" data-lang="text">SELECT SUBSTR('Apache Drill', 8) FROM sys.version; @@ -452,10 +465,10 @@ SELECT SUBSTR('Apache Drill', 3, </code></pre></div> <h2 id="trim">TRIM</h2> -<p>Removes the longest string having only the characters from the beginning, end, or both ends of the string.</p> +<p>Removes any characters from 
the beginning, end, or both sides of string2 that match the characters in string1. </p> <h3 id="syntax">Syntax</h3> -<div class="highlight"><pre><code class="language-text" data-lang="text">TRIM ([leading | trailing | both] [characters] from string) +<div class="highlight"><pre><code class="language-text" data-lang="text">TRIM ([leading | trailing | both] [string1] from string2) </code></pre></div> <h3 id="example">Example</h3> <div class="highlight"><pre><code class="language-text" data-lang="text">SELECT TRIM(trailing 'l' from 'Drill') FROM sys.version; @@ -465,6 +478,22 @@ SELECT SUBSTR('Apache Drill', 3, | Dri | +------------+ 1 row selected (0.172 seconds) + +SELECT TRIM(both 'l' from 'long live Drill') FROM sys.version; ++------------+ +| EXPR$0 | ++------------+ +| ong live Dri | ++------------+ +1 row selected (0.087 seconds) + +SELECT TRIM(leading 'l' from 'long live Drill') FROM sys.version; ++------------+ +| EXPR$0 | ++------------+ +| ong live Drill | ++------------+ +1 row selected (0.077 seconds) </code></pre></div> <h2 id="upper">UPPER</h2> Modified: drill/site/trunk/content/drill/feed.xml URL: http://svn.apache.org/viewvc/drill/site/trunk/content/drill/feed.xml?rev=1673293&r1=1673292&r2=1673293&view=diff ============================================================================== --- drill/site/trunk/content/drill/feed.xml (original) +++ drill/site/trunk/content/drill/feed.xml Mon Apr 13 21:52:51 2015 @@ -6,8 +6,8 @@ </description> <link>/</link> <atom:link href="/feed.xml" rel="self" type="application/rss+xml"/> - <pubDate>Mon, 13 Apr 2015 13:19:06 -0700</pubDate> - <lastBuildDate>Mon, 13 Apr 2015 13:19:06 -0700</lastBuildDate> + <pubDate>Mon, 13 Apr 2015 14:31:36 -0700</pubDate> + <lastBuildDate>Mon, 13 Apr 2015 14:31:36 -0700</lastBuildDate> <generator>Jekyll v2.5.2</generator> <item>