http://git-wip-us.apache.org/repos/asf/impala/blob/e8ee827a/docs/topics/impala_misc_functions.xml ---------------------------------------------------------------------- diff --git a/docs/topics/impala_misc_functions.xml b/docs/topics/impala_misc_functions.xml index 1b5bfd4..b2cfa62 100644 --- a/docs/topics/impala_misc_functions.xml +++ b/docs/topics/impala_misc_functions.xml @@ -21,7 +21,13 @@ under the License. <concept id="misc_functions"> <title>Impala Miscellaneous Functions</title> - <titlealts audience="PDF"><navtitle>Miscellaneous Functions</navtitle></titlealts> + + <titlealts audience="PDF"> + + <navtitle>Miscellaneous Functions</navtitle> + + </titlealts> + <prolog> <metadata> <data name="Category" value="Impala"/> @@ -36,23 +42,58 @@ under the License. <conbody> <p> - Impala supports the following utility functions that do not operate on a particular column or data type: + Impala supports the following utility functions that do not operate on a particular column + or data type: </p> + <ul> + <li> + <xref href="#misc_functions/current_database">CURRENT_DATABASE</xref> + </li> + + <li> + <xref href="#misc_functions/effective_user">EFFECTIVE_USER</xref> + </li> + + <li> + <xref href="#misc_functions/pid">PID</xref> + </li> + + <li> + <xref href="#misc_functions/sleep">SLEEP</xref> + </li> + + <li> + <xref href="#misc_functions/user">USER</xref> + </li> + + <li> + <xref href="#misc_functions/uuid">UUID</xref> + </li> + + <li> + <xref href="#misc_functions/version">VERSION</xref> + </li> + + <li> + <xref href="#misc_functions/coordinator">COORDINATOR</xref> + </li> + </ul> + <dl> <dlentry rev="1.3.0" id="current_database"> <dt> - <codeph>current_database()</codeph> + CURRENT_DATABASE() </dt> <dd> - <indexterm audience="hidden">current_database() function</indexterm> - <b>Purpose:</b> Returns the database that the session is currently using, either <codeph>default</codeph> - if no database has been selected, or whatever database the session switched to through a - <codeph>USE</codeph> statement or the <cmdname>impalad</cmdname><codeph>-d</codeph> option. + <b>Purpose:</b> Returns the database that the session is currently using, either + <codeph>default</codeph> if no database has been selected, or whatever database the + session switched to through a <codeph>USE</codeph> statement or the + <cmdname>impalad</cmdname> <codeph>-d</codeph> option. <p> - <b>Return type:</b> <codeph>string</codeph> + <b>Return type:</b> <codeph>STRING</codeph> </p> </dd> @@ -61,16 +102,16 @@ under the License. <dlentry rev="5.4.5" id="effective_user"> <dt> - <codeph>effective_user()</codeph> + EFFECTIVE_USER() </dt> <dd> - <indexterm audience="hidden">effective_user() function</indexterm> - <b>Purpose:</b> Typically returns the same value as <codeph>user()</codeph>, - except if delegation is enabled, in which case it returns the ID of the delegated user. + <b>Purpose:</b> Typically returns the same value as <codeph>USER()</codeph>. If + delegation is enabled, it returns the ID of the delegated user. <p> - <b>Return type:</b> <codeph>string</codeph> + <b>Return type:</b> <codeph>STRING</codeph> </p> + <p> <b>Added in:</b> <keyword keyref="impala225"/> </p> @@ -81,37 +122,38 @@ under the License. <dlentry rev="1.3.0" id="pid"> <dt> - <codeph>pid()</codeph> + PID() </dt> <dd> - <indexterm audience="hidden">pid() function</indexterm> - <b>Purpose:</b> Returns the process ID of the <cmdname>impalad</cmdname> daemon that the session is - connected to. You can use it during low-level debugging, to issue Linux commands that trace, show the - arguments, and so on the <cmdname>impalad</cmdname> process. + <b>Purpose:</b> Returns the process ID of the <cmdname>impalad</cmdname> daemon that + the session is connected to. You can use it during low-level debugging, to issue Linux + commands that trace, show the arguments, and so on the <cmdname>impalad</cmdname> + process. <p> - <b>Return type:</b> <codeph>int</codeph> + <b>Return type:</b> <codeph>INT</codeph> </p> </dd> </dlentry> - <dlentry audience="hidden" id="sleep"> + <dlentry id="sleep"> <dt> - <codeph>sleep(int ms)</codeph> + SLEEP(INT ms) </dt> <dd> - <indexterm audience="hidden">sleep() function</indexterm> - <b>Purpose:</b> Pauses the query for a specified number of milliseconds. For slowing down queries with - small result sets enough to monitor runtime execution, memory usage, or other factors that otherwise - would be difficult to capture during the brief interval of query execution. When used in the - <codeph>SELECT</codeph> list, it is called once for each row in the result set; adjust the number of - milliseconds accordingly. For example, a query <codeph>SELECT *, sleep(5) FROM - table_with_1000_rows</codeph> would take at least 5 seconds to complete (5 milliseconds * 1000 rows in - result set). To avoid an excessive number of concurrent queries, use this function for troubleshooting on - test and development systems, not for production queries. + <b>Purpose:</b> Pauses the query for a specified number of milliseconds. For slowing + down queries with small result sets enough to monitor runtime execution, memory usage, + or other factors that otherwise would be difficult to capture during the brief + interval of query execution. When used in the <codeph>SELECT</codeph> list, it is + called once for each row in the result set; adjust the number of milliseconds + accordingly. For example, a query <codeph>SELECT *, SLEEP(5) FROM + table_with_1000_rows</codeph> would take at least 5 seconds to complete (5 + milliseconds * 1000 rows in result set). To avoid an excessive number of concurrent + queries, use this function for troubleshooting on test and development systems, not + for production queries. <p> <b>Return type:</b> N/A </p> @@ -122,22 +164,25 @@ under the License. <dlentry rev="1.1" id="user"> <dt> - <codeph>user()</codeph> + USER() </dt> <dd> - <indexterm audience="hidden">user() function</indexterm> - <b>Purpose:</b> Returns the username of the Linux user who is connected to the <cmdname>impalad</cmdname> - daemon. Typically called a single time, in a query without any <codeph>FROM</codeph> clause, to - understand how authorization settings apply in a security context; once you know the logged-in username, - you can check which groups that user belongs to, and from the list of groups you can check which roles - are available to those groups through the authorization policy file. + <b>Purpose:</b> Returns the username of the Linux user who is connected to the + <cmdname>impalad</cmdname> daemon. Typically called a single time, in a query without + any <codeph>FROM</codeph> clause, to understand how authorization settings apply in a + security context; once you know the logged-in username, you can check which groups + that user belongs to, and from the list of groups you can check which roles are + available to those groups through the authorization policy file. <p conref="../shared/impala_common.xml#common/user_kerberized"/> + <p> - When delegation is enabled, consider calling the <codeph>effective_user()</codeph> function instead. + When delegation is enabled, consider calling the <codeph>EFFECTIVE_USER()</codeph> + function instead. </p> + <p> - <b>Return type:</b> <codeph>string</codeph> + <b>Return type:</b> <codeph>STRING</codeph> </p> </dd> @@ -146,25 +191,33 @@ under the License. <dlentry rev="2.5.0 IMPALA-1477" id="uuid"> <dt> - <codeph>uuid()</codeph> + UUID() </dt> <dd> - <indexterm audience="hidden">uuid() function</indexterm> - <b>Purpose:</b> Returns a <xref href="https://en.wikipedia.org/wiki/Universally_unique_identifier" scope="external" format="html">universal unique identifier</xref>, a 128-bit value encoded as a string with groups of hexadecimal digits separated by dashes. + <b>Purpose:</b> Returns a + <xref + href="https://en.wikipedia.org/wiki/Universally_unique_identifier" + scope="external" format="html">universal + unique identifier</xref>, a 128-bit value encoded as a string with groups of + hexadecimal digits separated by dashes. <p> - <b>Return type:</b> <codeph>string</codeph> + <b>Return type:</b> <codeph>STRING</codeph> </p> + <p conref="../shared/impala_common.xml#common/added_in_250"/> + <p conref="../shared/impala_common.xml#common/usage_notes_blurb"/> + <p> - Ascending numeric sequences of type <codeph>BIGINT</codeph> are often used - as identifiers within a table, and as join keys across multiple tables. - The <codeph>uuid()</codeph> value is a convenient alternative that does not - require storing or querying the highest sequence number. For example, you - can use it to quickly construct new unique identifiers during a data import job, - or to combine data from different tables without the likelihood of ID collisions. + Ascending numeric sequences of type <codeph>BIGINT</codeph> are often used as + identifiers within a table, and as join keys across multiple tables. The + <codeph>UUID()</codeph> value is a convenient alternative that does not require + storing or querying the highest sequence number. For example, you can use it to + quickly construct new unique identifiers during a data import job, or to combine + data from different tables without the likelihood of ID collisions. </p> + <p conref="../shared/impala_common.xml#common/example_blurb"/> <codeblock> -- Each call to uuid() produces a new arbitrary value. @@ -194,41 +247,44 @@ select uuid() from four_row_table; <dlentry id="version"> <dt> - <codeph>version()</codeph> + VERSION() </dt> <dd> - <indexterm audience="hidden">version() function</indexterm> - <b>Purpose:</b> Returns information such as the precise version number and build date for the - <codeph>impalad</codeph> daemon that you are currently connected to. Typically used to confirm that you - are connected to the expected level of Impala to use a particular feature, or to connect to several nodes - and confirm they are all running the same level of <cmdname>impalad</cmdname>. + <b>Purpose:</b> Returns information such as the precise version number and build date + for the <codeph>impalad</codeph> daemon that you are currently connected to. Typically + used to confirm that you are connected to the expected level of Impala to use a + particular feature, or to connect to several nodes and confirm they are all running + the same level of <cmdname>impalad</cmdname>. <p> - <b>Return type:</b> <codeph>string</codeph> (with one or more embedded newlines) + <b>Return type:</b> <codeph>STRING</codeph> (with one or more embedded newlines) </p> </dd> </dlentry> + <dlentry id="coordinator" rev="IMPALA-589"> - - <dlentry id="coordinator"> <dt> - <codeph>coordinator()</codeph> + COORDINATOR() </dt> <dd> - <indexterm audience="hidden">coordinator() function</indexterm> <b>Purpose:</b> Returns the name of the host which is running the - <codeph>impalad</codeph> daemon that is acting as the <codeph>coordinator</codeph> - for the curent query. + <codeph>impalad</codeph> daemon that is acting as the <codeph>coordinator</codeph> for + the current query. <p> - <b>Return type:</b> <codeph>string</codeph> + <b>Return type:</b> <codeph>STRING</codeph> + </p> + + <p> + <b>Added in:</b> <keyword keyref="impala31"/> </p> </dd> </dlentry> - </dl> + </conbody> + </concept>
http://git-wip-us.apache.org/repos/asf/impala/blob/e8ee827a/docs/topics/impala_string_functions.xml ---------------------------------------------------------------------- diff --git a/docs/topics/impala_string_functions.xml b/docs/topics/impala_string_functions.xml index ed5d028..a32dbf7 100644 --- a/docs/topics/impala_string_functions.xml +++ b/docs/topics/impala_string_functions.xml @@ -1,4 +1,5 @@ -<?xml version="1.0" encoding="UTF-8"?><!-- +<?xml version="1.0" encoding="UTF-8"?> +<!-- Licensed to the Apache Software Foundation (ASF) under one or more contributor license agreements. See the NOTICE file distributed with this work for additional information @@ -20,7 +21,13 @@ under the License. <concept id="string_functions"> <title>Impala String Functions</title> - <titlealts audience="PDF"><navtitle>String Functions</navtitle></titlealts> + + <titlealts audience="PDF"> + + <navtitle>String Functions</navtitle> + + </titlealts> + <prolog> <metadata> <data name="Category" value="Impala"/> @@ -35,20 +42,21 @@ under the License. <conbody> <p rev="2.0.0"> - String functions are classified as those primarily accepting or returning <codeph>STRING</codeph>, - <codeph>VARCHAR</codeph>, or <codeph>CHAR</codeph> data types, for example to measure the length of a string - or concatenate two strings together. + String functions are classified as those primarily accepting or returning + <codeph>STRING</codeph>, <codeph>VARCHAR</codeph>, or <codeph>CHAR</codeph> data types, + for example to measure the length of a string or concatenate two strings together. <ul> <li> - All the functions that accept <codeph>STRING</codeph> arguments also accept the <codeph>VARCHAR</codeph> - and <codeph>CHAR</codeph> types introduced in Impala 2.0. + All the functions that accept <codeph>STRING</codeph> arguments also accept the + <codeph>VARCHAR</codeph> and <codeph>CHAR</codeph> types introduced in Impala 2.0. </li> <li> - Whenever <codeph>VARCHAR</codeph> or <codeph>CHAR</codeph> values are passed to a function that returns a - string value, the return type is normalized to <codeph>STRING</codeph>. For example, a call to - <codeph>concat()</codeph> with a mix of <codeph>STRING</codeph>, <codeph>VARCHAR</codeph>, and - <codeph>CHAR</codeph> arguments produces a <codeph>STRING</codeph> result. + Whenever <codeph>VARCHAR</codeph> or <codeph>CHAR</codeph> values are passed to a + function that returns a string value, the return type is normalized to + <codeph>STRING</codeph>. For example, a call to <codeph>CONCAT()</codeph> with a mix + of <codeph>STRING</codeph>, <codeph>VARCHAR</codeph>, and <codeph>CHAR</codeph> + arguments produces a <codeph>STRING</codeph> result. </li> </ul> </p> @@ -56,8 +64,9 @@ under the License. <p conref="../shared/impala_common.xml#common/related_info"/> <p> - The string functions operate mainly on these data types: <xref href="impala_string.xml#string"/>, - <xref href="impala_varchar.xml#varchar"/>, and <xref href="impala_char.xml#char"/>. + The string functions operate mainly on these data types: + <xref href="impala_string.xml#string"/>, <xref href="impala_varchar.xml#varchar"/>, and + <xref href="impala_char.xml#char"/>. </p> <p> @@ -68,18 +77,167 @@ under the License. Impala supports the following string functions: </p> + <ul> + <li> + <xref href="#string_functions/ascii">ASCII</xref> + </li> + + <li> + <xref href="#string_functions/base64decode">BASE64DECODE</xref> + </li> + + <li> + <xref href="#string_functions/base64encode">BASE64ENCODE</xref> + </li> + + <li> + <xref href="#string_functions/btrim">BTRIM</xref> + </li> + + <li> + <xref href="#string_functions/char_length">CHAR_LENGTH</xref> + </li> + + <li> + <xref href="#string_functions/chr">CHR</xref> + </li> + + <li> + <xref href="#string_functions/concat">CONCAT</xref> + </li> + + <li> + <xref href="#string_functions/concat_ws">CONCAT_WS</xref> + </li> + + <li> + <xref href="#string_functions/find_in_set">FIND_IN_SET</xref> + </li> + + <li> + <xref href="#string_functions/group_concat">GROUP_CONCAT</xref> + </li> + + <li> + <xref href="#string_functions/initcap">INITCAP</xref> + </li> + + <li> + <xref href="#string_functions/instr">INSTR</xref> + </li> + + <li> + <xref href="#string_functions/left">LEFT</xref> + </li> + + <li> + <xref href="#string_functions/length">LENGTH</xref> + </li> + + <li> + <xref href="#string_functions/locate">LOCATE</xref> + </li> + + <li> + <xref href="#string_functions/lower">LOWER, LCASE</xref> + </li> + + <li> + <xref href="#string_functions/lpad">LPAD</xref> + </li> + + <li> + <xref href="#string_functions/ltrim">LTRI </xref> + </li> + + <li> + <xref href="#string_functions/parse_url">PARSE_URL</xref> + </li> + + <li> + <xref href="#string_functions/regexp_escape">REGEXP_ESCAPE</xref> + </li> + + <li> + <xref href="#string_functions/regexp_extract">REGEXP_EXTRACT</xref> + </li> + + <li> + <xref href="#string_functions/regexp_like">REGEXP_LIKE</xref> + </li> + + <li> + <xref href="#string_functions/regexp_replace">REGEXP_REPLACE</xref> + </li> + + <li> + <xref href="#string_functions/repeat">REPEAT</xref> + </li> + + <li> + <xref href="#string_functions/replace">REPLACE</xref> + </li> + + <li> + <xref href="#string_functions/reverse">REVERSE</xref> + </li> + + <li> + <xref href="#string_functions/right">RIGHT</xref> + </li> + + <li> + <xref href="#string_functions/rpad">RPAD</xref> + </li> + + <li> + <xref href="#string_functions/rtrim">RTRIM</xref> + </li> + + <li> + <xref href="#string_functions/space">SPACE</xref> + </li> + + <li> + <xref href="#string_functions/split_part">SPLIT_PART</xref> + </li> + + <li> + <xref href="#string_functions/strleft">STRLEFT</xref> + </li> + + <li> + <xref href="#string_functions/strright">STRRIGHT</xref> + </li> + + <li> + <xref href="#string_functions/substr">SUBSTR, SUBSTRING</xref> + </li> + + <li> + <xref href="#string_functions/translate">TRANSLATE</xref> + </li> + + <li> + <xref href="#string_functions/trim">TRIM</xref> + </li> + + <li> + <xref href="#string_functions/upper">UPPER, UCASE</xref> + </li> + </ul> + <dl> <dlentry id="ascii"> <dt> - <codeph>ascii(string str)</codeph> + ASCII(STRING str) </dt> <dd> - <indexterm audience="hidden">ascii() function</indexterm> <b>Purpose:</b> Returns the numeric ASCII code of the first character of the argument. <p> - <b>Return type:</b> <codeph>int</codeph> + <b>Return type:</b> <codeph>INT</codeph> </p> </dd> @@ -88,25 +246,33 @@ under the License. <dlentry id="base64decode" rev="2.6.0 IMPALA-2107"> <dt> - <codeph>base64decode(string str)</codeph> + BASE64DECODE(STRING str) </dt> <dd> - <indexterm audience="hidden">base64decode() function</indexterm> <b>Purpose:</b> <p> - <b>Return type:</b> <codeph>string</codeph> + <b>Return type:</b> <codeph>STRING</codeph> </p> + <p conref="../shared/impala_common.xml#common/usage_notes_blurb"/> + <p> For general information about Base64 encoding, see - <xref keyref="base64"/>. + <xref + keyref="base64"/>. </p> + <p conref="../shared/impala_common.xml#common/base64_use_cases"/> + <p conref="../shared/impala_common.xml#common/base64_charset"/> + <p conref="../shared/impala_common.xml#common/base64_alignment"/> + <p conref="../shared/impala_common.xml#common/base64_error_handling"/> + <p conref="../shared/impala_common.xml#common/example_blurb"/> + <p conref="../shared/impala_common.xml#common/base64_examples"/> </dd> @@ -115,24 +281,31 @@ under the License. <dlentry id="base64encode" rev="2.6.0 IMPALA-2107"> <dt> - <codeph>base64encode(string str)</codeph> + BASE64ENCODE(STRING str) </dt> <dd> - <indexterm audience="hidden">base64encode() function</indexterm> <b>Purpose:</b> <p> - <b>Return type:</b> <codeph>string</codeph> + <b>Return type:</b> <codeph>STRING</codeph> </p> + <p conref="../shared/impala_common.xml#common/usage_notes_blurb"/> + <p> For general information about Base64 encoding, see - <xref keyref="base64"/>. + <xref + keyref="base64"/>. </p> + <p conref="../shared/impala_common.xml#common/base64_use_cases"/> + <p conref="../shared/impala_common.xml#common/base64_charset"/> + <p conref="../shared/impala_common.xml#common/base64_alignment"/> + <p conref="../shared/impala_common.xml#common/example_blurb"/> + <p conref="../shared/impala_common.xml#common/base64_examples"/> </dd> @@ -141,32 +314,33 @@ under the License. <dlentry rev="2.3.0" id="btrim"> <dt> - <codeph>btrim(string a)</codeph>, - <codeph>btrim(string a, string chars_to_trim)</codeph> + BTRIM(STRING a), BTRIM(STRING a, STRING chars_to_trim) </dt> <dd> - <indexterm audience="hidden">btrim() function</indexterm> - <b>Purpose:</b> Removes all instances of one or more characters - from the start and end of a <codeph>STRING</codeph> value. - By default, removes only spaces. - If a non-<codeph>NULL</codeph> optional second argument is specified, the function removes all - occurrences of characters in that second argument from the beginning and - end of the string. - <p><b>Return type:</b> <codeph>string</codeph></p> + <b>Purpose:</b> Removes all instances of one or more characters from the start and end + of a <codeph>STRING</codeph> value. By default, removes only spaces. If a + non-<codeph>NULL</codeph> optional second argument is specified, the function removes + all occurrences of characters in that second argument from the beginning and end of + the string. + <p> + <b>Return type:</b> <codeph>STRING</codeph> + </p> + <p conref="../shared/impala_common.xml#common/added_in_230"/> + <p conref="../shared/impala_common.xml#common/example_blurb"/> + <p> - The following examples show the default <codeph>btrim()</codeph> behavior, - and what changes when you specify the optional second argument. - All the examples bracket the output value with <codeph>[ ]</codeph> - so that you can see any leading or trailing spaces in the <codeph>btrim()</codeph> result. - By default, the function removes and number of both leading and trailing spaces. - When the second argument is specified, any number of occurrences of any - character in the second argument are removed from the start and end of the - input string; in this case, spaces are not removed (unless they are part of the second - argument) and any instances of the characters are not removed if they do not come - right at the beginning or end of the string. + The following examples show the default <codeph>BTRIM()</codeph> behavior, and what + changes when you specify the optional second argument. All the examples bracket the + output value with <codeph>[ ]</codeph> so that you can see any leading or trailing + spaces in the <codeph>BTRIM()</codeph> result. By default, the function removes and + number of both leading and trailing spaces. When the second argument is specified, + any number of occurrences of any character in the second argument are removed from + the start and end of the input string; in this case, spaces are not removed (unless + they are part of the second argument) and any instances of the characters are not + removed if they do not come right at the beginning or end of the string. </p> <codeblock>-- Remove multiple spaces before and one space after. select concat('[',btrim(' hello '),']'); @@ -200,26 +374,27 @@ select concat('[',btrim('xyhelxyzlozyzzxx','xyz'),']'); <dlentry rev="1.3.0" id="char_length"> <dt> - <codeph>char_length(string a), <ph rev="1.3.0" id="character_length">character_length(string a)</ph></codeph> + CHAR_LENGTH(STRING a), CHARACTER_LENGTH(STRING a) </dt> <dd rev="IMPALA-6391 IMPALA-2172"> - <indexterm audience="hidden">char_length() function</indexterm> - <indexterm audience="hidden">character_length() function</indexterm> <b>Purpose:</b> Returns the length in characters of the argument string, including any trailing spaces that pad a <codeph>CHAR</codeph> value. <p> - <b>Return type:</b> <codeph>int</codeph> + <b>Return type:</b> <codeph>INT</codeph> </p> + <p conref="../shared/impala_common.xml#common/usage_notes_blurb"/> + <p> - When applied to a <codeph>STRING</codeph> value, it returns the - same result as the <codeph>length()</codeph> function. When applied - to a <codeph>CHAR</codeph> value, it might return a larger value - than <codeph>length()</codeph> does, to account for trailing spaces - in the <codeph>CHAR</codeph>. + When applied to a <codeph>STRING</codeph> value, it returns the same result as the + <codeph>length()</codeph> function. When applied to a <codeph>CHAR</codeph> value, + it might return a larger value than <codeph>length()</codeph> does, to account for + trailing spaces in the <codeph>CHAR</codeph>. </p> + <p conref="../shared/impala_common.xml#common/example_blurb"/> + <p conref="../shared/impala_common.xml#common/length_demo"/> </dd> @@ -228,23 +403,26 @@ select concat('[',btrim('xyhelxyzlozyzzxx','xyz'),']'); <dlentry rev="2.3.0" id="chr"> <dt> - <codeph>chr(int character_code)</codeph> + CHR(INT character_code) </dt> <dd> - <indexterm audience="hidden">chr() function</indexterm> - <b>Purpose:</b> Returns a character specified by a decimal code point value. - The interpretation and display of the resulting character depends on your system locale. - Because consistent processing of Impala string values is only guaranteed - for values within the ASCII range, only use this function for values - corresponding to ASCII characters. - In particular, parameter values greater than 255 return an empty string. - <p><b>Return type:</b> <codeph>string</codeph></p> - <p> - <b>Usage notes:</b> Can be used as the inverse of the <codeph>ascii()</codeph> function, which - converts a character to its numeric ASCII code. + <b>Purpose:</b> Returns a character specified by a decimal code point value. The + interpretation and display of the resulting character depends on your system locale. + Because consistent processing of Impala string values is only guaranteed for values + within the ASCII range, only use this function for values corresponding to ASCII + characters. In particular, parameter values greater than 255 return an empty string. + <p> + <b>Return type:</b> <codeph>STRING</codeph> + </p> + + <p> + <b>Usage notes:</b> Can be used as the inverse of the <codeph>ascii()</codeph> + function, which converts a character to its numeric ASCII code. </p> + <p conref="../shared/impala_common.xml#common/added_in_230"/> + <p conref="../shared/impala_common.xml#common/example_blurb"/> <codeblock>SELECT chr(65); +---------+ @@ -267,15 +445,16 @@ SELECT chr(97); <dlentry id="concat"> <dt> - <codeph>concat(string a, string b...)</codeph> + CONCAT(STRING a, STRING b...) </dt> <dd> - <indexterm audience="hidden">concat() function</indexterm> - <b>Purpose:</b> Returns a single string representing all the argument values joined together. + <b>Purpose:</b> Returns a single string representing all the argument values joined + together. <p> - <b>Return type:</b> <codeph>string</codeph> + <b>Return type:</b> <codeph>STRING</codeph> </p> + <p conref="../shared/impala_common.xml#common/concat_blurb"/> </dd> @@ -284,16 +463,16 @@ SELECT chr(97); <dlentry id="concat_ws"> <dt> - <codeph>concat_ws(string sep, string a, string b...)</codeph> + CONCAT_WS(STRING sep, STRING a, STRING b...) </dt> <dd> - <indexterm audience="hidden">concat_ws() function</indexterm> - <b>Purpose:</b> Returns a single string representing the second and following argument values joined - together, delimited by a specified separator. + <b>Purpose:</b> Returns a single string representing the second and following argument + values joined together, delimited by a specified separator. <p> - <b>Return type:</b> <codeph>string</codeph> + <b>Return type:</b> <codeph>STRING</codeph> </p> + <p conref="../shared/impala_common.xml#common/concat_blurb"/> </dd> @@ -302,16 +481,16 @@ SELECT chr(97); <dlentry id="find_in_set"> <dt> - <codeph>find_in_set(string str, string strList)</codeph> + FIND_IN_SEt(STRING str, STRING strList) </dt> <dd> - <indexterm audience="hidden">find_in_set() function</indexterm> - <b>Purpose:</b> Returns the position (starting from 1) of the first occurrence of a specified string - within a comma-separated string. Returns <codeph>NULL</codeph> if either argument is - <codeph>NULL</codeph>, 0 if the search string is not found, or 0 if the search string contains a comma. + <b>Purpose:</b> Returns the position (starting from 1) of the first occurrence of a + specified string within a comma-separated string. Returns <codeph>NULL</codeph> if + either argument is <codeph>NULL</codeph>, 0 if the search string is not found, or 0 if + the search string contains a comma. <p> - <b>Return type:</b> <codeph>int</codeph> + <b>Return type:</b> <codeph>INT</codeph> </p> </dd> @@ -320,27 +499,30 @@ SELECT chr(97); <dlentry rev="1.2" id="group_concat"> <dt> - <codeph>group_concat(string s [, string sep])</codeph> + GROUP_CONCAT(STRING s [, STRING sep]) </dt> <dd> - <indexterm audience="hidden">group_concat() function</indexterm> - <b>Purpose:</b> Returns a single string representing the argument value concatenated together for each - row of the result set. If the optional separator string is specified, the separator is added between each - pair of concatenated values. + <b>Purpose:</b> Returns a single string representing the argument value concatenated + together for each row of the result set. If the optional separator string is + specified, the separator is added between each pair of concatenated values. <p> - <b>Return type:</b> <codeph>string</codeph> + <b>Return type:</b> <codeph>STRING</codeph> </p> + <p conref="../shared/impala_common.xml#common/concat_blurb"/> + <p> - By default, returns a single string covering the whole result set. To include other columns or values - in the result set, or to produce multiple concatenated strings for subsets of rows, include a - <codeph>GROUP BY</codeph> clause in the query. + By default, returns a single string covering the whole result set. To include other + columns or values in the result set, or to produce multiple concatenated strings for + subsets of rows, include a <codeph>GROUP BY</codeph> clause in the query. </p> + <p> - Strictly speaking, <codeph>group_concat()</codeph> is an aggregate function, not a scalar - function like the others in this list. - For additional details and examples, see <xref href="impala_group_concat.xml#group_concat"/>. + Strictly speaking, <codeph>GROUP_CONCAT()</codeph> is an aggregate function, not a + scalar function like the others in this list. For additional details and examples, + see <xref + href="impala_group_concat.xml#group_concat"/>. </p> </dd> @@ -349,14 +531,13 @@ SELECT chr(97); <dlentry rev="1.2" id="initcap"> <dt> - <codeph>initcap(string str)</codeph> + INITCAP(STRING str) </dt> <dd> - <indexterm audience="hidden">initcap() function</indexterm> <b>Purpose:</b> Returns the input string with the first letter capitalized. <p> - <b>Return type:</b> <codeph>string</codeph> + <b>Return type:</b> <codeph>STRING</codeph> </p> </dd> @@ -365,22 +546,22 @@ SELECT chr(97); <dlentry id="instr"> <dt> - <codeph>instr(string str, string substr <ph rev="IMPALA-3973">[, bigint position [, bigint occurrence ] ]</ph>)</codeph> + INSTR(STRING str, STRING substr <ph rev="IMPALA-3973">[, BIGINT position [, BIGINT + occurrence ] ]</ph>) </dt> <dd> - <indexterm audience="hidden">instr() function</indexterm> - <b>Purpose:</b> Returns the position (starting from 1) of the first occurrence of a substring within a - longer string. + <b>Purpose:</b> Returns the position (starting from 1) of the first occurrence of a + substring within a longer string. <p> - <b>Return type:</b> <codeph>int</codeph> + <b>Return type:</b> <codeph>INT</codeph> </p> + <p conref="../shared/impala_common.xml#common/usage_notes_blurb"/> <p> If the substring is not present in the string, the function returns 0: </p> - <codeblock rev="IMPALA-3973 2.8.0"> select instr('foo bar bletch', 'z'); +------------------------------+ @@ -389,7 +570,6 @@ select instr('foo bar bletch', 'z'); | 0 | +------------------------------+ </codeblock> - <p rev="IMPALA-3973 2.8.0"> The optional third and fourth arguments let you find instances of the substring other than the first instance starting from the left: @@ -397,10 +577,9 @@ select instr('foo bar bletch', 'z'); <ul> <li> <p> - The third argument lets you specify a starting point within the string - other than 1: + The third argument lets you specify a starting point within the string other + than 1: </p> - <codeblock> -- Restricting the search to positions 7..end, -- the first occurrence of 'b' is at position 9. @@ -420,13 +599,11 @@ select instr('foo bar bletch', 'b', 10); | 0 | +----------------------------------+ </codeblock> - <p> - If the third argument is negative, the search works right-to-left - starting that many characters from the right. The return value still - represents the position starting from the left side of the string. + If the third argument is negative, the search works right-to-left starting that + many characters from the right. The return value still represents the position + starting from the left side of the string. </p> - <codeblock rev="IMPALA-3973 2.8.0"> -- Scanning right to left, the first occurrence of 'o' -- is at position 8. (8th character from the left.) @@ -456,14 +633,12 @@ select instr('hello world','o',-10); | 0 | +--------------------------------+ </codeblock> - </li> <li> <p> The fourth argument lets you specify an occurrence other than the first: </p> - <codeblock rev="IMPALA-3973 2.8.0"> -- 2nd occurrence of 'b' is at position 9. select instr('foo bar bletch', 'b', 1, 2); @@ -482,12 +657,10 @@ select instr('foo bar bletch', 'b', -1, 2); | 5 | +-------------------------------------+ </codeblock> - <p> - If the fourth argument is greater than the number of matching occurrences, - the function returns 0: + If the fourth argument is greater than the number of matching occurrences, the + function returns 0: </p> - <codeblock> -- There is no 3rd occurrence within the string. select instr('foo bar bletch', 'b', 1, 3); @@ -506,12 +679,10 @@ select instr('foo bar bletch', 'b', 10, 1); | 0 | +-------------------------------------+ </codeblock> - <p> - The fourth argument cannot be negative or zero. A non-positive value for - this argument causes an error: + The fourth argument cannot be negative or zero. A non-positive value for this + argument causes an error: </p> - <codeblock> select instr('foo bar bletch', 'b', 1, 0); ERROR: UDF ERROR: Invalid occurrence parameter to instr function: 0 @@ -519,15 +690,13 @@ ERROR: UDF ERROR: Invalid occurrence parameter to instr function: 0 select instr('aaaaaaaaa','aa', 1, -1); ERROR: UDF ERROR: Invalid occurrence parameter to instr function: -1 </codeblock> - </li> <li> <p> - If either of the optional arguments is <codeph>NULL</codeph>, - the function also returns <codeph>NULL</codeph>: + If either of the optional arguments is <codeph>NULL</codeph>, the function also + returns <codeph>NULL</codeph>: </p> - <codeblock> select instr('foo bar bletch', 'b', null); +------------------------------------+ @@ -544,51 +713,56 @@ select instr('foo bar bletch', 'b', 1, null); +---------------------------------------+ </codeblock> </li> - </ul> - </dd> </dlentry> + <dlentry id="left"> + <dt> - <codeph>left(string a, int num_chars)</codeph> + LEFT(STRING a, INT num_chars) </dt> + <dd> - See the <codeph>strleft</codeph> function. + See the <codeph>STRLEFT()</codeph> function. </dd> + </dlentry> <dlentry id="length"> <dt> - <codeph>length(string a)</codeph> + LENGTH(STRING a) </dt> <dd rev="IMPALA-6391 IMPALA-2172"> - <indexterm audience="hidden">length() function</indexterm> - <b>Purpose:</b> Returns the length in characters of the argument string, - ignoring any trailing spaces in <codeph>CHAR</codeph> values. + <b>Purpose:</b> Returns the length in characters of the argument string, ignoring any + trailing spaces in <codeph>CHAR</codeph> values. <p> - <b>Return type:</b> <codeph>int</codeph> + <b>Return type:</b> <codeph>INT</codeph> </p> + <p conref="../shared/impala_common.xml#common/usage_notes_blurb"/> + <p> - When applied to a <codeph>STRING</codeph> value, it returns the - same result as the <codeph>char_length()</codeph> function. When applied - to a <codeph>CHAR</codeph> value, it might return a smaller value - than <codeph>char_length()</codeph> does, because <codeph>length()</codeph> - ignores any trailing spaces in the <codeph>CHAR</codeph>. + When applied to a <codeph>STRING</codeph> value, it returns the same result as the + <codeph>CHAR_LENGTH()</codeph> function. When applied to a <codeph>CHAR</codeph> + value, it might return a smaller value than <codeph>CHAR_LENGTH()</codeph> does, + because <codeph>LENGTH()</codeph> ignores any trailing spaces in the + <codeph>CHAR</codeph>. </p> + <note> - Because the behavior of <codeph>length()</codeph> with <codeph>CHAR</codeph> - values containing trailing spaces is not standardized across the industry, - when porting code from other database systems, evaluate the behavior of - <codeph>length()</codeph> on the source system and switch to - <codeph>char_length()</codeph> for Impala if necessary. + Because the behavior of <codeph>LENGTH()</codeph> with <codeph>CHAR</codeph> values + containing trailing spaces is not standardized across the industry, when porting + code from other database systems, evaluate the behavior of <codeph>LENGTH()</codeph> + on the source system and switch to <codeph>CHAR_LENGTH()</codeph> for Impala if + necessary. </note> <p conref="../shared/impala_common.xml#common/example_blurb"/> + <p conref="../shared/impala_common.xml#common/length_demo"/> </dd> @@ -597,15 +771,14 @@ select instr('foo bar bletch', 'b', 1, null); <dlentry id="locate"> <dt> - <codeph>locate(string substr, string str[, int pos])</codeph> + LOCATE(STRING substr, STRING str[, INT pos]) </dt> <dd> - <indexterm audience="hidden">locate() function</indexterm> - <b>Purpose:</b> Returns the position (starting from 1) of the first occurrence of a substring within a - longer string, optionally after a particular position. + <b>Purpose:</b> Returns the position (starting from 1) of the first occurrence of a + substring within a longer string, optionally after a particular position. <p> - <b>Return type:</b> <codeph>int</codeph> + <b>Return type:</b> <codeph>INT</codeph> </p> </dd> @@ -614,17 +787,20 @@ select instr('foo bar bletch', 'b', 1, null); <dlentry id="lower"> <dt> - <codeph>lower(string a), <ph id="lcase">lcase(string a)</ph> </codeph> + LOWER(STRING a), <ph id="lcase">LCASE(STRING a)</ph> </dt> <dd> - <indexterm audience="hidden">lower() function</indexterm> <b>Purpose:</b> Returns the argument string converted to all-lowercase. <p> - <b>Return type:</b> <codeph>string</codeph> + <b>Return type:</b> <codeph>STRING</codeph> </p> + <p conref="../shared/impala_common.xml#common/usage_notes_blurb"/> - <p conref="../shared/impala_common.xml#common/case_insensitive_comparisons_tip"/> + + <p + conref="../shared/impala_common.xml#common/case_insensitive_comparisons_tip" + /> </dd> </dlentry> @@ -632,16 +808,16 @@ select instr('foo bar bletch', 'b', 1, null); <dlentry id="lpad"> <dt> - <codeph>lpad(string str, int len, string pad)</codeph> + LPAD(STRING str, INT len, STRING pad) </dt> <dd> - <indexterm audience="hidden">lpad() function</indexterm> - <b>Purpose:</b> Returns a string of a specified length, based on the first argument string. If the - specified string is too short, it is padded on the left with a repeating sequence of the characters from - the pad string. If the specified string is too long, it is truncated on the right. + <b>Purpose:</b> Returns a string of a specified length, based on the first argument + string. If the specified string is too short, it is padded on the left with a + repeating sequence of the characters from the pad string. If the specified string is + too long, it is truncated on the right. <p> - <b>Return type:</b> <codeph>string</codeph> + <b>Return type:</b> <codeph>STRING</codeph> </p> </dd> @@ -650,16 +826,15 @@ select instr('foo bar bletch', 'b', 1, null); <dlentry id="ltrim"> <dt> - <codeph>ltrim(string a [, string chars_to_trim])</codeph> + LTRIM(STRING a [, STRING chars_to_trim]) </dt> <dd> - <indexterm audience="hidden">ltrim() function</indexterm> - <b>Purpose:</b> Returns the argument string with all occurrences - of characters specified by the second argument removed from - the left side. Removes spaces if the second argument is not specified. + <b>Purpose:</b> Returns the argument string with all occurrences of characters + specified by the second argument removed from the left side. Removes spaces if the + second argument is not specified. <p> - <b>Return type:</b> <codeph>string</codeph> + <b>Return type:</b> <codeph>STRING</codeph> </p> </dd> @@ -668,26 +843,29 @@ select instr('foo bar bletch', 'b', 1, null); <dlentry id="parse_url"> <dt> - <codeph>parse_url(string urlString, string partToExtract [, string keyToExtract])</codeph> + PARSE_URL(STRING urlString, STRING partToExtract [, STRING keyToExtract]) </dt> <dd> <indexterm audience="hidden">parse_url() function</indexterm> - <b>Purpose:</b> Returns the portion of a URL corresponding to a specified part. The part argument can be - <codeph>'PROTOCOL'</codeph>, <codeph>'HOST'</codeph>, <codeph>'PATH'</codeph>, <codeph>'REF'</codeph>, - <codeph>'AUTHORITY'</codeph>, <codeph>'FILE'</codeph>, <codeph>'USERINFO'</codeph>, or - <codeph>'QUERY'</codeph>. Uppercase is required for these literal values. When requesting the - <codeph>QUERY</codeph> portion of the URL, you can optionally specify a key to retrieve just the - associated value from the key-value pairs in the query string. + <b>Purpose:</b> Returns the portion of a URL corresponding to a specified part. The + part argument can be <codeph>'PROTOCOL'</codeph>, <codeph>'HOST'</codeph>, + <codeph>'PATH'</codeph>, <codeph>'REF'</codeph>, <codeph>'AUTHORITY'</codeph>, + <codeph>'FILE'</codeph>, <codeph>'USERINFO'</codeph>, or <codeph>'QUERY'</codeph>. + Uppercase is required for these literal values. When requesting the + <codeph>QUERY</codeph> portion of the URL, you can optionally specify a key to + retrieve just the associated value from the key-value pairs in the query string. <p> - <b>Return type:</b> <codeph>string</codeph> + <b>Return type:</b> <codeph>STRING</codeph> </p> + <p> - <b>Usage notes:</b> This function is important for the traditional Hadoop use case of interpreting web - logs. For example, if the web traffic data features raw URLs not divided into separate table columns, - you can count visitors to a particular page by extracting the <codeph>'PATH'</codeph> or - <codeph>'FILE'</codeph> field, or analyze search terms by extracting the corresponding key from the - <codeph>'QUERY'</codeph> field. + <b>Usage notes:</b> This function is important for the traditional Hadoop use case + of interpreting web logs. For example, if the web traffic data features raw URLs not + divided into separate table columns, you can count visitors to a particular page by + extracting the <codeph>'PATH'</codeph> or <codeph>'FILE'</codeph> field, or analyze + search terms by extracting the corresponding key from the <codeph>'QUERY'</codeph> + field. </p> </dd> @@ -696,26 +874,27 @@ select instr('foo bar bletch', 'b', 1, null); <dlentry rev="3.0" id="regexp_escape"> <dt> - <codeph>regexp_escape(string source)</codeph> + REGEXP_ESCAPE(STRING source) </dt> <dd> - <b>Purpose:</b> The <codeph>regexp_escape</codeph> function returns - a string escaped for the special character in RE2 library so that the - special characters are interpreted literally rather than as special - characters. The following special characters are escaped by the - function: + <b>Purpose:</b> The <codeph>REGEXP_ESCAPE()</codeph> function returns a string escaped + for the special character in RE2 library so that the special characters are + interpreted literally rather than as special characters. The following special + characters are escaped by the function: <codeblock><![CDATA[.\+*?[^]$(){}=!<>|:-]]></codeblock> - <p> - <b>Return type:</b> - <codeph>string</codeph> + <b>Return type:</b> <codeph>STRING</codeph> </p> <p conref="../shared/impala_common.xml#common/regexp_re2"/> + <p conref="../shared/impala_common.xml#common/regexp_re2_warning"/> + <p conref="../shared/impala_common.xml#common/regexp_escapes"/> + <p conref="../shared/impala_common.xml#common/example_blurb"/> + <p> This example shows escaping one of special characters in RE2. </p> @@ -744,24 +923,28 @@ select instr('foo bar bletch', 'b', 1, null); <dlentry id="regexp_extract"> <dt> - <codeph>regexp_extract(string subject, string pattern, int index)</codeph> + REGEXP_EXTRACT(STRING subject, STRING pattern, INT index) </dt> <dd> - <indexterm audience="hidden">regexp_extract() function</indexterm> - <b>Purpose:</b> Returns the specified () group from a string based on a regular expression pattern. Group - 0 refers to the entire extracted string, while group 1, 2, and so on refers to the first, second, and so - on <codeph>(...)</codeph> portion. + <b>Purpose:</b> Returns the specified () group from a string based on a regular + expression pattern. Group 0 refers to the entire extracted string, while group 1, 2, + and so on refers to the first, second, and so on <codeph>(...)</codeph> portion. <p> - <b>Return type:</b> <codeph>string</codeph> + <b>Return type:</b> <codeph>STRING</codeph> </p> + <p conref="../shared/impala_common.xml#common/regexp_re2"/> + <p conref="../shared/impala_common.xml#common/regexp_re2_warning"/> + <p conref="../shared/impala_common.xml#common/regexp_escapes"/> + <p conref="../shared/impala_common.xml#common/example_blurb"/> + <p> - This example shows how group 0 matches the full pattern string, including the portion outside any - <codeph>()</codeph> group: + This example shows how group 0 matches the full pattern string, including the + portion outside any <codeph>()</codeph> group: </p> <codeblock>[localhost:21000] > select regexp_extract('abcdef123ghi456jkl','.*?(\\d+)',0); +------------------------------------------------------+ @@ -771,8 +954,8 @@ select instr('foo bar bletch', 'b', 1, null); +------------------------------------------------------+ Returned 1 row(s) in 0.11s</codeblock> <p> - This example shows how group 1 matches just the contents inside the first <codeph>()</codeph> group in - the pattern string: + This example shows how group 1 matches just the contents inside the first + <codeph>()</codeph> group in the pattern string: </p> <codeblock>[localhost:21000] > select regexp_extract('abcdef123ghi456jkl','.*?(\\d+)',1); +------------------------------------------------------+ @@ -782,12 +965,13 @@ Returned 1 row(s) in 0.11s</codeblock> +------------------------------------------------------+ Returned 1 row(s) in 0.11s</codeblock> <p rev="2.0.0"> - Unlike in earlier Impala releases, the regular expression library used in Impala 2.0 and later supports - the <codeph>.*?</codeph> idiom for non-greedy matches. This example shows how a pattern string starting - with <codeph>.*?</codeph> matches the shortest possible portion of the source string, returning the - rightmost set of lowercase letters. A pattern string both starting and ending with <codeph>.*?</codeph> - finds two potential matches of equal length, and returns the first one found (the leftmost set of - lowercase letters). + Unlike in earlier Impala releases, the regular expression library used in Impala 2.0 + and later supports the <codeph>.*?</codeph> idiom for non-greedy matches. This + example shows how a pattern string starting with <codeph>.*?</codeph> matches the + shortest possible portion of the source string, returning the rightmost set of + lowercase letters. A pattern string both starting and ending with + <codeph>.*?</codeph> finds two potential matches of equal length, and returns the + first one found (the leftmost set of lowercase letters). </p> <codeblock>[localhost:21000] > select regexp_extract('AbcdBCdefGHI','.*?([[:lower:]]+)',1); +--------------------------------------------------------+ @@ -809,48 +993,58 @@ Returned 1 row(s) in 0.11s</codeblock> <dlentry rev="2.3.0" id="regexp_like"> <dt> - <codeph>regexp_like(string source, string pattern[, string options])</codeph> + REGEXP_LIKE(STRING source, STRING pattern[, STRING options]) </dt> <dd> - <indexterm audience="hidden">regexp_like() function</indexterm> <b>Purpose:</b> Returns <codeph>true</codeph> or <codeph>false</codeph> to indicate - whether the source string contains anywhere inside it the regular expression given by the pattern. - The optional third argument consists of letter flags that change how the match is performed, - such as <codeph>i</codeph> for case-insensitive matching. - <p conref="../shared/impala_common.xml#common/syntax_blurb"/> + whether the source string contains anywhere inside it the regular expression given by + the pattern. The optional third argument consists of letter flags that change how the + match is performed, such as <codeph>i</codeph> for case-insensitive matching. + <p + conref="../shared/impala_common.xml#common/syntax_blurb"/> + <p> The flags that you can include in the optional third argument are: </p> <ul> - <li> - <codeph>c</codeph>: Case-sensitive matching (the default). - </li> - <li> - <codeph>i</codeph>: Case-insensitive matching. If multiple instances of <codeph>c</codeph> and <codeph>i</codeph> - are included in the third argument, the last such option takes precedence. - </li> - <li> - <codeph>m</codeph>: Multi-line matching. The <codeph>^</codeph> and <codeph>$</codeph> - operators match the start or end of any line within the source string, not the - start and end of the entire string. - </li> - <li> - <codeph>n</codeph>: Newline matching. The <codeph>.</codeph> operator can match the - newline character. A repetition operator such as <codeph>.*</codeph> can - match a portion of the source string that spans multiple lines. - </li> + <li> + <codeph>c</codeph>: Case-sensitive matching (the default). + </li> + + <li> + <codeph>i</codeph>: Case-insensitive matching. If multiple instances of + <codeph>c</codeph> and <codeph>i</codeph> are included in the third argument, the + last such option takes precedence. + </li> + + <li> + <codeph>m</codeph>: Multi-line matching. The <codeph>^</codeph> and + <codeph>$</codeph> operators match the start or end of any line within the source + string, not the start and end of the entire string. + </li> + + <li> + <codeph>n</codeph>: Newline matching. The <codeph>.</codeph> operator can match + the newline character. A repetition operator such as <codeph>.*</codeph> can match + a portion of the source string that spans multiple lines. + </li> </ul> <p> - <b>Return type:</b> <codeph>boolean</codeph> + <b>Return type:</b> <codeph>BOOLEAN</codeph> </p> + <p conref="../shared/impala_common.xml#common/regexp_re2"/> + <p conref="../shared/impala_common.xml#common/regexp_re2_warning"/> + <p conref="../shared/impala_common.xml#common/regexp_escapes"/> + <p conref="../shared/impala_common.xml#common/example_blurb"/> + <p> - This example shows how <codeph>regexp_like()</codeph> can test for the existence - of various kinds of regular expression patterns within a source string: + This example shows how <codeph>regexp_like()</codeph> can test for the existence of + various kinds of regular expression patterns within a source string: </p> <codeblock><![CDATA[ -- Matches because the 'f' appears somewhere in 'foo'. @@ -915,26 +1109,31 @@ select regexp_like('foooooobar','fx*y*o*b'); <dlentry id="regexp_replace"> <dt> - <codeph>regexp_replace(string initial, string pattern, string replacement)</codeph> + REGEXP_REPLACE(STRING initial, STRING pattern, STRING replacement) </dt> <dd> - <indexterm audience="hidden">regexp_replace() function</indexterm> - <b>Purpose:</b> Returns the initial argument with the regular expression pattern replaced by the final - argument string. + <b>Purpose:</b> Returns the initial argument with the regular expression pattern + replaced by the final argument string. <p> - <b>Return type:</b> <codeph>string</codeph> + <b>Return type:</b> <codeph>STRING</codeph> </p> + <p conref="../shared/impala_common.xml#common/regexp_re2"/> + <p conref="../shared/impala_common.xml#common/regexp_re2_warning"/> + <p conref="../shared/impala_common.xml#common/regexp_escapes"/> + <p conref="../shared/impala_common.xml#common/example_blurb"/> + <p> - These examples show how you can replace parts of a string matching a pattern with replacement text, - which can include backreferences to any <codeph>()</codeph> groups in the pattern string. The - backreference numbers start at 1, and any <codeph>\</codeph> characters must be escaped as - <codeph>\\</codeph>. + These examples show how you can replace parts of a string matching a pattern with + replacement text, which can include backreferences to any <codeph>()</codeph> groups + in the pattern string. The backreference numbers start at 1, and any + <codeph>\</codeph> characters must be escaped as <codeph>\\</codeph>. </p> + <p> Replace a character pattern with new text: </p> @@ -946,7 +1145,8 @@ select regexp_like('foooooobar','fx*y*o*b'); +------------------------------------------+ Returned 1 row(s) in 0.11s</codeblock> <p> - Replace a character pattern with substitution text that includes the original matching text: + Replace a character pattern with substitution text that includes the original + matching text: </p> <codeblock>[localhost:21000] > select regexp_replace('aaabbbaaa','(b+)','<\\1>'); +----------------------------------------------+ @@ -972,14 +1172,13 @@ Returned 1 row(s) in 0.12s</codeblock> <dlentry id="repeat"> <dt> - <codeph>repeat(string str, int n)</codeph> + REPEAT(STRING str, INT n) </dt> <dd> - <indexterm audience="hidden">repeat() function</indexterm> <b>Purpose:</b> Returns the argument string repeated a specified number of times. <p> - <b>Return type:</b> <codeph>string</codeph> + <b>Return type:</b> <codeph>STRING</codeph> </p> </dd> @@ -988,33 +1187,39 @@ Returned 1 row(s) in 0.12s</codeblock> <dlentry id="replace" rev="2.9.0 IMPALA-4729"> <dt> - <codeph>replace(string initial, string target, string replacement)</codeph> + REPLACE(STRING initial, STRING target, STRING replacement) </dt> <dd> - <indexterm audience="hidden">replace() function</indexterm> <b>Purpose:</b> Returns the initial argument with all occurrences of the target string replaced by the replacement string. <p> - <b>Return type:</b> <codeph>string</codeph> + <b>Return type:</b> <codeph>STRING</codeph> </p> + <p conref="../shared/impala_common.xml#common/usage_notes_blurb"/> + <p> - Because this function does not use any regular expression patterns, it is typically faster - than <codeph>regexp_replace()</codeph> for simple string substitutions. + Because this function does not use any regular expression patterns, it is typically + faster than <codeph>regexp_replace()</codeph> for simple string substitutions. </p> + <p> If any argument is <codeph>NULL</codeph>, the return value is <codeph>NULL</codeph>. </p> + <p> Matching is case-sensitive. </p> + <p> - If the replacement string contains another instance of the target - string, the expansion is only performed once, instead of - applying again to the newly constructed string. + If the replacement string contains another instance of the target string, the + expansion is only performed once, instead of applying again to the newly constructed + string. </p> + <p conref="../shared/impala_common.xml#common/added_in_290"/> + <p conref="../shared/impala_common.xml#common/example_blurb"/> <codeblock>-- Replace one string with another. select replace('hello world','world','earth'); @@ -1047,40 +1252,43 @@ select replace('hello world','xyz','abc'); <dlentry id="reverse"> <dt> - <codeph>reverse(string a)</codeph> + REVERSE(STRING a) </dt> <dd> - <indexterm audience="hidden">reverse() function</indexterm> <b>Purpose:</b> Returns the argument string with characters in reversed order. <p> - <b>Return type:</b> <codeph>string</codeph> + <b>Return type:</b> <codeph>STRING</codeph> </p> </dd> </dlentry> + <dlentry id="right"> + <dt> - <codeph>right(string a, int num_chars)</codeph> + RIGHT(STRING a, INT num_chars) </dt> + <dd> - See the <codeph>strright</codeph> function. + See the <codeph>STRRIGHT()</codeph> function. </dd> + </dlentry> <dlentry id="rpad"> <dt> - <codeph>rpad(string str, int len, string pad)</codeph> + RPAD(STRING str, INT len, STRING pad) </dt> <dd> - <indexterm audience="hidden">rpad() function</indexterm> - <b>Purpose:</b> Returns a string of a specified length, based on the first argument string. If the - specified string is too short, it is padded on the right with a repeating sequence of the characters from - the pad string. If the specified string is too long, it is truncated on the right. + <b>Purpose:</b> Returns a string of a specified length, based on the first argument + string. If the specified string is too short, it is padded on the right with a + repeating sequence of the characters from the pad string. If the specified string is + too long, it is truncated on the right. <p> - <b>Return type:</b> <codeph>string</codeph> + <b>Return type:</b> <codeph>STRING</codeph> </p> </dd> @@ -1089,16 +1297,15 @@ select replace('hello world','xyz','abc'); <dlentry id="rtrim"> <dt> - <codeph>rtrim(string a [, string chars_to_trim])</codeph> + RTRIM(STRING a [, STRING chars_to_trim]) </dt> <dd> - <indexterm audience="hidden">rtrim() function</indexterm> - <b>Purpose:</b> Returns the argument string with all occurrences - of characters specified by the second argument removed from - the right side. Removes spaces if the second argument is not specified. + <b>Purpose:</b> Returns the argument string with all occurrences of characters + specified by the second argument removed from the right side. Removes spaces if the + second argument is not specified. <p> - <b>Return type:</b> <codeph>string</codeph> + <b>Return type:</b> <codeph>STRING</codeph> </p> </dd> @@ -1107,15 +1314,14 @@ select replace('hello world','xyz','abc'); <dlentry id="space"> <dt> - <codeph>space(int n)</codeph> + SPACE(INT n) </dt> <dd> - <indexterm audience="hidden">space() function</indexterm> - <b>Purpose:</b> Returns a concatenated string of the specified number of spaces. Shorthand for - <codeph>repeat(' ',<varname>n</varname>)</codeph>. + <b>Purpose:</b> Returns a concatenated string of the specified number of spaces. + Shorthand for <codeph>REPEAT(' ',<varname>n</varname>)</codeph>. <p> - <b>Return type:</b> <codeph>string</codeph> + <b>Return type:</b> <codeph>STRING</codeph> </p> </dd> @@ -1124,25 +1330,28 @@ select replace('hello world','xyz','abc'); <dlentry rev="2.3.0 IMPALA-2084" id="split_part"> <dt> - <codeph>split_part(string source, string delimiter, bigint n)</codeph> + SPLIT_PART(STRING source, STRING delimiter, BIGINT n) </dt> <dd> - <indexterm audience="hidden">split_part() function</indexterm> - <b>Purpose:</b> Returns the nth field within a delimited string. The - fields are numbered starting from 1. The delimiter can consist of - multiple characters, not just a single character. All matching of the - delimiter is done exactly, not using any regular expression patterns. + <b>Purpose:</b> Returns the nth field within a delimited string. The fields are + numbered starting from 1. The delimiter can consist of multiple characters, not just a + single character. All matching of the delimiter is done exactly, not using any regular + expression patterns. <p> - <b>Return type:</b> <codeph>string</codeph> + <b>Return type:</b> <codeph>STRING</codeph> </p> + <p conref="../shared/impala_common.xml#common/regexp_re2"/> + <p conref="../shared/impala_common.xml#common/regexp_re2_warning"/> + <p conref="../shared/impala_common.xml#common/regexp_escapes"/> + <p conref="../shared/impala_common.xml#common/example_blurb"/> + <p> - These examples show how to retrieve the nth field from a delimited - string: + These examples show how to retrieve the nth field from a delimited string: </p> <codeblock><![CDATA[ select split_part('x,y,z',',',1); @@ -1168,10 +1377,10 @@ select split_part('x,y,z',',',3); ]]> </codeblock> <p> - These examples show what happens for out-of-range field positions. - Specifying a value less than 1 produces an error. Specifying a value - greater than the number of fields returns a zero-length string - (which is not the same as <codeph>NULL</codeph>). + These examples show what happens for out-of-range field positions. Specifying a + value less than 1 produces an error. Specifying a value greater than the number of + fields returns a zero-length string (which is not the same as + <codeph>NULL</codeph>). </p> <codeblock><![CDATA[ select split_part('x,y,z',',',0); @@ -1216,32 +1425,30 @@ select split_part('one\|/two\|/three','\|/',3); <dlentry id="strleft"> <dt> - <codeph>strleft(string a, int num_chars)</codeph> + STRLEFT(STRING a, INT num_chars) </dt> <dd> - <indexterm audience="hidden">strleft() function</indexterm> <b>Purpose:</b> Returns the leftmost characters of the string. Shorthand for a call to - <codeph>substr()</codeph> with 2 arguments. + <codeph>SUBSTR()</codeph> with 2 arguments. <p> - <b>Return type:</b> <codeph>string</codeph> + <b>Return type:</b> <codeph>STRING</codeph> </p> - </dd> + </dlentry> <dlentry id="strright"> <dt> - <codeph>strright(string a, int num_chars)</codeph> + STRRIGHT(STRING a, INT num_chars) </dt> <dd> - <indexterm audience="hidden">strright() function</indexterm> - <b>Purpose:</b> Returns the rightmost characters of the string. Shorthand for a call to - <codeph>substr()</codeph> with 2 arguments. + <b>Purpose:</b> Returns the rightmost characters of the string. Shorthand for a call + to <codeph>SUBSTR()</codeph> with 2 arguments. <p> - <b>Return type:</b> <codeph>string</codeph> + <b>Return type:</b> <codeph>STRING</codeph> </p> </dd> @@ -1250,16 +1457,17 @@ select split_part('one\|/two\|/three','\|/',3); <dlentry id="substr"> <dt> - <codeph>substr(string a, int start [, int len]), <ph id="substring">substring(string a, int start [, int - len])</ph></codeph> + SUBSTR(STRING a, INT start [, INT len]), + <ph id="substring" + >SUBSTRING(STRING a, INT start [, INT len])</ph> </dt> <dd> - <indexterm audience="hidden">substr() function</indexterm> - <b>Purpose:</b> Returns the portion of the string starting at a specified point, optionally with a - specified maximum length. The characters in the string are indexed starting at 1. + <b>Purpose:</b> Returns the portion of the string starting at a specified point, + optionally with a specified maximum length. The characters in the string are indexed + starting at 1. <p> - <b>Return type:</b> <codeph>string</codeph> + <b>Return type:</b> <codeph>STRING</codeph> </p> </dd> @@ -1268,43 +1476,65 @@ select split_part('one\|/two\|/three','\|/',3); <dlentry id="translate"> <dt> - <codeph>translate(string input, string from, string to)</codeph> + TRANSLATE(STRING input, STRING from, STRING to) </dt> <dd> - <b>Purpose:</b> Returns the <codeph>input</codeph> string with each - character in the <codeph>from</codeph> argument replaced with the - corresponding character in the <codeph>to</codeph> argument. The - characters are matched in the order they appear in - <codeph>from</codeph> and <codeph>to</codeph>. <p> For example: - <codeph>translate ('hello world','world','earth')</codeph> returns - <codeph>'hetta earth'</codeph>. </p> + <b>Purpose:</b> Returns the <codeph>input</codeph> string with each character in the + <codeph>from</codeph> argument replaced with the corresponding character in the + <codeph>to</codeph> argument. The characters are matched in the order they appear in + <codeph>from</codeph> and <codeph>to</codeph>. + <p> + For example: <codeph>translate ('hello world','world','earth')</codeph> returns + <codeph>'hetta earth'</codeph>. + </p> + <p> - <b>Return type:</b> - <codeph>string</codeph> + <b>Return type:</b> <codeph>STRING</codeph> </p> + <p> <b>Usage notes:</b> </p> - <p> If <codeph>from</codeph> contains more characters than - <codeph>to</codeph>, the <codeph>from</codeph> characters that are - beyond the length of <codeph>to</codeph> are removed in the result. </p> - <p> For example: </p> - <p> - <codeph>translate('abcdedg', 'bcd', '1')</codeph> returns - <codeph>'a1eg'</codeph>. </p> - <p><codeph>translate('Unit Number#2', '# ', '_')</codeph> returns - <codeph>'UnitNumber_2'</codeph>. </p> - <p> If <codeph>from</codeph> is <codeph>NULL</codeph>, the function - returns <codeph>NULL</codeph>. </p> - <p> If <codeph>to</codeph> contains more characters than - <codeph>from</codeph>, the extra characters in <codeph>to</codeph> - are ignored. </p> - <p> If <codeph>from</codeph> contains duplicate characters, the - duplicate character is replaced with the first matching character in - <codeph>to</codeph>. </p> - <p> For example: <codeph>translate ('hello','ll','67')</codeph> - returns <codeph>'he66o'</codeph>. </p> + + <p> + If <codeph>from</codeph> contains more characters than <codeph>to</codeph>, the + <codeph>from</codeph> characters that are beyond the length of <codeph>to</codeph> + are removed in the result. + </p> + + <p> + For example: + </p> + + <p> + <codeph>translate('abcdedg', 'bcd', '1')</codeph> returns <codeph>'a1eg'</codeph>. + </p> + + <p> + <codeph>translate('Unit Number#2', '# ', '_')</codeph> returns + <codeph>'UnitNumber_2'</codeph>. + </p> + + <p> + If <codeph>from</codeph> is <codeph>NULL</codeph>, the function returns + <codeph>NULL</codeph>. + </p> + + <p> + If <codeph>to</codeph> contains more characters than <codeph>from</codeph>, the + extra characters in <codeph>to</codeph> are ignored. + </p> + + <p> + If <codeph>from</codeph> contains duplicate characters, the duplicate character is + replaced with the first matching character in <codeph>to</codeph>. + </p> + + <p> + For example: <codeph>translate ('hello','ll','67')</codeph> returns + <codeph>'he66o'</codeph>. + </p> </dd> </dlentry> @@ -1312,19 +1542,22 @@ select split_part('one\|/two\|/three','\|/',3); <dlentry id="trim"> <dt> - <codeph>trim(string a)</codeph> + TRIM(STRING a) </dt> <dd> - <indexterm audience="hidden">trim() function</indexterm> - <b>Purpose:</b> Returns the input string with both leading and trailing spaces removed. The same as - passing the string through both <codeph>ltrim()</codeph> and <codeph>rtrim()</codeph>. + <b>Purpose:</b> Returns the input string with both leading and trailing spaces + removed. The same as passing the string through both <codeph>LTRIM()</codeph> and + <codeph>RTRIM()</codeph>. <p> - <b>Usage notes:</b> Often used during data cleansing operations during the ETL cycle, if input values might still have surrounding spaces. - For a more general-purpose function that can remove other leading and trailing characters besides spaces, see <codeph>btrim()</codeph>. + <b>Usage notes:</b> Often used during data cleansing operations during the ETL + cycle, if input values might still have surrounding spaces. For a more + general-purpose function that can remove other leading and trailing characters + besides spaces, see <codeph>BTRIM()</codeph>. </p> + <p> - <b>Return type:</b> <codeph>string</codeph> + <b>Return type:</b> <codeph>STRING</codeph> </p> </dd> @@ -1333,21 +1566,25 @@ select split_part('one\|/two\|/three','\|/',3); <dlentry id="upper"> <dt> - <codeph>upper(string a), <ph id="ucase">ucase(string a)</ph></codeph> + UPPER(STRING a), <ph id="ucase">UCASE(STRING a)</ph> </dt> <dd> - <indexterm audience="hidden">upper() function</indexterm> - <indexterm audience="hidden">ucase() function</indexterm> <b>Purpose:</b> Returns the argument string converted to all-uppercase. <p> - <b>Return type:</b> <codeph>string</codeph> + <b>Return type:</b> <codeph>STRING</codeph> </p> + <p conref="../shared/impala_common.xml#common/usage_notes_blurb"/> - <p conref="../shared/impala_common.xml#common/case_insensitive_comparisons_tip"/> + + <p + conref="../shared/impala_common.xml#common/case_insensitive_comparisons_tip" + /> </dd> </dlentry> </dl> + </conbody> + </concept>