Dear Wiki user, You have subscribed to a wiki page or wiki category on "Hadoop Wiki" for change notification.
The "Hive/LanguageManual/UDF" page has been changed by PaulYang. http://wiki.apache.org/hadoop/Hive/LanguageManual/UDF?action=diff&rev1=27&rev2=28 -------------------------------------------------- == Built in Operators == === Relational Operators === The following operators compare the passed operands and generate a TRUE or FALSE value depending on whether the comparison between the operands holds or not. - ||<10%>'''Operator'''||<10%>'''Operand types'''||'''Description'''|| + ||<10%>'''Operator''' ||<10%>'''Operand types''' ||'''Description''' || - || A = B || all primitive types || TRUE if expression A is equal to expression B otherwise FALSE || + ||A = B ||all primitive types ||TRUE if expression A is equal to expression B otherwise FALSE || - ||A == B||none!||Fails; SQL uses = and not ==!|| + ||A == B ||none! ||Fails; SQL uses = and not ==! || - ||A <> B||all primitive types||TRUE if expression A is NOT equal to expression B otherwise FALSE|| + ||A <> B ||all primitive types ||TRUE if expression A is NOT equal to expression B otherwise FALSE || - ||A < B||all primitive types||TRUE if expression A is less than expression B otherwise FALSE|| + ||A < B ||all primitive types ||TRUE if expression A is less than expression B otherwise FALSE || - ||A <= B||all primitive types||TRUE if expression A is less than or equal to expression B otherwise FALSE|| + ||A <= B ||all primitive types ||TRUE if expression A is less than or equal to expression B otherwise FALSE || - ||A > B||all primitive types||TRUE if expression A is greater than expression B otherwise FALSE|| + ||A > B ||all primitive types ||TRUE if expression A is greater than expression B otherwise FALSE || - ||A >= B||all primitive types||TRUE if expression A is greater than or equal to expression B otherwise FALSE|| + ||A >= B ||all primitive types ||TRUE if expression A is greater than or equal to expression B otherwise FALSE || - ||A IS NULL||all types||TRUE if expression A evaluates to NULL otherwise FALSE|| + ||A IS NULL ||all types ||TRUE if expression A evaluates to NULL otherwise FALSE || - ||A IS NOT NULL||all types||TRUE if expression A evaluates to NULL otherwise FALSE|| + ||A IS NOT NULL ||all types ||TRUE if expression A evaluates to NULL otherwise FALSE || - ||A LIKE B||strings||TRUE if string A matches the SQL simple regular expression B, otherwise FALSE. The comparison is done character by character. The _ character in B matches any character in A(similar to . in posix regular expressions) while the % character in B matches an arbitrary number of characters in A(similar to .* in posix regular expressions) e.g. 'foobar' like 'foo' evaluates to FALSE where as 'foobar' like 'foo_ _ _' evaluates to TRUE and so does 'foobar' like 'foo%'|| + ||A LIKE B ||strings ||TRUE if string A matches the SQL simple regular expression B, otherwise FALSE. The comparison is done character by character. The _ character in B matches any character in A(similar to . in posix regular expressions) while the % character in B matches an arbitrary number of characters in A(similar to .* in posix regular expressions) e.g. 'foobar' like 'foo' evaluates to FALSE where as 'foobar' like 'foo_ _ _' evaluates to TRUE and so does 'foobar' like 'foo%' || - ||A RLIKE B||strings||TRUE if string A matches the Java regular expression B(See Java regular expressions syntax), otherwise FALSE e.g. 'foobar' rlike 'foo' evaluates to FALSE where as 'foobar' rlike '^f.*r$' evaluates to TRUE|| + ||A RLIKE B ||strings ||TRUE if string A matches the Java regular expression B(See Java regular expressions syntax), otherwise FALSE e.g. 'foobar' rlike 'foo' evaluates to FALSE where as 'foobar' rlike '^f.*r$' evaluates to TRUE || - ||A REGEXP B||strings||Same as RLIKE|| + ||A REGEXP B ||strings ||Same as RLIKE || + + === Arithmetic Operators === The following operators support various common arithmetic operations on the operands. All of them return number types. - ||<10%>'''Operator'''||<10%>'''Operand types'''||'''Description'''|| + ||<10%>'''Operator''' ||<10%>'''Operand types''' ||'''Description''' || - ||A + B ||all number types ||Gives the result of adding A and B. The type of the result is the same as the common parent(in the type hierarchy) of the types of the operands. e.g. since every integer is a float, therefore float is a containing type of integer so the + operator on a float and an int will result in a float.|| + ||A + B ||all number types ||Gives the result of adding A and B. The type of the result is the same as the common parent(in the type hierarchy) of the types of the operands. e.g. since every integer is a float, therefore float is a containing type of integer so the + operator on a float and an int will result in a float. || - ||A - B ||all number types ||Gives the result of subtracting B from A. The type of the result is the same as the common parent(in the type hierarchy) of the types of the operands.|| + ||A - B ||all number types ||Gives the result of subtracting B from A. The type of the result is the same as the common parent(in the type hierarchy) of the types of the operands. || - ||A * B ||all number types ||Gives the result of multiplying A and B. The type of the result is the same as the common parent(in the type hierarchy) of the types of the operands. Note that if the multiplication causing overflow, you will have to cast one of the operators to a type higher in the type hierarchy.|| + ||A * B ||all number types ||Gives the result of multiplying A and B. The type of the result is the same as the common parent(in the type hierarchy) of the types of the operands. Note that if the multiplication causing overflow, you will have to cast one of the operators to a type higher in the type hierarchy. || - ||A / B ||all number types ||Gives the result of dividing B from A. The type of the result is the same as the common parent(in the type hierarchy) of the types of the operands. If the operands are integer types, then the result is the quotient of the division.|| + ||A / B ||all number types ||Gives the result of dividing B from A. The type of the result is the same as the common parent(in the type hierarchy) of the types of the operands. If the operands are integer types, then the result is the quotient of the division. || - ||A % B ||all number types ||Gives the reminder resulting from dividing A by B. The type of the result is the same as the common parent(in the type hierarchy) of the types of the operands.|| + ||A % B ||all number types ||Gives the reminder resulting from dividing A by B. The type of the result is the same as the common parent(in the type hierarchy) of the types of the operands. || - ||A & B ||all number types ||Gives the result of bitwise AND of A and B. The type of the result is the same as the common parent(in the type hierarchy) of the types of the operands.|| + ||A & B ||all number types ||Gives the result of bitwise AND of A and B. The type of the result is the same as the common parent(in the type hierarchy) of the types of the operands. || - ||A | B ||all number types ||Gives the result of bitwise OR of A and B. The type of the result is the same as the common parent(in the type hierarchy) of the types of the operands.|| + ||A | B ||all number types ||Gives the result of bitwise OR of A and B. The type of the result is the same as the common parent(in the type hierarchy) of the types of the operands. || - ||A ^ B ||all number types ||Gives the result of bitwise XOR of A and B. The type of the result is the same as the common parent(in the type hierarchy) of the types of the operands.|| + ||A ^ B ||all number types ||Gives the result of bitwise XOR of A and B. The type of the result is the same as the common parent(in the type hierarchy) of the types of the operands. || ||~A ||all number types ||Gives the result of bitwise NOT of A. The type of the result is the same as the type of A. || + + + === Logical Operators === The following operators provide support for creating logical expressions. All of them return boolean TRUE or FALSE depending upon the boolean values of the operands. - ||<10%>'''Operator'''||<10%>'''Operand types'''||'''Description|| + ||<10%>'''Operator''' ||<10%>'''Operand types''' ||'''Description''' || - ||A AND B ||boolean ||TRUE if both A and B are TRUE, otherwise FALSE|| + ||A AND B ||boolean ||TRUE if both A and B are TRUE, otherwise FALSE || - ||A && B ||boolean ||Same as A AND B|| + ||A && B ||boolean ||Same as A AND B || - ||A OR B ||boolean ||TRUE if either A or B or both are TRUE, otherwise FALSE|| + ||A OR B ||boolean ||TRUE if either A or B or both are TRUE, otherwise FALSE || - ||A | | B ||boolean ||Same as A OR B|| + ||A | | B ||boolean ||Same as A OR B || - ||NOT A ||boolean ||TRUE if A is FALSE, otherwise FALSE|| + ||NOT A ||boolean ||TRUE if A is FALSE, otherwise FALSE || ||! A ||boolean ||Same as NOT A || + + + === Operators on Complex Types === The following operators provide mechanisms to access elements in Complex Types - ||<10%>'''Operator'''||<10%>'''Operand types'''||'''Description'''|| + ||<10%>Operator ||<10%>Operand types ||Description || - ||A[n] ||A is an Array and n is an int ||returns the nth element in the array A. The first element has index 0 e.g. if A is an array comprising of ['foo', 'bar'] then A[0] returns 'foo' and A[1] returns 'bar'|| + ||A[n] ||A is an Array and n is an int ||Returns the nth element in the array A. The first element has index 0 e.g. if A is an array comprising of ['foo', 'bar'] then A[0] returns 'foo' and A[1] returns 'bar' || - ||M[key] ||M is a Map<K, V> and key has type K ||returns the value corresponding to the key in the map e.g. if M is a map comprising of {'f' -> 'foo', 'b' -> 'bar', 'all' -> 'foobar'} then M['all'] returns 'foobar'|| + ||M[key] ||M is a Map<K, V> and key has type K ||Returns the value corresponding to the key in the map e.g. if M is a map comprising of {'f' -> 'foo', 'b' -> 'bar', 'all' -> 'foobar'} then M['all'] returns 'foobar' || - ||S.x ||S is a struct ||returns the x field of S e.g for struct foobar {int foo, int bar} foobar.foo returns the integer stored in the foo field of the struct. || + ||S.x ||S is a struct ||Returns the x field of S e.g for struct foobar {int foo, int bar} foobar.foo returns the integer stored in the foo field of the struct. || + + + == Built-in Functions == === Mathematical Functions === The following built-in mathematical functions are supported in hive: - ||<10%>'''Return Type'''||<15%>'''Name(Signature)'''||'''Description|| + ||<10%>Return Type ||<15%>Name(Signature) ||Description || - ||BIGINT ||round(double a) ||returns the rounded BIGINT value of the double|| + ||BIGINT ||round(double a) ||Returns the rounded BIGINT value of the double || - ||BIGINT ||floor(double a) ||returns the maximum BIGINT value that is equal or less than the double|| + ||BIGINT ||floor(double a) ||Returns the maximum BIGINT value that is equal or less than the double || - ||BIGINT ||ceil(double a), ceiling(double a) ||returns the minimum BIGINT value that is equal or greater than the double|| + ||BIGINT ||ceil(double a), ceiling(double a) ||Returns the minimum BIGINT value that is equal or greater than the double || - ||double ||rand(), rand(int seed) ||returns a random number (that changes from row to row). Specifiying the seed will make sure the generated random number sequence is deterministic.|| + ||double ||rand(), rand(int seed) ||Returns a random number (that changes from row to row). Specifiying the seed will make sure the generated random number sequence is deterministic. || - ||double ||exp(double a) ||Return e^a where e is the base of the natural logarithm|| + ||double ||exp(double a) ||Return e^a where e is the base of the natural logarithm || - ||double ||ln(double a) ||Return the natural logarithm of the argument|| + ||double ||ln(double a) ||Return the natural logarithm of the argument || - ||double ||log10(double a) ||Return the base-10 logarithm of the argument|| + ||double ||log10(double a) ||Return the base-10 logarithm of the argument || - ||double ||log2(double a) ||Return the base-2 logarithm of the argument|| + ||double ||log2(double a) ||Return the base-2 logarithm of the argument || - ||double ||log(double base, double a) ||Return the base "base" logarithm of the argument|| + ||double ||log(double base, double a) ||Return the base "base" logarithm of the argument || - ||double ||pow(double a, double p) ||Return a^p|| + ||double ||pow(double a, double p) power(double a, double p) ||Return a^p || - ||double ||power(double a, double p) ||Synonym of pow|| - ||double ||sqrt(double a) ||Return the square root of a|| + ||double ||sqrt(double a) ||Return the square root of a || - ||string ||bin(BIGINT a) ||Return the number in binary format (see [[[http://dev.mysql.com/doc/refman/5.0/en/string-functions.html#function_bin|| MySQL bin]]])|| + ||string ||bin(BIGINT a) ||Return the number in binary format (see [[[http://dev.mysql.com/doc/refman/5.0/en/string-functions.html#function_bin]]]) || - ||string ||hex(BIGINT a) ||Return the number in hex format (see [[[http://dev.mysql.com/doc/refman/5.0/en/string-functions.html#function_hex|| MySQL hex]]]) || + ||string ||hex(BIGINT a) hex(string a) ||If the argument is a int, hex returns the number as a string in hex format. Otherwise if the number is a string, it converts each character into its hex representation and returns the resulting string. (see [[[http://dev.mysql.com/doc/refman/5.0/en/string-functions.html#function_hex]]]) || + ||string||unhex(string a)||Inverse of hex. Interprets each pair of characters as a hexidecimal number and converts to the character represented by the number.|| - ||string ||conv(BIGINT num, int from_base, int to_base) ||Convert a number from a given base to another (see [[[http://dev.mysql.com/doc/refman/5.0/en/mathematical-functions.html#function_conv|| MySQL conv]]])|| + ||string ||conv(BIGINT num, int from_base, int to_base) ||Convert a number from a given base to another (see [[[http://dev.mysql.com/doc/refman/5.0/en/mathematical-functions.html#function_conv]]]) || - ||double ||abs(double a) ||Return the absolute value|| + ||double ||abs(double a) ||Return the absolute value || + ||int double ||pmod(int a, int b) pmod(double a, double b) ||Returns the positive value of a mod b|| + ||double||sin(double a)||Returns the sine of a (a is in radians)|| + ||double||asin(double a)||Returns the arc sin of x if -1<=a<=1 or null otherwise.|| + ||double||cos(double a)||Returns the cosine of a (a is in radians)|| + ||double||acos(double a)||Returns the arc cosine of x if -1<=a<=1 or null otherwise|| + + + === Collection Functions === The following built-in collection functions are supported in hive: - ||int ||size(Map<K.V>) ||returns the number of elements in the map type|| + ||int ||size(Map<K.V>) ||returns the number of elements in the map type || - ||int ||size(Array<T>) ||returns the number of elements in the array type|| + ||int ||size(Array<T>) ||returns the number of elements in the array type || + + + === Type Conversion Functions === The following type conversion functions are supported in hive: - || <type> ||cast(expr as <type>) ||converts the results of the expression expr to <type> e.g. cast('1' as BIGINT) will convert the string '1' to it integral representation. A null is returned if the conversion does not succeed.|| + ||<type> ||cast(expr as <type>) ||converts the results of the expression expr to <type> e.g. cast('1' as BIGINT) will convert the string '1' to it integral representation. A null is returned if the conversion does not succeed. || + + + === Date Functions === The following built-in date functions are supported in hive: - ||string ||from_unixtime(int unixtime) ||convert the number of seconds from unix epoch (1970-01-01 00:00:00 UTC) to a string representing the timestamp of that moment in the current system time zone in the format of "1970-01-01 00:00:00"|| + ||string ||from_unixtime(int unixtime) ||convert the number of seconds from unix epoch (1970-01-01 00:00:00 UTC) to a string representing the timestamp of that moment in the current system time zone in the format of "1970-01-01 00:00:00" || - ||bigint ||unix_timestamp() || Get current time stamp. It will use the default time zone.|| + ||bigint ||unix_timestamp() ||Get current time stamp. It will use the default time zone. || - ||bigint ||unix_timestamp(string date) || Convert time string in format `yyyy-MM-dd HH:mm:ss` to Unix time stamp, return 0 if fail: unix_timestamp('2009-03-20 11:30:01') = 1237573801|| + ||bigint ||unix_timestamp(string date) ||Convert time string in format `yyyy-MM-dd HH:mm:ss` to Unix time stamp, return 0 if fail: unix_timestamp('2009-03-20 11:30:01') = 1237573801 || - ||bigint ||unix_timestamp(string date, string pattern) || Convert time string with given pattern (see [[[http://java.sun.com/j2se/1.4.2/docs/api/java/text/SimpleDateFormat.html|| java date and time pattern]]]) to Unix time stamp, return 0 if fail: unix_timestamp('2009-03-20', 'yyyy-MM-dd') = 1237532400 || + ||bigint ||unix_timestamp(string date, string pattern) ||Convert time string with given pattern (see [[[http://java.sun.com/j2se/1.4.2/docs/api/java/text/SimpleDateFormat.html]]]) to Unix time stamp, return 0 if fail: unix_timestamp('2009-03-20', 'yyyy-MM-dd') = 1237532400 || - ||string ||to_date(string timestamp) ||Return the date part of a timestamp string: to_date("1970-01-01 00:00:00") = "1970-01-01"|| + ||string ||to_date(string timestamp) ||Return the date part of a timestamp string: to_date("1970-01-01 00:00:00") = "1970-01-01" || - ||int ||year(string date) ||Return the year part of a date or a timestamp string: year("1970-01-01 00:00:00") = 1970, year("1970-01-01") = 1970|| + ||int ||year(string date) ||Return the year part of a date or a timestamp string: year("1970-01-01 00:00:00") = 1970, year("1970-01-01") = 1970 || - ||int ||month(string date) ||Return the month part of a date or a timestamp string: month("1970-11-01 00:00:00") = 11, month("1970-11-01") = 11|| + ||int ||month(string date) ||Return the month part of a date or a timestamp string: month("1970-11-01 00:00:00") = 11, month("1970-11-01") = 11 || ||int ||day(string date) ||Return the day part of a date or a timestamp string: day("1970-11-01 00:00:00") = 1, day("1970-11-01") = 1 || ||int ||weekofyear(string date) ||Return the week number of a timestamp string: weekofyear("1970-11-01 00:00:00") = 44, weekofyear("1970-11-01") = 44 || ||int ||datediff(string enddate, string startdate) ||Return the number of days from startdate to enddate: datediff('2009-03-01', '2009-02-27') = 2 || ||int ||date_add(string startdate, int days) ||Add a number of days to startdate: date_add('2008-12-31', 1) = '2009-01-01' || ||int ||date_sub(string startdate, int days) ||Substract a number of days to startdate: date_sub('2008-12-31', 1) = '2008-12-30' || + + + === Conditional Functions === ||T ||if(boolean testCondition, T valueTrue, T valueFalseOrNull) ||Return valueTrue when testCondition is true, returns valueFalseOrNull otherwise || ||T ||COALESCE(T v1, T v2, ...) ||Return the first v that is not NULL, or NULL if all v's are NULL || - ||T ||CASE a WHEN b THEN c [WHEN d THEN e]* [ELSE f] END || When a = b, returns c; when a = d, return e; else return f|| + ||T ||CASE a WHEN b THEN c [WHEN d THEN e]* [ELSE f] END ||When a = b, returns c; when a = d, return e; else return f || - ||T ||CASE WHEN a THEN b [WHEN c THEN d]* [ELSE e] END || When a = true, returns b; when c = true, return d; else return e|| + ||T ||CASE WHEN a THEN b [WHEN c THEN d]* [ELSE e] END ||When a = true, returns b; when c = true, return d; else return e || + === String Functions === The following are built-in String functions are supported in hive: - || int || length(string A) || returns the length of the string || + ||int ||length(string A) ||returns the length of the string || - ||string || reverse(string A) || returns the reversed string || + ||string ||reverse(string A) ||returns the reversed string || - ||string ||concat(string A, string B...)||returns the string resulting from concatenating the strings passed in as parameters in order. e.g. concat('foo', 'bar') results in 'foobar'. Note that this function can take any number of input strings.|| + ||string ||concat(string A, string B...) ||returns the string resulting from concatenating the strings passed in as parameters in order. e.g. concat('foo', 'bar') results in 'foobar'. Note that this function can take any number of input strings. || - ||string ||substr(string A, int start) ||returns the substring of A starting from start position till the end of string A e.g. concat('foobar', 4) results in 'bar' (see [[[http://dev.mysql.com/doc/refman/5.0/en/string-functions.html#function_substr|| MySQL substr]]])|| + ||string ||substr(string A, int start) substring(string A, int start) ||returns the substring of A starting from start position till the end of string A e.g. concat('foobar', 4) results in 'bar' (see [[[http://dev.mysql.com/doc/refman/5.0/en/string-functions.html#function_substr]]]) || - ||string ||substr(string A, int start, int len) ||returns the substring of A starting from start position with length len e.g. concat('foobar', 4, 1) results in 'b' (see [[[http://dev.mysql.com/doc/refman/5.0/en/string-functions.html#function_substr|| MySQL substr]]]) || + ||string ||substr(string A, int start, int len) substring(string A, int start, int len) ||returns the substring of A starting from start position with length len e.g. concat('foobar', 4, 1) results in 'b' (see [[[http://dev.mysql.com/doc/refman/5.0/en/string-functions.html#function_substr]]]) || - ||string ||upper(string A)||returns the string resulting from converting all characters of A to upper case e.g. upper('fOoBaR') results in 'FOOBAR'|| + ||string ||upper(string A) ucase(string A) ||returns the string resulting from converting all characters of A to upper case e.g. upper('fOoBaR') results in 'FOOBAR' || - ||string ||ucase(string A) ||Same as upper|| - ||string ||lower(string A) ||returns the string resulting from converting all characters of B to lower case e.g. lower('fOoBaR') results in 'foobar'|| + ||string ||lower(string A) lcase(string A) ||returns the string resulting from converting all characters of B to lower case e.g. lower('fOoBaR') results in 'foobar' || - ||string ||lcase(string A) ||Same as lower|| - ||string ||trim(string A) ||returns the string resulting from trimming spaces from both ends of A e.g. trim(' foobar ') results in 'foobar'|| + ||string ||trim(string A) ||returns the string resulting from trimming spaces from both ends of A e.g. trim(' foobar ') results in 'foobar' || - ||string ||ltrim(string A) ||returns the string resulting from trimming spaces from the beginning(left hand side) of A e.g. ltrim(' foobar ') results in 'foobar '|| + ||string ||ltrim(string A) ||returns the string resulting from trimming spaces from the beginning(left hand side) of A e.g. ltrim(' foobar ') results in 'foobar ' || - ||string ||rtrim(string A) ||returns the string resulting from trimming spaces from the end(right hand side) of A e.g. rtrim(' foobar ') results in ' foobar'|| + ||string ||rtrim(string A) ||returns the string resulting from trimming spaces from the end(right hand side) of A e.g. rtrim(' foobar ') results in ' foobar' || - ||string ||regexp_replace(string A, string B, string C) ||returns the string resulting from replacing all substrings in B that match the Java regular expression syntax(See Java regular expressions syntax) with C e.g. regexp_replace('foobar', 'oo|ar', '') returns 'fb'|| + ||string ||regexp_replace(string A, string B, string C) ||returns the string resulting from replacing all substrings in B that match the Java regular expression syntax(See Java regular expressions syntax) with C e.g. regexp_replace('foobar', 'oo|ar', '') returns 'fb''' || - ||string ||regexp_extract(string subject, string pattern, int intex) ||returns the string extracted using the pattern. e.g. regexp_extract('foothebar', 'foo(.*?)(bar)', 2) returns 'bar'|| + ||string ||regexp_extract(string subject, string pattern, int intex) ||returns the string extracted using the pattern. e.g. regexp_extract('foothebar', 'foo(.*?)(bar)', 2) returns 'bar' || - ||string ||parse_url(string urlString, string partToExtract) ||returns the part from the url. e.g. parse_url('http://facebook.com/path1/p.php?k1=v1&k2=v2#Ref1', 'HOST') returns 'facebook.com'|| + ||string ||parse_url(string urlString, string partToExtract) ||Returns the specified part from the URL. Valid values for partToExtract include HOST, PATH, QUERY, REF, PROTOCOL, AUTHORITY, FILE, and USERINFO. e.g. parse_url('http://facebook.com/path1/p.php?k1=v1&k2=v2#Ref1', 'HOST') returns 'facebook.com' Also you can get a value of particular key in QUERY, using the syntax QUERY:<KEY_NAME> e.g. QUERY:k1. || - ||string ||get_json_object(string json_string, string path) ||Extract json object from a json string based on json path specified, and return json string of the extracted json object. It will return null if the input json string is invalid|| + ||string ||get_json_object(string json_string, string path) ||Extract json object from a json string based on json path specified, and return json string of the extracted json object. It will return null if the input json string is invalid || - ||string ||space(int n) ||Return a string of n spaces|| + ||string ||space(int n) ||Return a string of n spaces || - ||string ||repeat(string str, int n) ||Repeat str n times|| + ||string ||repeat(string str, int n) ||Repeat str n times || - ||string ||ascii(string str) ||Return the numeric value of the first character of str|| + ||string ||ascii(string str) ||Return the numeric value of the first character of str || - ||string ||lpad(string str1, string str2) ||Pad str1 to the left with str2|| + ||string ||lpad(string str1, string str2) ||Pad str1 to the left with str2 || - ||string ||rpad(string str1, string str2) ||Pad str1 to the right with str2|| + ||string ||rpad(string str1, string str2) ||Pad str1 to the right with str2 || - ||list ||split(string str, string pat) ||Split str around pat (pat is a regular expression)|| + ||list ||split(string str, string pat) ||Split str around pat (pat is a regular expression) || + ||int||find_in_set(string str, string strList)||Returns the first occurance of str in strList where strList is a comman-delimited string. Returns null if either argument is null. Returns 0 if the first argument contains any commas. e.g. find_in_set('ab', 'abc,b,ab,c,def') returns 3 || + + + ==== get_json_object ==== A limited version of JSONPath supported: + - * $ : Root object + * $ : Root object - * . : Child operator + * . : Child operator - * [] : Subscript operator for array + * [] : Subscript operator for array - * * : Wildcard for [] + * * : Wildcard for [] + Syntax not supported that's worth noticing: + - * '' : Zero length string as key + * : Zero length string as key - * .. : Recursive descent + * .. : Recursive descent - * @ : Current object/element + * @ : Current object/element - * () : Script expression + * () : Script expression - * ?() : Filter (script) expression. + * ?() : Filter (script) expression. - * [,] : Union operator + * [,] : Union operator - * [start:end:step] : array slice operator + * [start:end:step] : array slice operator - Example: + - src_json table is a single column (json), single row table: + Example: src_json table is a single column (json), single row table: + {{{ +-----------------------------------------------------------------------------------------------------------------------------------------+ json @@ -163, +198 @@ +-----------------------------------------------------------------------------------------------------------------------------------------+ }}} You can extract json object using these queries: + {{{ hive> SELECT get_json_object(src_json.json, '$.owner') FROM src_json; amy @@ -172, +208 @@ hive> SELECT get_json_object(src_json.json, '$.non_exist_key') FROM src_json; NULL - }}} - == Built in Aggregate (GROUP BY) Functions == The following are built in aggregate functions are supported in hive: - ||<10%>'''Return Type'''||<10%>'''Name(Signature)'''||'''Description|| + ||<10%>'''Return Type''' ||<10%>'''Name(Signature)''' ||'''Description''' || - ||BIGINT ||count(1), count(DISTINCT col [, col]...)||count(1) returns the number of members in the group, whereas the count(DISTINCT col) gets the count of distinct values of the columns in the group|| + ||BIGINT ||count(1), count(DISTINCT col [, col]...) ||count(1) returns the number of members in the group, whereas the count(DISTINCT col) gets the count of distinct values of the columns in the group || - ||DOUBLE ||sum(col), sum(DISTINCT col) ||returns the sum of the elements in the group or the sum of the distinct values of the column in the group|| + ||DOUBLE ||sum(col), sum(DISTINCT col) ||returns the sum of the elements in the group or the sum of the distinct values of the column in the group || - ||DOUBLE ||avg(col), avg(DISTINCT col) ||returns the average of the elements in the group or the average of the distinct values of the column in the group|| + ||DOUBLE ||avg(col), avg(DISTINCT col) ||returns the average of the elements in the group or the average of the distinct values of the column in the group || - ||DOUBLE ||min(col) ||returns the minimum of the column in the group|| + ||DOUBLE ||min(col) ||returns the minimum of the column in the group || - ||DOUBLE ||max(col) ||returns the maximum value of the column n the group|| + ||DOUBLE ||max(col) ||returns the maximum value of the column n the group || + +
