Dear Wiki user, You have subscribed to a wiki page or wiki category on "Hadoop Wiki" for change notification.
The "Hive/LanguageManual/UDF" page has been changed by PaulYang. http://wiki.apache.org/hadoop/Hive/LanguageManual/UDF?action=diff&rev1=28&rev2=29 -------------------------------------------------- ||double ||sqrt(double a) ||Return the square root of a || ||string ||bin(BIGINT a) ||Return the number in binary format (see [[[http://dev.mysql.com/doc/refman/5.0/en/string-functions.html#function_bin]]]) || ||string ||hex(BIGINT a) hex(string a) ||If the argument is a int, hex returns the number as a string in hex format. Otherwise if the number is a string, it converts each character into its hex representation and returns the resulting string. (see [[[http://dev.mysql.com/doc/refman/5.0/en/string-functions.html#function_hex]]]) || - ||string||unhex(string a)||Inverse of hex. Interprets each pair of characters as a hexidecimal number and converts to the character represented by the number.|| + ||string ||unhex(string a) ||Inverse of hex. Interprets each pair of characters as a hexidecimal number and converts to the character represented by the number. || ||string ||conv(BIGINT num, int from_base, int to_base) ||Convert a number from a given base to another (see [[[http://dev.mysql.com/doc/refman/5.0/en/mathematical-functions.html#function_conv]]]) || ||double ||abs(double a) ||Return the absolute value || - ||int double ||pmod(int a, int b) pmod(double a, double b) ||Returns the positive value of a mod b|| + ||int double ||pmod(int a, int b) pmod(double a, double b) ||Returns the positive value of a mod b || - ||double||sin(double a)||Returns the sine of a (a is in radians)|| + ||double ||sin(double a) ||Returns the sine of a (a is in radians) || - ||double||asin(double a)||Returns the arc sin of x if -1<=a<=1 or null otherwise.|| + ||double ||asin(double a) ||Returns the arc sin of x if -1<=a<=1 or null otherwise || - ||double||cos(double a)||Returns the cosine of a (a is in radians)|| + ||double ||cos(double a) ||Returns the cosine of a (a is in radians) || - ||double||acos(double a)||Returns the arc cosine of x if -1<=a<=1 or null otherwise|| + ||double ||acos(double a) ||Returns the arc cosine of x if -1<=a<=1 or null otherwise || + ||int double ||positive(int a) positive(double a) ||Returns a || + ||int double ||negative(int a) negative(double a) ||Returns -a || @@ -103, +105 @@ === Type Conversion Functions === The following type conversion functions are supported in hive: - ||<type> ||cast(expr as <type>) ||converts the results of the expression expr to <type> e.g. cast('1' as BIGINT) will convert the string '1' to it integral representation. A null is returned if the conversion does not succeed. || + || <type> ||cast(expr as <type>) ||converts the results of the expression expr to <type> e.g. cast('1' as BIGINT) will convert the string '1' to it integral representation. A null is returned if the conversion does not succeed. || @@ -117, +119 @@ ||string ||to_date(string timestamp) ||Return the date part of a timestamp string: to_date("1970-01-01 00:00:00") = "1970-01-01" || ||int ||year(string date) ||Return the year part of a date or a timestamp string: year("1970-01-01 00:00:00") = 1970, year("1970-01-01") = 1970 || ||int ||month(string date) ||Return the month part of a date or a timestamp string: month("1970-11-01 00:00:00") = 11, month("1970-11-01") = 11 || - ||int ||day(string date) ||Return the day part of a date or a timestamp string: day("1970-11-01 00:00:00") = 1, day("1970-11-01") = 1 || + ||int ||day(string date) dayofmonth(date) ||Return the day part of a date or a timestamp string: day("1970-11-01 00:00:00") = 1, day("1970-11-01") = 1 || + ||int ||hour(string date) ||Returns the hour of the timestamp: hour('2009-07-30 12:58:59') = 12, hour('12:58:59') = 12 || + ||int ||minute(string date) ||Returns the minute of the timestamp || + ||int ||second(string date) ||Returns the second of the timestamp || ||int ||weekofyear(string date) ||Return the week number of a timestamp string: weekofyear("1970-11-01 00:00:00") = 44, weekofyear("1970-11-01") = 44 || ||int ||datediff(string enddate, string startdate) ||Return the number of days from startdate to enddate: datediff('2009-03-01', '2009-02-27') = 2 || ||int ||date_add(string startdate, int days) ||Add a number of days to startdate: date_add('2008-12-31', 1) = '2009-01-01' || @@ -145, +150 @@ ||string ||trim(string A) ||returns the string resulting from trimming spaces from both ends of A e.g. trim(' foobar ') results in 'foobar' || ||string ||ltrim(string A) ||returns the string resulting from trimming spaces from the beginning(left hand side) of A e.g. ltrim(' foobar ') results in 'foobar ' || ||string ||rtrim(string A) ||returns the string resulting from trimming spaces from the end(right hand side) of A e.g. rtrim(' foobar ') results in ' foobar' || - ||string ||regexp_replace(string A, string B, string C) ||returns the string resulting from replacing all substrings in B that match the Java regular expression syntax(See Java regular expressions syntax) with C e.g. regexp_replace('foobar', 'oo|ar', '') returns 'fb''' || + ||string ||regexp_replace(string A, string B, string C) ||returns the string resulting from replacing all substrings in B that match the Java regular expression syntax(See Java regular expressions syntax) with C e.g. regexp_replace('foobar', 'oo|ar', '') returns 'fb''' ''''' || ||string ||regexp_extract(string subject, string pattern, int intex) ||returns the string extracted using the pattern. e.g. regexp_extract('foothebar', 'foo(.*?)(bar)', 2) returns 'bar' || ||string ||parse_url(string urlString, string partToExtract) ||Returns the specified part from the URL. Valid values for partToExtract include HOST, PATH, QUERY, REF, PROTOCOL, AUTHORITY, FILE, and USERINFO. e.g. parse_url('http://facebook.com/path1/p.php?k1=v1&k2=v2#Ref1', 'HOST') returns 'facebook.com' Also you can get a value of particular key in QUERY, using the syntax QUERY:<KEY_NAME> e.g. QUERY:k1. || ||string ||get_json_object(string json_string, string path) ||Extract json object from a json string based on json path specified, and return json string of the extracted json object. It will return null if the input json string is invalid || @@ -155, +160 @@ ||string ||lpad(string str1, string str2) ||Pad str1 to the left with str2 || ||string ||rpad(string str1, string str2) ||Pad str1 to the right with str2 || ||list ||split(string str, string pat) ||Split str around pat (pat is a regular expression) || - ||int||find_in_set(string str, string strList)||Returns the first occurance of str in strList where strList is a comman-delimited string. Returns null if either argument is null. Returns 0 if the first argument contains any commas. e.g. find_in_set('ab', 'abc,b,ab,c,def') returns 3 || + ||int ||find_in_set(string str, string strList) ||Returns the first occurance of str in strList where strList is a comman-delimited string. Returns null if either argument is null. Returns 0 if the first argument contains any commas. e.g. find_in_set('ab', 'abc,b,ab,c,def') returns 3 || ==== get_json_object ==== - A limited version of JSONPath supported: + A limited version of JSONPath is supported: * $ : Root object * . : Child operator @@ -211, +216 @@ }}} == Built in Aggregate (GROUP BY) Functions == The following are built in aggregate functions are supported in hive: - ||<10%>'''Return Type''' ||<10%>'''Name(Signature)''' ||'''Description''' || + ||<10%>Return Type''' ''' ||<10%>Name(Signature)''' ''' ||Description''' ''' || ||BIGINT ||count(1), count(DISTINCT col [, col]...) ||count(1) returns the number of members in the group, whereas the count(DISTINCT col) gets the count of distinct values of the columns in the group || ||DOUBLE ||sum(col), sum(DISTINCT col) ||returns the sum of the elements in the group or the sum of the distinct values of the column in the group || ||DOUBLE ||avg(col), avg(DISTINCT col) ||returns the average of the elements in the group or the average of the distinct values of the column in the group || ||DOUBLE ||min(col) ||returns the minimum of the column in the group || ||DOUBLE ||max(col) ||returns the maximum value of the column n the group || - -
