[Hadoop Wiki] Update of "Hive/LanguageManual/UDF" by DaveBrondsema

Apache Wiki Mon, 13 Dec 2010 11:07:40 -0800

Dear Wiki user,

You have subscribed to a wiki page or wiki category on "Hadoop Wiki" for change 
notification.


The "Hive/LanguageManual/UDF" page has been changed by DaveBrondsema.
The comment on this change is: add collect_set.
http://wiki.apache.org/hadoop/Hive/LanguageManual/UDF?action=diff&rev1=57&rev2=58

--------------------------------------------------

  ||double ||percentile_approx(col, p [, B]) || Returns an approximate p^th^ percentile of a numeric column (including floating point types) in the group. The B parameter controls approximation accuracy at the cost of memory. Higher values yield better approximations, and the default is 10,000. When the number of distinct values in col is smaller than B, this gives an exact percentile value. ||
  ||array<double> || percentile_approx(col, array(p,,1,, [, p,,2,,]...) [, B]) || Same as above, but accepts and returns an array of percentile values instead of a single one. ||
  ||array<struct `{'x','y'}`>|| histogram_numeric(col, b) || Computes a histogram of a numeric column in the group using b non-uniformly spaced bins. The output is an array of size b of double-valued (x,y) coordinates that represent the bin centers and heights ||
+ ||array ||collect_set(col) ||Returns a set of objects with duplicate elements eliminated ||
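
As a rough illustration of the collect_set row added above, here is a minimal plain-Python model of "a set of objects with duplicate elements eliminated" computed per group. It is a sketch of the semantics only, not Hive's implementation. The helper name `collect_set_model`, the sample rows, and the choice to keep first-seen order are all assumptions made for this example; the table row itself does not promise any element order.

```python
from collections import defaultdict


def collect_set_model(rows, key, col):
    """Group rows by `key` and gather the distinct values of `col` per group.

    Hypothetical helper for illustration; it only models the described
    behaviour (duplicates eliminated within each group).
    """
    groups = defaultdict(list)
    for row in rows:
        bucket = groups[row[key]]
        value = row[col]
        if value not in bucket:  # skip values already collected for this group
            bucket.append(value)  # first-seen order is a choice of this sketch
    return dict(groups)


# Hypothetical sample data: page views per user.
rows = [
    {"user": "a", "page": "home"},
    {"user": "a", "page": "cart"},
    {"user": "a", "page": "home"},  # duplicate within group "a"
    {"user": "b", "page": "home"},
]

result = collect_set_model(rows, "user", "page")
```

Each group ends up with one array holding its distinct values, which matches the `array` return type listed in the table row.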
  
  == Built-in Table-Generating Functions (UDTF) ==
  <<Anchor(UDTF)>> Normal user-defined functions, such as concat(), take in a 
single input row and output a single output row. In contrast, table-generating 
functions transform a single input row to multiple output rows.
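
The contrast described above can be sketched in plain Python: an ordinary function maps each input row to exactly one output, while a table-generating function may emit several output rows per input row. This models the row-count behaviour only and is not Hive's UDTF interface. The names `concat_model`, `split_items_model`, `apply_udf`, and `apply_udtf`, and the sample rows, are hypothetical.

```python
def concat_model(row):
    """Ordinary function: one input row in, exactly one output value out."""
    return row["first"] + row["last"]


def split_items_model(row):
    """Table-generating function: one input row in, several output rows out."""
    for item in row["items"]:
        yield {"item": item}


def apply_udf(rows, fn):
    # Output has the same number of rows as the input.
    return [fn(row) for row in rows]


def apply_udtf(rows, fn):
    # Output rows from every input row are flattened into one result table.
    return [out for row in rows for out in fn(row)]


# Hypothetical sample data.
rows = [
    {"first": "ab", "last": "cd", "items": [1, 2, 3]},
    {"first": "e", "last": "f", "items": [4]},
]

udf_out = apply_udf(rows, concat_model)
udtf_out = apply_udtf(rows, split_items_model)
```

With two input rows, `udf_out` has two entries, while `udtf_out` has four rows because the first input row expands into three.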
