[Hadoop Wiki] Trivial Update of "Hive/LanguageManual/Transform" by ZhengShao

Apache Wiki Fri, 23 Jan 2009 00:31:29 -0800

Dear Wiki user,

You have subscribed to a wiki page or wiki category on "Hadoop Wiki" for change 
notification.


The following page has been changed by ZhengShao:
http://wiki.apache.org/hadoop/Hive/LanguageManual/Transform

------------------------------------------------------------------------------
  
  Users can also plug in their own custom mappers and reducers in the data 
stream by using features natively supported in the Hive language. For example, 
to run a custom mapper script (map_script) and a custom reducer script 
(reduce_script), the user can issue the following command, which uses the 
TRANSFORM clause to embed the mapper and the reducer scripts.
  
- Note that columns will be transformed to string and deliminated by TAB before 
feeding to the user script, and the standard output of the user script will be 
treated as TAB-separated string columns. User scripts can output debug 
information to standard error which will be shown on the task detail page on 
hadoop.
+ Note that columns will be transformed to ''STRING'' and delimited by TAB 
before feeding to the user script, and the standard output of the user script 
will be treated as TAB-separated ''STRING'' columns. User scripts can output 
debug information to standard error which will be shown on the task detail page 
on hadoop.
  
  In the syntax, both ''MAP'' and ''REDUCE'' can also be written as ''SELECT 
TRANSFORM''.  There is actually no difference between these three.
  Hive runs the reduce script in the reduce task (instead of the map task) 
because of the ''clusterBy''/''distributeBy''/''sortBy'' clause in the inner 
query.
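
As an illustration of the note on TAB-delimited ''STRING'' columns, here is a minimal sketch of what a user script might look like in Python. The function names and the transformation itself (upper-casing the first column and appending a column count) are hypothetical and chosen only to show the input and output conventions; they are not taken from the wiki page.

```python
import sys


def transform_row(line):
    """Turn one TAB-delimited input row into one TAB-delimited output row."""
    # Every column arrives as a string, so any typed value must be parsed here.
    columns = line.rstrip("\n").split("\t")
    # Hypothetical transformation: upper-case the first column and append
    # the number of input columns as an extra output column.
    output = [columns[0].upper()] + columns[1:] + [str(len(columns))]
    return "\t".join(output)


def main():
    """Read rows from standard input and write transformed rows to standard output."""
    row_count = 0
    for line in sys.stdin:
        # Standard output is what gets read back as TAB-separated string columns.
        print(transform_row(line))
        row_count += 1
    # Debug information goes to standard error so it is not parsed as data.
    print(f"processed {row_count} rows", file=sys.stderr)

# A real script would end with a call to main(); it is left out here so that
# transform_row can be imported and tried on its own.
```

For example, the input row "abc<TAB>42" would come back as "ABC<TAB>42<TAB>2", and the row count would appear only in the task's standard error output.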

[Hadoop Wiki] Trivial Update of "Hive/LanguageManual/Transform" by ZhengShao
