Hi All,

Please help me understand where and how Hive stores the temporary result of a 
nested query.

I have written a UDF that reads data from a table t1 produced by a nested query.
The rows of t1 must be in ascending order, and I have to make sure that all of 
t1 is processed by a single mapper. The reason is that my UDF keeps some global 
variables that are initialized once per mapper, so if t1 is processed by 
multiple mappers the output will be wrong.
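
To make the concern concrete, here is a minimal simulation in plain Python (not Hive code). The real logic of contract_rangeId is not shown in this message, so the sketch assumes a hypothetical rule: the range id is incremented whenever the (gsid, contract) pair changes. The class RangeIdAssigner and the function run_task are illustrative names only. Each call to run_task creates fresh state, which mimics globals being initialized once per mapper.

```python
class RangeIdAssigner:
    """Hypothetical stand-in for a stateful UDF; state lives for one task only."""

    def __init__(self):
        self.last_key = None
        self.range_id = 0

    def evaluate(self, gsid, contract):
        # Assumed rule: start a new range whenever the key changes.
        key = (gsid, contract)
        if key != self.last_key:
            self.range_id += 1
            self.last_key = key
        return self.range_id


def run_task(rows):
    # Fresh state per task, like globals initialized once per mapper.
    udf = RangeIdAssigner()
    return [udf.evaluate(gsid, contract) for gsid, contract in rows]


# Rows already sorted by (gsid, contract).
rows = [("g1", "c1"), ("g1", "c1"), ("g1", "c2"), ("g2", "c1")]

single = run_task(rows)                          # one task sees every row
split = run_task(rows[:2]) + run_task(rows[2:])  # two tasks, state reset in between

print(single)
print(split)
```

Under that assumed rule, the two lists differ because the second task restarts its counter, which is the kind of wrong output described above.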

Query:

select gsid, contract, max_date, min_date,
       contract_rangeId(gsid, contract, max_date, min_date) as range_id
from (
  select gsid, contract, max_date, min_date
  from tmp_rcc_normwk_gs0_test3
  order by gsid, contract, max_date, min_date
) t1;

Since the nested query (select gsid,contract,max_date,min_date from 
tmp_rcc_normwk_gs0_test3 order by gsid,contract,max_date,min_date) runs with 
only one reducer, will the outer query run with only one mapper?
If so, where is the output of the nested query stored: on HDFS or on the local 
file system?
I would appreciate any help on this.

Vikash Talanki
Engineer - Software
[email protected]
Phone: +1 (408)838 4078

Cisco Systems Limited
SJ-J 3
255 W Tasman Dr
San Jose
CA - 95134
United States
Cisco.com<http://www.cisco.com/>