Yes, the table should be scanned only once.
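
One way to picture why three table scan operators in the plan need not mean three passes over the data: the sketch below (plain Python, not anything taken from Hive) reads each row once and hands it to every branch's filter. The function name `shared_scan_union_all`, the sample rows, and the three predicates are all made up for illustration, since the real WHERE clauses are not shown in this thread. Treat it as a simplified model under those assumptions, not as Hive's actual implementation.

```python
from typing import Callable, List, Sequence, Tuple

# Hypothetical (a, b, c) row of table_1; the real schema is not given in the thread.
Row = Tuple[int, int, int]


def shared_scan_union_all(
    rows: Sequence[Row],
    predicates: Sequence[Callable[[Row], bool]],
) -> Tuple[List[Row], int]:
    """Read the input once and forward every row to each branch's filter.

    Returns the UNION ALL result and the number of rows physically read.
    """
    rows_read = 0
    branches: List[List[Row]] = [[] for _ in predicates]
    for row in rows:
        rows_read += 1  # one physical read per row, shared by all branches
        for output, predicate in zip(branches, predicates):
            if predicate(row):
                output.append(row)
    # UNION ALL keeps duplicates, so the branch outputs are simply concatenated.
    result = [row for output in branches for row in output]
    return result, rows_read


# Made-up sample data and conditions standing in for the elided WHERE clauses.
table_1: List[Row] = [(1, 10, 100), (2, 20, 200), (3, 30, 300)]
predicates = [
    lambda r: r[0] == 1,    # branch 1
    lambda r: r[1] >= 20,   # branch 2
    lambda r: r[2] == 300,  # branch 3
]

result, rows_read = shared_scan_union_all(table_1, predicates)
print(rows_read)    # number of rows read from the table
print(len(result))  # number of rows in the UNION ALL output
```

In this model a row that satisfies more than one condition appears once per matching branch, which is the UNION ALL behaviour, while the read counter only grows with the size of the table and not with the number of branches.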

From: Neil Xu [mailto:neil.x...@gmail.com]
Sent: Thursday, August 26, 2010 11:00 AM
To: hive-user@hadoop.apache.org
Subject: Re: How is Union All optimized in Hive

Hi, Namit,

Thanks for your reply. Now I see that Hive optimizes these kinds of jobs, 
but when I use 'explain' to see the syntax tree of the HQL, I find 3 table scans 
in the tree. Is table_1 really scanned only once? I am not very familiar with 
the syntax tree.

2010/8/26 Namit Jain <nj...@facebook.com>
Yes, it is optimized by Hive. There will be only one MR job, even if the 
selected columns were different.


-namit

________________________________________
From: Neil Xu [neil.x...@gmail.com]
Sent: Wednesday, August 25, 2010 2:40 AM
To: hive-user@hadoop.apache.org
Subject: How is Union All optimized in Hive

I tried a query like the one below: same table, same columns, and different 
conditions. Only one MR job was generated. Is this optimized by Hive itself, and 
is 'table_1' scanned only once? Can anyone give some details? Thanks!

select a, b, c from table_1 where ...
union all
select a, b, c from table_1 where ...
union all
select a, b, c from table_1 where ...

Chocobo
