I tried this. Unfortunately, both tables are large.

Thanks,
Rakesh

________________________________
From: Namit Jain [mailto:[email protected]]
Sent: Monday, June 29, 2009 5:05 PM
To: [email protected]
Subject: RE: Set difference in Hive

The tables can be large  -

For a given key,  have the table with the most number of values as the 
rightmost table.

The problem only happens when both the tables have keys with large number of 
values.


Thanks,
-namit


From: Rakesh Setty [mailto:[email protected]]
Sent: Monday, June 29, 2009 4:43 PM
To: [email protected]
Subject: RE: Set difference in Hive


Thanks very much. But the reducer hangs with the warning WARN 
org.apache.hadoop.hive.ql.exec.JoinOperator: table 0 has more than 
joinEmitInterval rows for join key []

Both the tables are large and as Zheng mentions at 
http://www.mail-archive.com/[email protected]/msg00640.html, large 
size for table 0 is a problem. Is there any way to overcome this?



Thanks,

Rakesh

________________________________
From: Peter Skomoroch [mailto:[email protected]]
Sent: Monday, June 29, 2009 4:20 PM
To: [email protected]
Subject: Re: Set difference in Hive

Here is an example of what Amr mentioned from one of my Hive scripts, returns 
the set of pages not in "daily_pagecounts_table"

select dt.page_id, dt.dates, dt.pageviews, dt.total_pageviews
FROM daily_timelines dt LEFT OUTER JOIN daily_pagecounts_table dp ON 
(dt.page_id = dp.page_id)
where dp.page_id is NULL
On Mon, Jun 29, 2009 at 7:14 PM, Amr Awadallah 
<[email protected]<mailto:[email protected]>> wrote:

do an outer join on user and filter on name.user is null

-- amr


Rakesh Setty wrote:

Hi,



            I am new to Hive. I would like to know what is the easiest way to 
get the difference between two sets. For example, how can I convert the 
following SQL query to Hive?



select user from page_views where user not in (select name from users);



Thanks,

Rakesh



--
Peter N. Skomoroch
617.285.8348
http://www.datawrangling.com
http://delicious.com/pskomoroch
http://twitter.com/peteskomoroch

Reply via email to