Thanks Rekha. I went with your first option 'LEFT OUTER JOIN' and it worked 
like 
a charm.
The second one was not fitting for my case as it was popping out parse errors 
due to multiple columns separated by comma coming under the same NOT IN clause  
{(field1,field2,field3,field4 )NOT IN (SELECT field1,field2,field3,field4 ...}
Guess hive doesn't support queries that way
Now there is minor hurdle still, The SQL BETWEEN. How can we get this BETWEEN 
AND supported in Hive QL?
Any thoughts?

My new transformed query would look like this
INSERT OVERWRITE TABLE Table1 
SELECT T2.field1,T2.field2,T2.field3,T2.field4 
FROM Table2 T2 JOIN Table3 RJC ON (RJC.field3 = 'xyz') LEFT OUTER JOIN Table4 
T4 
ON 

(T2.field1 = T4.field1 AND T2.field2 = T4.field2 AND T2.field3 = T4.field3 AND 
T2.field4 = T4.field4)
WHERE field4 BETWEEN RJC.START_DATE AND RJC.END_DATE
GROUP BY T1_field1,T1_field2_ID,T1_field3,T1_field4;



Regards
Bejoy.K.S






________________________________
From: Rekha Joshi <rekha...@yahoo-inc.com>
To: "user@hive.apache.org" <user@hive.apache.org>
Sent: Tue, March 8, 2011 3:08:08 PM
Subject: Re: How to support SQL NOT IN function in Hive QL

 Re: How to support SQL NOT IN function in Hive QL @Bejoy – AFAIK, NOT IN is 
not 
directly supported in current hive. Workaround, you can write an outer join 
instead of antijoin. HIVE-1740, mentions another workaround - NOT(x LIKE p).
Thanks,
Rekha.

On 3/8/11 2:12 PM, "Bejoy Ks" <bejoy...@yahoo.com> wrote:


Hi Experts
>    I'm facing a hurdle in transforming a SQL query to equivalent Hive QL with 
>SQL NOT IN functionality. My SQL query would like this
>
>INSERT INTO Table1
>(T1_field1,T1_field2_ID,T1_field3,T1_field4)
>SELECT field1,field2,field3,field4 
>FROM Table2 JOIN Table3 T3
>WHERE (field1,field2,field3,field4 ) NOT IN
>(SELECT field1,field2,field3,field4  FROM Table4)
>AND field4 BETWEEN T3.START_DATE  AND T3.END_DATE
>AND T3.field3 = 'xyz' GROUP BY T1_field1,T1_field2_ID,T1_field3,T1_field4;
>
>
>I'm comfortable with the rest of the query apart from the NOT IN part. I'm 
>using 
>Hive 0.7 with CDH3B4. I check the availability of SQL IN functionality in my 
>hive and it worked like a charm but NOT IN doesn't work out here. Has it been 
>already included as a new patch in latest trunk?
>Can some one help me out how i can re frame the query avoiding the NOT IN part 
>to get it working on Hive?(I'm not really from a db background)
>Also is there any JIRA tickets open to have this functionality supported in 
>upcoming versions of hive?
>
>Regards
>Bejoy.K.S
>
>
> 
>


      

Reply via email to