Thanks Rekha. I went with your first option 'LEFT OUTER JOIN' and it worked like a charm. The second one was not fitting for my case as it was popping out parse errors due to multiple columns separated by comma coming under the same NOT IN clause {(field1,field2,field3,field4 )NOT IN (SELECT field1,field2,field3,field4 ...} Guess hive doesn't support queries that way Now there is minor hurdle still, The SQL BETWEEN. How can we get this BETWEEN AND supported in Hive QL? Any thoughts?
My new transformed query would look like this INSERT OVERWRITE TABLE Table1 SELECT T2.field1,T2.field2,T2.field3,T2.field4 FROM Table2 T2 JOIN Table3 RJC ON (RJC.field3 = 'xyz') LEFT OUTER JOIN Table4 T4 ON (T2.field1 = T4.field1 AND T2.field2 = T4.field2 AND T2.field3 = T4.field3 AND T2.field4 = T4.field4) WHERE field4 BETWEEN RJC.START_DATE AND RJC.END_DATE GROUP BY T1_field1,T1_field2_ID,T1_field3,T1_field4; Regards Bejoy.K.S ________________________________ From: Rekha Joshi <rekha...@yahoo-inc.com> To: "user@hive.apache.org" <user@hive.apache.org> Sent: Tue, March 8, 2011 3:08:08 PM Subject: Re: How to support SQL NOT IN function in Hive QL Re: How to support SQL NOT IN function in Hive QL @Bejoy – AFAIK, NOT IN is not directly supported in current hive. Workaround, you can write an outer join instead of antijoin. HIVE-1740, mentions another workaround - NOT(x LIKE p). Thanks, Rekha. On 3/8/11 2:12 PM, "Bejoy Ks" <bejoy...@yahoo.com> wrote: Hi Experts > I'm facing a hurdle in transforming a SQL query to equivalent Hive QL with >SQL NOT IN functionality. My SQL query would like this > >INSERT INTO Table1 >(T1_field1,T1_field2_ID,T1_field3,T1_field4) >SELECT field1,field2,field3,field4 >FROM Table2 JOIN Table3 T3 >WHERE (field1,field2,field3,field4 ) NOT IN >(SELECT field1,field2,field3,field4 FROM Table4) >AND field4 BETWEEN T3.START_DATE AND T3.END_DATE >AND T3.field3 = 'xyz' GROUP BY T1_field1,T1_field2_ID,T1_field3,T1_field4; > > >I'm comfortable with the rest of the query apart from the NOT IN part. I'm >using >Hive 0.7 with CDH3B4. I check the availability of SQL IN functionality in my >hive and it worked like a charm but NOT IN doesn't work out here. Has it been >already included as a new patch in latest trunk? >Can some one help me out how i can re frame the query avoiding the NOT IN part >to get it working on Hive?(I'm not really from a db background) >Also is there any JIRA tickets open to have this functionality supported in >upcoming versions of hive? > >Regards >Bejoy.K.S > > > >