Thanks Rekha for such a quick response. A few more doubts out here If I use the comparison operators on the dates directly would they give a desired result ?,as the dates are stored in Hive tables as String Also in the comparison of dates if we use the unix_timestamp() it would consider the time stamp as well along with date for comparison right?
unix_timestamp(field4) >= unix_timestamp(RJC.START_DATE) AND unix_timestamp(field4) <= unix_timestamp(RJC.END_DATE) Regards Bejoy.K.S ________________________________ From: Rekha Joshi <rekha...@yahoo-inc.com> To: "user@hive.apache.org" <user@hive.apache.org> Sent: Tue, March 8, 2011 4:36:08 PM Subject: Re: How to support SQL NOT IN function in Hive QL Re: How to support SQL NOT IN function in Hive QL Bejoy – If timestamp of the dates is not of much importance to you, you can alternatively use >= ‘start_date’ and <= ‘end_date’. Thanks, Rekha. On 3/8/11 4:27 PM, "Bejoy Ks" <bejoy...@yahoo.com> wrote: Thanks Rekha. I went with your first option 'LEFT OUTER JOIN' and it worked like a charm. >The second one was not fitting for my case as it was popping out parse errors >due to multiple columns separated by comma coming under the same NOT IN clause > {(field1,field2,field3,field4 )NOT IN (SELECT field1,field2,field3,field4 ...} >Guess hive doesn't support queries that way >Now there is minor hurdle still, The SQL BETWEEN. How can we get this BETWEEN >AND supported in Hive QL? >Any thoughts? > >My new transformed query would look like this >INSERT OVERWRITE TABLE Table1 >SELECT T2.field1,T2.field2,T2.field3,T2.field4 >FROM Table2 T2 JOIN Table3 RJC ON (RJC.field3 = 'xyz') LEFT OUTER JOIN Table4 >T4 >ON > >(T2.field1 = T4.field1 AND T2.field2 = T4.field2 AND T2.field3 = T4.field3 AND >T2.field4 = T4.field4) >WHERE field4 BETWEEN RJC.START_DATE AND RJC.END_DATE >GROUP BY T1_field1,T1_field2_ID,T1_field3,T1_field4; > > > >Regards >Bejoy.K.S > > > > ________________________________ From:Rekha Joshi <rekha...@yahoo-inc.com> >To: "user@hive.apache.org" <user@hive.apache.org> >Sent: Tue, March 8, 2011 3:08:08 PM >Subject: Re: How to support SQL NOT IN function in Hive QL > >Re: How to support SQL NOT IN function in Hive QL @Bejoy – AFAIK, NOT IN is >not >directly supported in current hive. Workaround, you can write an outer join >instead of antijoin. HIVE-1740, mentions another workaround - NOT(x LIKE p). >Thanks, >Rekha. > >On 3/8/11 2:12 PM, "Bejoy Ks" <bejoy...@yahoo.com> wrote: > > >Hi Experts >> I'm facing a hurdle in transforming a SQL query to equivalent Hive QL >> with >>SQL NOT IN functionality. My SQL query would like this >> >>INSERT INTO Table1 >>(T1_field1,T1_field2_ID,T1_field3,T1_field4) >>SELECT field1,field2,field3,field4 >>FROM Table2 JOIN Table3 T3 >>WHERE (field1,field2,field3,field4 ) NOT IN >>(SELECT field1,field2,field3,field4 FROM Table4) >>AND field4 BETWEEN T3.START_DATE AND T3.END_DATE >>AND T3.field3 = 'xyz' GROUP BY T1_field1,T1_field2_ID,T1_field3,T1_field4; >> >> >>I'm comfortable with the rest of the query apart from the NOT IN part. I'm >>using >>Hive 0.7 with CDH3B4. I check the availability of SQL IN functionality in my >>hive and it worked like a charm but NOT IN doesn't work out here. Has it been >>already included as a new patch in latest trunk? >>Can some one help me out how i can re frame the query avoiding the NOT IN >>part >>to get it working on Hive?(I'm not really from a db background) >>Also is there any JIRA tickets open to have this functionality supported in >>upcoming versions of hive? >> >>Regards >>Bejoy.K.S >> >> >> >>