Yes Ajo, time was the concern. I just got a chance to go through the in-memory join implementation within Hive, and it is great. Sorry for the confusion :)
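
For anyone reading the thread later, here is a rough sketch of the approach discussed below: put the IN-list values in a short table and join against it in memory. This is an illustration under assumptions, not a tested job. The table name phase_filter and its column phase are made up, and r1 stands in for whichever table the alias r1 refers to in the real query.

```sql
-- Hypothetical lookup table holding the values that would otherwise sit in the IN list.
CREATE TABLE phase_filter (phase STRING);

-- Load 'Production', 'Stage', 'QA' and 'Development' into phase_filter,
-- for example with LOAD DATA from a small text file (one value per line).

-- The MAPJOIN hint asks Hive to keep the small table in memory on each mapper,
-- so the join acts as a map-side filter on r1.
SELECT /*+ MAPJOIN(f) */ r1.*
FROM   r1
JOIN   phase_filter f ON (r1.PROCCESS_PHASE = f.phase);
```

The lookup table should hold each value only once, since a duplicate entry would duplicate the matching rows in the output. For a handful of values, the chain of '=' comparisons joined with OR is probably still the simpler option.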
________________________________
From: Ajo Fod <ajo....@gmail.com>
To: user@hive.apache.org
Sent: Wed, February 23, 2011 8:01:07 PM
Subject: Re: Database/Schema , INTERVAL and SQL IN usages in Hive

Better in what sense? If it is time you are concerned about, there are in-memory joins.

-Ajo

On Wed, Feb 23, 2011 at 3:39 AM, Bejoy Ks <bejoy...@yahoo.com> wrote:

> Ajo,
>     If we have a good number of elements in the comparison set, then going
> for a table would be beneficial. But with only a few elements, say 5, won't
> multiple '=' comparisons be better?
>
> Regards
> Bejoy KS
>
> ________________________________
> From: Ajo Fod <ajo....@gmail.com>
> To: user@hive.apache.org
> Sent: Mon, February 21, 2011 10:04:41 PM
> Subject: Re: Database/Schema , INTERVAL and SQL IN usages in Hive
>
> On using SQL IN ... what would happen if you created a short table with the
> entries in the IN clause and used an "inner join"?
>
> -Ajo
>
> On Mon, Feb 21, 2011 at 7:57 AM, Bejoy Ks <bejoy...@yahoo.com> wrote:
>
>> Thanks Jov for the quick response.
>>
>> Could you please let me know which is the latest stable version of Hive?
>> Also, how would you find out your Hive version from the command line?
>>
>> Regarding the SQL IN, I'm also currently using multiple '=' in my jobs, but
>> I still wanted to know whether there is a better way to do this.
>>
>> Regards
>> Bejoy KS
>>
>> ________________________________
>> From: Jov <zhao6...@gmail.com>
>> To: user@hive.apache.org
>> Sent: Mon, February 21, 2011 9:09:34 PM
>> Subject: Re: Database/Schema , INTERVAL and SQL IN usages in Hive
>>
>> On 2011-2-21 at 10:54 PM, "Bejoy Ks" <bejoy...@yahoo.com> wrote:
>>>
>>> Hi Experts
>>>     I'm using Hive for a few projects and I have found it a great tool in
>>> Hadoop for processing structured data end to end. Unfortunately I'm facing
>>> a few challenges, as follows.
>>>
>>> Availability of database/schemas in Hive
>>>     I have multiple projects running in Hive, each with a fairly large
>>> number of tables. With this many tables all together it is looking a bit
>>> messy. Is there any option for creating a database/schema in Hive, so that
>>> I can maintain the tables in different databases/schemas corresponding to
>>> each project?
>>
>> It seems the recent version already supports database DDL, so you can use
>> CREATE DATABASE.
>>
>>> Using INTERVAL
>>>     I need to replicate a job running in Teradata EDW in Hive, and I'm
>>> facing a challenge here. I am not able to identify a Hive equivalent of
>>> INTERVAL in Teradata. Here is the snippet where I'm facing the issue:
>>>     *** where r1.seq_id = r4.seq_id and r4.mc_datetime >= (r1.rc_datetime + INTERVAL '05' HOUR)
>>> In this query, how do I replicate the last part in Hive, i.e.
>>> (r1.rc_datetime + INTERVAL '05' HOUR), where it adds 5 hours to the
>>> timestamp rc_datetime?
>>> *The where condition is part of a very large query involving multiple
>>> table joins.
>>
>> Hive does not have a date or timestamp data type; all such values are
>> strings. But you can write your own UDF to implement a similar function.
>>
>>> Using IN
>>>     How do we replicate the SQL IN function in Hive, i.e.
>>>     *** where R1.seq_id = r4.seq_id and r1.PROCCESS_PHASE IN ( 'Production', 'Stage' , 'QA', 'Development')
>>> The last part of the query is where I'm facing the challenge:
>>>     r1.PROCCESS_PHASE IN ( 'Production', 'Stage' , 'QA', 'Development')
>>> *The where condition is part of a very large query involving multiple
>>> table joins.
>>
>> You can use OR, e.g. 'x in(1,2)' can be 'x=1 or x=2'.
>>
>>> Please advise.
>>>
>>> Regards
>>> Bejoy KS
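
On the INTERVAL question quoted above, one possible workaround that avoids a custom UDF is to compare epoch seconds using Hive's built-in unix_timestamp() function. The sketch below assumes that rc_datetime and mc_datetime are strings in 'yyyy-MM-dd HH:mm:ss' format, and that r1_table and r4_table are placeholder names for the real tables behind the aliases. It is worth checking that the function is available in the Hive version in use.

```sql
-- Assumes both datetime columns are STRING values formatted as
-- 'yyyy-MM-dd HH:mm:ss'; r1_table and r4_table are placeholder table names.
SELECT r1.seq_id
FROM   r1_table r1
JOIN   r4_table r4 ON (r1.seq_id = r4.seq_id)
-- unix_timestamp() converts the string to seconds since the epoch,
-- so adding 5 hours is adding 5 * 3600 seconds.
WHERE  unix_timestamp(r4.mc_datetime) >= unix_timestamp(r1.rc_datetime) + 5 * 3600;
```

If the strings are in some other format, unix_timestamp() also accepts a second argument giving the pattern.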