This might be somewhat off-topic, but is the query optimizer smart enough to conditions like 'date2 = date_sub(date1,1)' execute efficiently? I would certainly hope so when date1 is a constant, but what if this is a join condition? If date is a partition or clustered column, will hive do the right thing?
Jamie Olson On Thu, Sep 20, 2012 at 10:27 AM, Bejoy KS <bejoy...@yahoo.com> wrote: > Hi Robin > > AFAIK, one of the possible solutions is is to take the computation of > current date out of hive, may be to a shell script (you can have the > computation current - x days aslo from within a shell script i guess ). > Once you have identified the required date, store it in a variable and > substitute the same in your hive query. > > Some links for your reference > http://hive.apache.org/docs/r0.9.0/language_manual/var_substitution.html > > http://kickstarthadoop.blogspot.in/2011/10/include-values-during-execution-time-in.html > > > Regards, > Bejoy KS > > ------------------------------ > *From:* Robin Verlangen <ro...@us2.nl> > *To:* user@hive.apache.org; Bejoy KS <bejoy...@yahoo.com> > *Sent:* Thursday, September 20, 2012 7:30 PM > *Subject:* Re: Hive ignoring buckets when using dynamic where > > Hi Bejoy, > > Thank you for your reply. Is there any way to fix my problem? I want to > have a query that has a dynamic range, from now (and in some cases now - x > days until now). > > Best regards, > > Robin Verlangen > *Software engineer* > * > * > W http://www.robinverlangen.nl > E ro...@us2.nl > > Disclaimer: The information contained in this message and attachments is > intended solely for the attention and use of the named addressee and may be > confidential. If you are not the intended recipient, you are reminded that > the information remains the property of the sender. You must not use, > disclose, distribute, copy, print or rely on this e-mail. If you have > received this message in error, please contact the sender immediately and > irrevocably delete this message and any copies. > > > > 2012/9/20 Bejoy KS <bejoy...@yahoo.com> > > Hi Robin > > The result of 'bdate=to_date(unix_timestamp())' is evaluated during the > runtime of the query. But the data that a query should process is > determined initially before executing the map reduce jobs. That is the > reason the query is running over whole data set. > > When you provide 'bdate='2012-09-01'' the hive parser knows initially > itself what data which all partitions should be taken into account. So this > query runs on only the required partitions and not on whole data. > > To add on , it is not the buckets considered here on where clause but the > partitions. > > Regards, > Bejoy KS > > ------------------------------ > *From:* Robin Verlangen <ro...@us2.nl> > *To:* user@hive.apache.org > *Sent:* Thursday, September 20, 2012 5:06 PM > *Subject:* Hive ignoring buckets when using dynamic where > > Hi there, > > We're working on some queries that use buckets to improve performance with > like 1000x. However we ran into a problem. When we use a fixed hardcoded > date it works fine: > > SELECT * FROM standard_feed WHERE bdate='2012-09-01' > *Starts a job with 6 mappers, 2 reducers* > > When we use it dynamically: > SELECT * FROM standard_feed WHERE bdate=to_date(unix_timestamp()) > *Starts a job with 1000 mappers, 2 reducers* > * > * > What's the problem here? The result of the to_date of the current > timestamp should be equal to a normal fixed date? Does anyone have a > solution? > > Best regards, > > Robin Verlangen > *Software engineer* > * > * > W http://www.robinverlangen.nl > E ro...@us2.nl > > Disclaimer: The information contained in this message and attachments is > intended solely for the attention and use of the named addressee and may be > confidential. If you are not the intended recipient, you are reminded that > the information remains the property of the sender. You must not use, > disclose, distribute, copy, print or rely on this e-mail. If you have > received this message in error, please contact the sender immediately and > irrevocably delete this message and any copies. > > > > > > >