Run this if you want to see the difference between doing and not doing TRUNC: select to_char(trunc(date,'DAY'),'yyyy-MM-dd hh'), to_char(date,'yyyy-MM-dd hh') from bi.events limit 1;
Sqlline is formatting the date if you don't do a TO_CHAR on it, not Phoenix. I still don't understand why it matters how it's printed out. The underlying date is not changing. On Tue, Feb 25, 2014 at 1:15 PM, Sean Huo <[email protected]> wrote: > here is what i get > > select trunc(date,'DAY'), to_char(date,'yyyy-MM-dd hh'),date from > bi.events limit 1; > > *+----------------------+---------------+--------------+* > > *| **FLOOR(TO_DATE(DATE))** | **TO_CHAR(DATE)** | ** DATE ** |* > > *+----------------------+---------------+--------------+* > > *| *2013-07-14 * | *2013-07-15 07* | *2013-07-15 00:01:02.346* |* > > *+----------------------+---------------+--------------+* > > > looks like to_char is formatting using gmt, where date is in pst. > > > On Tue, Feb 25, 2014 at 1:10 PM, James Taylor <[email protected]>wrote: > >> I'm not sure how sqlline prints date fields by default. Can you do a >> TO_CHAR(date) with a format string instead for both dates? >> >> The TRUNC function just truncates the date to a day boundary when you >> specify the 'DAY' argument. >> >> >> On Tue, Feb 25, 2014 at 12:59 PM, Sean Huo <[email protected]> wrote: >> >>> Hi James, >>> >>> I still don't understand the semantics of trunc function. >>> >>> Here is what is displayed on sqlline >>> >>> select trunc(date,'DAY'), date from events limit 1; >>> >>> *+----------------------+--------------+* >>> >>> *| **FLOOR(TO_DATE(DATE))** | ** DATE ** |* >>> >>> *+----------------------+--------------+* >>> >>> *| *2013-07-14 * | *2013-07-15 00:01:02.346* |* >>> >>> *+----------------------+--------------+* >>> >>> Can you tell me why it returns '2013-07-14' rather than '2013-07-15' >>> >>> Thanks >>> Sean >>> >>> >>> On Tue, Feb 25, 2014 at 12:18 PM, James Taylor >>> <[email protected]>wrote: >>> >>>> You'll get much better performance using the TRUNC function. See >>>> org.apache.phoenix.end2end.ProductMetricTest for some examples. >>>> >>>> Thanks, >>>> James >>>> >>>> >>>> On Tue, Feb 25, 2014 at 12:11 PM, Sean Huo <[email protected]>wrote: >>>> >>>>> My usecase is simple. I have a event table that has timestamp as part >>>>> of the key. I want to do a event count per day >>>>> >>>>> I could do this in phoenix >>>>> >>>>> select to_char(ts,'yyyy-MM-dd') day, count(1) from events group by day; >>>>> >>>>> it returns event count per day per GMT time. >>>>> >>>>> Now if I am interested in doing a event count per PST timezone, >>>>> >>>>> That query doesn't work any more. >>>>> >>>>> The round about way to do this is to execute a query per day like this >>>>> >>>>> select count(1) from events where ts between to_date('2014-02-20 >>>>> PST','yyyy-MM-dd Z') and to_date('2014-02-21 13 PST','yyyy-MM-dd Z') >>>>> >>>>> I will look into trunc function. There is not much documentation and >>>>> usage on the function. >>>>> >>>>> Thanks >>>>> Sean >>>>> >>>>> >>>>> On Tue, Feb 25, 2014 at 11:53 AM, James Taylor <[email protected] >>>>> > wrote: >>>>> >>>>>> Would it be possible to get a bit more info on your use case? Usually >>>>>> showing a date/time using a different timezone is a end-user display >>>>>> issue. >>>>>> How does this impact your group by? Grouping by a date/time will be the >>>>>> same regardless of the timezone you use to format your date. >>>>>> >>>>>> Do you know about our TRUNC and ROUND functions? >>>>>> http://phoenix.incubator.apache.org/language/functions.html#/truncate >>>>>> >>>>>> This is typically a good way to "bucketize" a date when you do a >>>>>> group by, like this: >>>>>> SELECT count(*) FROM t GROUP BY TRUNC(my_date,'DAY') >>>>>> >>>>>> You can use date arithmetic if you wanted to "shift" all the dates >>>>>> based on a timezone offset, like this (shifting 8 hours forward): >>>>>> SELECT count(*) FROM t GROUP BY TRUNC(my_date + 8.0/24.0,'DAY') >>>>>> >>>>>> HTH, >>>>>> James >>>>>> >>>>>> On Tue, Feb 25, 2014 at 11:41 AM, Sean Huo <[email protected]>wrote: >>>>>> >>>>>>> Well, I can not use to_date function since it expects a string input >>>>>>> while I have a timestamp. >>>>>>> Also doing is in java is not a solution since I want to do a group >>>>>>> by on the timestamp in a customized timezone. >>>>>>> >>>>>>> >>>>>>> >>>>>>> >>>>>>> On Tue, Feb 25, 2014 at 11:25 AM, James Taylor < >>>>>>> [email protected]> wrote: >>>>>>> >>>>>>>> Have you tried using the TO_DATE in conjunction with the TO_CHAR, >>>>>>>> where you specify a different timezone in the TO_DATE format_arg? >>>>>>>> >>>>>>>> Another option is to do this in Java. When you do a >>>>>>>> resultSet.getDate("MY_DATE_COL"), you can do whatever you want with the >>>>>>>> Date you get back. >>>>>>>> >>>>>>>> We're definitely open to taking contributions for new built-in >>>>>>>> functions. They're pretty easy to add. Just follow this guide: >>>>>>>> http://phoenix-hbase.blogspot.com/2013/04/how-to-add-your-own-built-in-function.html >>>>>>>> >>>>>>>> Adding more date manipulation functions would be much appreciated. >>>>>>>> >>>>>>>> Thanks, >>>>>>>> James >>>>>>>> >>>>>>>> >>>>>>>> On Tue, Feb 25, 2014 at 11:16 AM, Sean Huo <[email protected]>wrote: >>>>>>>> >>>>>>>>> Well, to be frankly, the example on the to_char udf is wrong >>>>>>>>> >>>>>>>>> TO_CHAR(myDate, '2001-02-03 04:05:06') >>>>>>>>> does not produce the right result and is misleading. >>>>>>>>> >>>>>>>>> This function does not give one the ability to format the date in >>>>>>>>> a customized timezone. >>>>>>>>> ALl it does is to allow timezone to be included in the output, but >>>>>>>>> it is is always GMT. >>>>>>>>> >>>>>>>>> >>>>>>>>> >>>>>>>>> >>>>>>>>> On Tue, Feb 25, 2014 at 11:04 AM, James Taylor < >>>>>>>>> [email protected]> wrote: >>>>>>>>> >>>>>>>>>> >>>>>>>>>> http://phoenix.incubator.apache.org/language/functions.html#/to_charwith >>>>>>>>>> a formatString argument. >>>>>>>>>> >>>>>>>>>> >>>>>>>>>> On Tue, Feb 25, 2014 at 10:58 AM, Sean Huo >>>>>>>>>> <[email protected]>wrote: >>>>>>>>>> >>>>>>>>>>> It seems that to_char udf always produces timestamp/date string >>>>>>>>>>> in GMT. >>>>>>>>>>> Is there a function that allows users to pass in a timezone >>>>>>>>>>> string so >>>>>>>>>>> that timestamp can be displayed accordingly? >>>>>>>>>>> >>>>>>>>>>> Thanks >>>>>>>>>>> >>>>>>>>>>> Sean >>>>>>>>>>> >>>>>>>>>> >>>>>>>>>> >>>>>>>>> >>>>>>>> >>>>>>> >>>>>> >>>>> >>>> >>> >> >
