So it's working, but I'm having a time zone issue. My servers are located in EST, but I need this data in PST.
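For reference, the conversion being asked about (the same epoch value rendered in Eastern vs. Pacific time, after dividing the raw millisecond value by 1000) can be sanity-checked outside Hive. This is only a Python sketch of the expected values, not a Hive solution; it assumes standard tz database zone names:

```python
from datetime import datetime, timezone
from zoneinfo import ZoneInfo  # Python 3.9+

ts_ms = 1270145333155  # raw value from the table, in milliseconds
ts = ts_ms // 1000     # Unix time is seconds, so drop the milliseconds

utc = datetime.fromtimestamp(ts, tz=timezone.utc)

# Same instant, three renderings:
print(utc.strftime("%Y-%m-%d %H:%M:%S"))  # 2010-04-01 18:08:53 (UTC)
print(utc.astimezone(ZoneInfo("America/New_York"))
         .strftime("%Y-%m-%d %H:%M:%S"))  # 2010-04-01 14:08:53 (what the EST server prints)
print(utc.astimezone(ZoneInfo("America/Los_Angeles"))
         .strftime("%Y-%m-%d %H:%M:%S"))  # 2010-04-01 11:08:53 (the desired PST output)
```

Note the daylight-saving handling: on 2010-04-01 the offsets are EDT (UTC-4) and PDT (UTC-7), which a fixed 3-hour arithmetic shift would only get right part of the year.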
So when it converts this:

hive> select from_unixtime(1270145333,'yyyy-MM-dd HH:mm:ss') from ut2;
Total MapReduce jobs = 1
Launching Job 1 out of 1
Number of reduce tasks is set to 0 since there's no reduce operator
Starting Job = job_201003031204_0102, Tracking URL = http://master:50030/jobdetails.jsp?jobid=job_201003031204_0102
Kill Command = /usr/local/hadoop/bin/../bin/hadoop job -Dmapred.job.tracker=master:54311 -kill job_201003031204_0102
2010-04-01 18:28:23,041 Stage-1 map = 0%, reduce = 0%
2010-04-01 18:28:37,315 Stage-1 map = 67%, reduce = 0%
2010-04-01 18:28:43,386 Stage-1 map = 100%, reduce = 0%
2010-04-01 18:28:46,412 Stage-1 map = 100%, reduce = 100%
Ended Job = job_201003031204_0102
OK
2010-04-01 14:08:53
Time taken: 30.191 seconds

I need it to be: 2010-04-01 11:08:53

I tried setting the variable in my .bash_profile (TZ=/ /Americas/) -- no go. Nothing in the Hive DDL link you sent is leading me in the right direction. Is there something you guys can recommend? I can write a script outside of Hive, but it would be great if users could handle this within their queries.

Thanks in advance!

/tom

On Thu, Apr 1, 2010 at 2:17 PM, tom kersnick <hiveu...@gmail.com> wrote:
> ok thanks....
>
> I should have caught that.
>
> /tom
>
> On Thu, Apr 1, 2010 at 2:13 PM, Carl Steinbach <c...@cloudera.com> wrote:
>> Hi Tom,
>>
>> "Unix Time" is defined as the number of *seconds* since January 1, 1970.
>> It looks like the data you have in cola is in milliseconds. You need to
>> divide this value by 1000 before calling from_unixtime() on the result.
>>
>> Thanks.
>>
>> Carl
>>
>> On Thu, Apr 1, 2010 at 2:02 PM, tom kersnick <hiveu...@gmail.com> wrote:
>>> Thanks, but there is something fishy going on.
>>>
>>> I'm using Hive 0.5.0 with Hadoop 0.20.1.
>>>
>>> I tried the column as both a bigint and a string.
>>> According to the Hive DDL:
>>>
>>> string
>>>
>>> from_unixtime(int unixtime)
>>>
>>> Converts the number of seconds from unix epoch (1970-01-01 00:00:00 UTC)
>>> to a string representing the timestamp of that moment in the current
>>> system time zone in the format of "1970-01-01 00:00:00"
>>>
>>> It looks like the input is int, which would be too small for my
>>> 1270145333155 timestamp.
>>>
>>> Any ideas?
>>>
>>> Example below:
>>>
>>> /tom
>>>
>>> hive> describe ut;
>>> OK
>>> cola	bigint
>>> colb	string
>>> Time taken: 0.101 seconds
>>>
>>> hive> select * from ut;
>>> OK
>>> 1270145333155	tuesday
>>> Time taken: 0.065 seconds
>>>
>>> hive> select from_unixtime(cola,'yyyy-MM-dd HH:mm:ss'),colb from ut;
>>> Total MapReduce jobs = 1
>>> Launching Job 1 out of 1
>>> Number of reduce tasks is set to 0 since there's no reduce operator
>>> Starting Job = job_201003031204_0083, Tracking URL = http://master:50030/jobdetails.jsp?jobid=job_201003031204_0083
>>> Kill Command = /usr/local/hadoop/bin/../bin/hadoop job -Dmapred.job.tracker=master:54311 -kill job_201003031204_0083
>>> 2010-04-01 16:57:32,407 Stage-1 map = 0%, reduce = 0%
>>> 2010-04-01 16:57:45,577 Stage-1 map = 100%, reduce = 0%
>>> 2010-04-01 16:57:48,605 Stage-1 map = 100%, reduce = 100%
>>> Ended Job = job_201003031204_0083
>>> OK
>>> 42219-04-22 00:05:55	tuesday
>>> Time taken: 18.066 seconds
>>>
>>> hive> describe ut;
>>> OK
>>> cola	string
>>> colb	string
>>> Time taken: 0.077 seconds
>>>
>>> hive> select * from ut;
>>> OK
>>> 1270145333155	tuesday
>>> Time taken: 0.065 seconds
>>>
>>> hive> select from_unixtime(cola,'yyyy-MM-dd HH:mm:ss'),colb from ut;
>>> FAILED: Error in semantic analysis: line 1:7 Function Argument Type
>>> Mismatch from_unixtime: Looking for UDF "from_unixtime" with parameters
>>> [class org.apache.hadoop.io.Text, class org.apache.hadoop.io.Text]
>>>
>>> On Thu, Apr 1, 2010 at 1:37 PM, Carl Steinbach <c...@cloudera.com> wrote:
>>>> Hi Tom,
>>>>
>>>> I think you want to use the from_unixtime UDF:
>>>>
>>>> hive> describe function extended from_unixtime;
>>>> describe function extended from_unixtime;
>>>> OK
>>>> from_unixtime(unix_time, format) - returns unix_time in the specified format
>>>> Example:
>>>>   > SELECT from_unixtime(0, 'yyyy-MM-dd HH:mm:ss') FROM src LIMIT 1;
>>>>   '1970-01-01 00:00:00'
>>>> Time taken: 0.647 seconds
>>>> hive>
>>>>
>>>> Thanks.
>>>>
>>>> Carl
>>>>
>>>> On Thu, Apr 1, 2010 at 1:11 PM, tom kersnick <hiveu...@gmail.com> wrote:
>>>>> hive> describe ut;
>>>>> OK
>>>>> time	bigint
>>>>> day	string
>>>>> Time taken: 0.128 seconds
>>>>>
>>>>> hive> select * from ut;
>>>>> OK
>>>>> 1270145333155	tuesday
>>>>> Time taken: 0.085 seconds
>>>>>
>>>>> When I run this simple query, I'm getting a NULL for the time column
>>>>> with data type bigint.
>>>>>
>>>>> hive> select unix_timestamp(time),day from ut;
>>>>> Total MapReduce jobs = 1
>>>>> Launching Job 1 out of 1
>>>>> Number of reduce tasks is set to 0 since there's no reduce operator
>>>>> Starting Job = job_201003031204_0080, Tracking URL = http://master:50030/jobdetails.jsp?jobid=job_201003031204_0080
>>>>> Kill Command = /usr/local/hadoop/bin/../bin/hadoop job -Dmapred.job.tracker=master:54311 -kill job_201003031204_0080
>>>>> 2010-04-01 16:03:54,024 Stage-1 map = 0%, reduce = 0%
>>>>> 2010-04-01 16:04:06,128 Stage-1 map = 100%, reduce = 0%
>>>>> 2010-04-01 16:04:09,150 Stage-1 map = 100%, reduce = 100%
>>>>> Ended Job = job_201003031204_0080
>>>>> OK
>>>>> NULL	tuesday
>>>>> Time taken: 16.981 seconds
>>>>>
>>>>> Any ideas?
>>>>>
>>>>> Thanks!
>>>>>
>>>>> /tom
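One note on the TZ attempt earlier in the thread: the tz database id for Pacific time is America/Los_Angeles ("America" singular, no surrounding slashes), and since from_unixtime renders in the JVM's default zone, the variable has to be in the environment before the Hive CLI starts; setting it in .bash_profile does nothing for an already-running session. Whether Hive 0.5 actually honors TZ this way is untested here, but the zone id itself can be checked with GNU date:

```shell
# Render the same epoch second under two zone ids; the format must be a
# tz database name such as America/Los_Angeles, not "/Americas/".
TZ=America/New_York    date -d @1270145333 '+%Y-%m-%d %H:%M:%S'   # 2010-04-01 14:08:53
TZ=America/Los_Angeles date -d @1270145333 '+%Y-%m-%d %H:%M:%S'   # 2010-04-01 11:08:53
```

If that works but Hive still prints Eastern time, the fallback is launching the CLI as `TZ=America/Los_Angeles hive` so the JVM picks the zone up at startup.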