So it's working, but I'm having a time zone issue. My servers are located in EST, but I need this data in PST.
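For reference, the conversion being asked about (the same epoch value rendered in Eastern vs. Pacific time, after dividing the raw millisecond value by 1000) can be sanity-checked outside Hive. This is only a Python sketch of the expected values, not a Hive solution; it assumes standard tz database zone names:

```python
from datetime import datetime, timezone
from zoneinfo import ZoneInfo  # Python 3.9+

ts_ms = 1270145333155  # raw value from the table, in milliseconds
ts = ts_ms // 1000     # Unix time is seconds, so drop the milliseconds

utc = datetime.fromtimestamp(ts, tz=timezone.utc)

# Same instant, three renderings:
print(utc.strftime("%Y-%m-%d %H:%M:%S"))  # 2010-04-01 18:08:53 (UTC)
print(utc.astimezone(ZoneInfo("America/New_York"))
         .strftime("%Y-%m-%d %H:%M:%S"))  # 2010-04-01 14:08:53 (what the EST server prints)
print(utc.astimezone(ZoneInfo("America/Los_Angeles"))
         .strftime("%Y-%m-%d %H:%M:%S"))  # 2010-04-01 11:08:53 (the desired PST output)
```

Note the daylight-saving handling: on 2010-04-01 the offsets are EDT (UTC-4) and PDT (UTC-7), which a fixed 3-hour arithmetic shift would only get right part of the year.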
So when it converts this:

hive> select from_unixtime(1270145333,'yyyy-MM-dd HH:mm:ss') from ut2;
Total MapReduce jobs = 1
Launching Job 1 out of 1
Number of reduce tasks is set to 0 since there's no reduce operator
Starting Job = job_201003031204_0102, Tracking URL = http://master:50030/jobdetails.jsp?jobid=job_201003031204_0102
Kill Command = /usr/local/hadoop/bin/../bin/hadoop job -Dmapred.job.tracker=master:54311 -kill job_201003031204_0102
2010-04-01 18:28:23,041 Stage-1 map = 0%, reduce = 0%
2010-04-01 18:28:37,315 Stage-1 map = 67%, reduce = 0%
2010-04-01 18:28:43,386 Stage-1 map = 100%, reduce = 0%
2010-04-01 18:28:46,412 Stage-1 map = 100%, reduce = 100%
Ended Job = job_201003031204_0102
OK
2010-04-01 14:08:53
Time taken: 30.191 seconds

I need it to be: 2010-04-01 11:08:53

I tried setting the variable in my .bash_profile (TZ=/ /Americas/) -- no go. Nothing in the Hive DDL link you sent is leading me in the right direction. Is there something you guys can recommend? I can write a script outside of Hive, but it would be great if users could handle this within their queries.

Thanks in advance!

/tom

On Thu, Apr 1, 2010 at 2:17 PM, tom kersnick <hiveu...@gmail.com> wrote:
> ok thanks....
>
> I should have caught that.
>
> /tom
>
> On Thu, Apr 1, 2010 at 2:13 PM, Carl Steinbach <c...@cloudera.com> wrote:
>> Hi Tom,
>>
>> "Unix Time" is defined as the number of *seconds* since January 1, 1970.
>> It looks like the data you have in cola is in milliseconds. You need to
>> divide this value by 1000 before calling from_unixtime() on the result.
>>
>> Thanks.
>>
>> Carl
>>
>> On Thu, Apr 1, 2010 at 2:02 PM, tom kersnick <hiveu...@gmail.com> wrote:
>>> Thanks, but there is something fishy going on.
>>>
>>> I'm using Hive 0.5.0 with Hadoop 0.20.1.
>>>
>>> I tried the column as both a bigint and a string.
>>> According to the Hive DDL:
>>>
>>> string
>>>
>>> from_unixtime(int unixtime)
>>>
>>> Converts the number of seconds from unix epoch (1970-01-01 00:00:00 UTC)
>>> to a string representing the timestamp of that moment in the current
>>> system time zone in the format of "1970-01-01 00:00:00"
>>>
>>> It looks like the input is int, which would be too small for my
>>> 1270145333155 timestamp.
>>>
>>> Any ideas?
>>>
>>> Example below:
>>>
>>> /tom
>>>
>>> hive> describe ut;
>>> OK
>>> cola	bigint
>>> colb	string
>>> Time taken: 0.101 seconds
>>>
>>> hive> select * from ut;
>>> OK
>>> 1270145333155	tuesday
>>> Time taken: 0.065 seconds
>>>
>>> hive> select from_unixtime(cola,'yyyy-MM-dd HH:mm:ss'),colb from ut;
>>> Total MapReduce jobs = 1
>>> Launching Job 1 out of 1
>>> Number of reduce tasks is set to 0 since there's no reduce operator
>>> Starting Job = job_201003031204_0083, Tracking URL = http://master:50030/jobdetails.jsp?jobid=job_201003031204_0083
>>> Kill Command = /usr/local/hadoop/bin/../bin/hadoop job -Dmapred.job.tracker=master:54311 -kill job_201003031204_0083
>>> 2010-04-01 16:57:32,407 Stage-1 map = 0%, reduce = 0%
>>> 2010-04-01 16:57:45,577 Stage-1 map = 100%, reduce = 0%
>>> 2010-04-01 16:57:48,605 Stage-1 map = 100%, reduce = 100%
>>> Ended Job = job_201003031204_0083
>>> OK
>>> 42219-04-22 00:05:55	tuesday
>>> Time taken: 18.066 seconds
>>>
>>> hive> describe ut;
>>> OK
>>> cola	string
>>> colb	string
>>> Time taken: 0.077 seconds
>>>
>>> hive> select * from ut;
>>> OK
>>> 1270145333155	tuesday
>>> Time taken: 0.065 seconds
>>>
>>> hive> select from_unixtime(cola,'yyyy-MM-dd HH:mm:ss'),colb from ut;
>>> FAILED: Error in semantic analysis: line 1:7 Function Argument Type
>>> Mismatch from_unixtime: Looking for UDF "from_unixtime" with parameters
>>> [class org.apache.hadoop.io.Text, class org.apache.hadoop.io.Text]
>>>
>>> On Thu, Apr 1, 2010 at 1:37 PM, Carl Steinbach <c...@cloudera.com> wrote:
>>>> Hi Tom,
>>>>
>>>> I think you want to use the from_unixtime UDF:
>>>>
>>>> hive> describe function extended from_unixtime;
>>>> describe function extended from_unixtime;
>>>> OK
>>>> from_unixtime(unix_time, format) - returns unix_time in the specified format
>>>> Example:
>>>>   > SELECT from_unixtime(0, 'yyyy-MM-dd HH:mm:ss') FROM src LIMIT 1;
>>>>   '1970-01-01 00:00:00'
>>>> Time taken: 0.647 seconds
>>>> hive>
>>>>
>>>> Thanks.
>>>>
>>>> Carl
>>>>
>>>> On Thu, Apr 1, 2010 at 1:11 PM, tom kersnick <hiveu...@gmail.com> wrote:
>>>>> hive> describe ut;
>>>>> OK
>>>>> time	bigint
>>>>> day	string
>>>>> Time taken: 0.128 seconds
>>>>>
>>>>> hive> select * from ut;
>>>>> OK
>>>>> 1270145333155	tuesday
>>>>> Time taken: 0.085 seconds
>>>>>
>>>>> When I run this simple query, I'm getting a NULL for the time column
>>>>> with data type bigint.
>>>>>
>>>>> hive> select unix_timestamp(time),day from ut;
>>>>> Total MapReduce jobs = 1
>>>>> Launching Job 1 out of 1
>>>>> Number of reduce tasks is set to 0 since there's no reduce operator
>>>>> Starting Job = job_201003031204_0080, Tracking URL = http://master:50030/jobdetails.jsp?jobid=job_201003031204_0080
>>>>> Kill Command = /usr/local/hadoop/bin/../bin/hadoop job -Dmapred.job.tracker=master:54311 -kill job_201003031204_0080
>>>>> 2010-04-01 16:03:54,024 Stage-1 map = 0%, reduce = 0%
>>>>> 2010-04-01 16:04:06,128 Stage-1 map = 100%, reduce = 0%
>>>>> 2010-04-01 16:04:09,150 Stage-1 map = 100%, reduce = 100%
>>>>> Ended Job = job_201003031204_0080
>>>>> OK
>>>>> NULL	tuesday
>>>>> Time taken: 16.981 seconds
>>>>>
>>>>> Any ideas?
>>>>>
>>>>> Thanks!
>>>>>
>>>>> /tom
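One note on the TZ attempt earlier in the thread: the tz database id for Pacific time is America/Los_Angeles ("America" singular, no surrounding slashes), and since from_unixtime renders in the JVM's default zone, the variable has to be in the environment before the Hive CLI starts; setting it in .bash_profile does nothing for an already-running session. Whether Hive 0.5 actually honors TZ this way is untested here, but the zone id itself can be checked with GNU date:

```shell
# Render the same epoch second under two zone ids; the format must be a
# tz database name such as America/Los_Angeles, not "/Americas/".
TZ=America/New_York    date -d @1270145333 '+%Y-%m-%d %H:%M:%S'   # 2010-04-01 14:08:53
TZ=America/Los_Angeles date -d @1270145333 '+%Y-%m-%d %H:%M:%S'   # 2010-04-01 11:08:53
```

If that works but Hive still prints Eastern time, the fallback is launching the CLI as `TZ=America/Los_Angeles hive` so the JVM picks the zone up at startup.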