Ok thanks! I will try it out.....
/tom

On Thu, Apr 1, 2010 at 4:31 PM, Zheng Shao <[email protected]> wrote:
> Setting TZ in your .bash_profile won't work because the map/reduce tasks
> run on the Hadoop cluster.
> If you start your Hadoop tasktracker with that TZ setting, it will
> probably work.
>
> Zheng
>
> On Thu, Apr 1, 2010 at 3:32 PM, tom kersnick <[email protected]> wrote:
>
>> So it's working, but I'm having a time zone issue.
>>
>> My servers are located in EST, but I need this data in PST.
>>
>> So when it converts this:
>>
>> hive> select from_unixtime(1270145333,'yyyy-MM-dd HH:mm:ss') from ut2;
>> Total MapReduce jobs = 1
>> Launching Job 1 out of 1
>> Number of reduce tasks is set to 0 since there's no reduce operator
>> Starting Job = job_201003031204_0102, Tracking URL =
>> http://master:50030/jobdetails.jsp?jobid=job_201003031204_0102
>> Kill Command = /usr/local/hadoop/bin/../bin/hadoop job
>> -Dmapred.job.tracker=master:54311 -kill job_201003031204_0102
>> 2010-04-01 18:28:23,041 Stage-1 map = 0%, reduce = 0%
>> 2010-04-01 18:28:37,315 Stage-1 map = 67%, reduce = 0%
>> 2010-04-01 18:28:43,386 Stage-1 map = 100%, reduce = 0%
>> 2010-04-01 18:28:46,412 Stage-1 map = 100%, reduce = 100%
>> Ended Job = job_201003031204_0102
>> OK
>> 2010-04-01 14:08:53
>> Time taken: 30.191 seconds
>>
>> I need it to be:
>> 2010-04-01 11:08:53
>>
>> I tried setting the variable in my .bash_profile (TZ=/ /Americas/), but
>> no go.
>>
>> Nothing in the Hive DDL link you sent is leading me in the right
>> direction. Is there something you guys can recommend? I can write a
>> script outside of Hive, but it would be great if users could handle this
>> within their queries.
>>
>> Thanks in advance!
>>
>> /tom
>>
>> On Thu, Apr 1, 2010 at 2:17 PM, tom kersnick <[email protected]> wrote:
>>
>>> ok thanks....
>>>
>>> I should have caught that.
>>>
>>> /tom
>>>
>>> On Thu, Apr 1, 2010 at 2:13 PM, Carl Steinbach <[email protected]> wrote:
>>>
>>>> Hi Tom,
>>>>
>>>> "Unix time" is defined as the number of *seconds* since January 1,
>>>> 1970. It looks like the data you have in cola is in milliseconds. You
>>>> need to divide this value by 1000 before calling from_unixtime() on
>>>> the result.
>>>>
>>>> Thanks.
>>>>
>>>> Carl
>>>>
>>>> On Thu, Apr 1, 2010 at 2:02 PM, tom kersnick <[email protected]> wrote:
>>>>
>>>>> Thanks, but there is something fishy going on.
>>>>>
>>>>> I'm using Hive 0.5.0 with Hadoop 0.20.1.
>>>>>
>>>>> I tried the column as both a bigint and a string. According to the
>>>>> Hive DDL:
>>>>>
>>>>> string    from_unixtime(int unixtime)
>>>>>
>>>>> Converts the number of seconds from unix epoch (1970-01-01 00:00:00
>>>>> UTC) to a string representing the timestamp of that moment in the
>>>>> current system time zone in the format of "1970-01-01 00:00:00"
>>>>>
>>>>> It looks like the input is int, which would be too small for my
>>>>> 1270145333155 timestamp.
>>>>>
>>>>> Any ideas?
>>>>>
>>>>> Example below:
>>>>>
>>>>> /tom
>>>>>
>>>>> hive> describe ut;
>>>>> OK
>>>>> cola    bigint
>>>>> colb    string
>>>>> Time taken: 0.101 seconds
>>>>>
>>>>> hive> select * from ut;
>>>>> OK
>>>>> 1270145333155    tuesday
>>>>> Time taken: 0.065 seconds
>>>>>
>>>>> hive> select from_unixtime(cola,'yyyy-MM-dd HH:mm:ss'),colb from ut;
>>>>> Total MapReduce jobs = 1
>>>>> Launching Job 1 out of 1
>>>>> Number of reduce tasks is set to 0 since there's no reduce operator
>>>>> Starting Job = job_201003031204_0083, Tracking URL =
>>>>> http://master:50030/jobdetails.jsp?jobid=job_201003031204_0083
>>>>> Kill Command = /usr/local/hadoop/bin/../bin/hadoop job
>>>>> -Dmapred.job.tracker=master:54311 -kill job_201003031204_0083
>>>>> 2010-04-01 16:57:32,407 Stage-1 map = 0%, reduce = 0%
>>>>> 2010-04-01 16:57:45,577 Stage-1 map = 100%, reduce = 0%
>>>>> 2010-04-01 16:57:48,605 Stage-1 map = 100%, reduce = 100%
>>>>> Ended Job = job_201003031204_0083
>>>>> OK
>>>>> 42219-04-22 00:05:55    tuesday
>>>>> Time taken: 18.066 seconds
>>>>>
>>>>> hive> describe ut;
>>>>> OK
>>>>> cola    string
>>>>> colb    string
>>>>> Time taken: 0.077 seconds
>>>>>
>>>>> hive> select * from ut;
>>>>> OK
>>>>> 1270145333155    tuesday
>>>>> Time taken: 0.065 seconds
>>>>>
>>>>> hive> select from_unixtime(cola,'yyyy-MM-dd HH:mm:ss'),colb from ut;
>>>>> FAILED: Error in semantic analysis: line 1:7 Function Argument Type
>>>>> Mismatch from_unixtime: Looking for UDF "from_unixtime" with parameters
>>>>> [class org.apache.hadoop.io.Text, class org.apache.hadoop.io.Text]
>>>>>
>>>>> On Thu, Apr 1, 2010 at 1:37 PM, Carl Steinbach <[email protected]> wrote:
>>>>>
>>>>>> Hi Tom,
>>>>>>
>>>>>> I think you want to use the from_unixtime UDF:
>>>>>>
>>>>>> hive> describe function extended from_unixtime;
>>>>>> describe function extended from_unixtime;
>>>>>> OK
>>>>>> from_unixtime(unix_time, format) - returns unix_time in the specified
>>>>>> format
>>>>>> Example:
>>>>>>   > SELECT from_unixtime(0, 'yyyy-MM-dd HH:mm:ss') FROM src LIMIT 1;
>>>>>>   '1970-01-01 00:00:00'
>>>>>> Time taken: 0.647 seconds
>>>>>> hive>
>>>>>>
>>>>>> Thanks.
>>>>>>
>>>>>> Carl
>>>>>>
>>>>>> On Thu, Apr 1, 2010 at 1:11 PM, tom kersnick <[email protected]> wrote:
>>>>>>
>>>>>>> hive> describe ut;
>>>>>>> OK
>>>>>>> time    bigint
>>>>>>> day     string
>>>>>>> Time taken: 0.128 seconds
>>>>>>> hive> select * from ut;
>>>>>>> OK
>>>>>>> 1270145333155    tuesday
>>>>>>> Time taken: 0.085 seconds
>>>>>>>
>>>>>>> When I run this simple query, I'm getting a NULL for the time column
>>>>>>> with data type bigint.
>>>>>>>
>>>>>>> hive> select unix_timestamp(time),day from ut;
>>>>>>> Total MapReduce jobs = 1
>>>>>>> Launching Job 1 out of 1
>>>>>>> Number of reduce tasks is set to 0 since there's no reduce operator
>>>>>>> Starting Job = job_201003031204_0080, Tracking URL =
>>>>>>> http://master:50030/jobdetails.jsp?jobid=job_201003031204_0080
>>>>>>> Kill Command = /usr/local/hadoop/bin/../bin/hadoop job
>>>>>>> -Dmapred.job.tracker=master:54311 -kill job_201003031204_0080
>>>>>>> 2010-04-01 16:03:54,024 Stage-1 map = 0%, reduce = 0%
>>>>>>> 2010-04-01 16:04:06,128 Stage-1 map = 100%, reduce = 0%
>>>>>>> 2010-04-01 16:04:09,150 Stage-1 map = 100%, reduce = 100%
>>>>>>> Ended Job = job_201003031204_0080
>>>>>>> OK
>>>>>>> NULL    tuesday
>>>>>>> Time taken: 16.981 seconds
>>>>>>>
>>>>>>> Any ideas?
>>>>>>>
>>>>>>> Thanks!
>>>>>>>
>>>>>>> /tom
>
>
> --
> Yours,
> Zheng
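
A minimal sketch of the fix Carl describes above, using the table and column
names from the thread: cola holds milliseconds since the epoch, while
from_unixtime() expects whole seconds, so divide by 1000 and truncate to an
integer type before formatting.

-- cola is a bigint holding milliseconds since the epoch;
-- from_unixtime() expects seconds, so divide by 1000 and truncate first
select from_unixtime(cast(cola / 1000 as int), 'yyyy-MM-dd HH:mm:ss'), colb
from ut;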

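For the time zone question, until the tasktracker is restarted with TZ set as
Zheng suggests, one crude in-query workaround is to subtract the three-hour
EST-to-PST offset by hand. This is only a sketch: it hard-codes the offset
between the cluster's Eastern zone and Pacific, so it is no substitute for
fixing the cluster's time zone.

-- shift the Eastern-formatted result back 3 hours (10800 seconds) to Pacific;
-- a hard-coded offset like this is fragile around daylight-saving changes
select from_unixtime(cast(cola / 1000 as int) - 3 * 3600, 'yyyy-MM-dd HH:mm:ss'), colb
from ut;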