Re: unix_timestamp function

Carl Steinbach Thu, 01 Apr 2010 14:14:06 -0700

Hi Tom,

"Unix Time" is defined as the number of *seconds* since January 1, 1970. It
looks like the data you have in cola is in milliseconds. You need to divide
this value by 1000 before calling from_unixtime() on the result.


Thanks.

Carl


On Thu, Apr 1, 2010 at 2:02 PM, tom kersnick <[email protected]> wrote:

> Thanks, but there is something fishy going on.
>
> Im using hive 0.5.0 with hadoop 0.20.1
>
> I tried the column as both a bigint and a string.  According the hive ddl:
>
> string
>
> from_unixtime(int unixtime)
>
> Converts the number of seconds from unix epoch (1970-01-01 00:00:00 UTC) to
> a string representing the timestamp of that moment in the current system
> time zone in the format of "1970-01-01 00:00:00"
>
> It looks like the input is int,  that would be too small for my
> 1270145333155 timestamp.
>
> Any ideas?
>
> Example below:
>
> /tom
>
>
> hive> describe ut;
> OK
> cola    bigint
> colb    string
> Time taken: 0.101 seconds
>
>
> hive> select * from ut;
> OK
> 1270145333155    tuesday
> Time taken: 0.065 seconds
>
> hive> select from_unixtime(cola,'yyyy-MM-dd HH:mm:ss'),colb from ut;
>
> Total MapReduce jobs = 1
> Launching Job 1 out of 1
> Number of reduce tasks is set to 0 since there's no reduce operator
> Starting Job = job_201003031204_0083, Tracking URL =
> http://master:50030/jobdetails.jsp?jobid=job_201003031204_0083
> Kill Command = /usr/local/hadoop/bin/../bin/hadoop job
> -Dmapred.job.tracker=master:54311 -kill job_201003031204_0083
> 2010-04-01 16:57:32,407 Stage-1 map = 0%,  reduce = 0%
> 2010-04-01 16:57:45,577 Stage-1 map = 100%,  reduce = 0%
> 2010-04-01 16:57:48,605 Stage-1 map = 100%,  reduce = 100%
> Ended Job = job_201003031204_0083
> OK
> 42219-04-22 00:05:55    tuesday
> Time taken: 18.066 seconds
>
>
> hive> describe ut;
> OK
> cola    string
> colb    string
> Time taken: 0.077 seconds
>
>
> hive> select * from ut;
> OK
> 1270145333155    tuesday
> Time taken: 0.065 seconds
>
>
> hive> select from_unixtime(cola,'yyyy-MM-dd HH:mm:ss'),colb from ut;
> FAILED: Error in semantic analysis: line 1:7 Function Argument Type
> Mismatch from_unixtime: Looking for UDF "from_unixtime" with parameters
> [class org.apache.hadoop.io.Text, class org.apache.hadoop.io.Text]
>
>
>
>
>
> On Thu, Apr 1, 2010 at 1:37 PM, Carl Steinbach <[email protected]> wrote:
>
>> Hi Tom,
>>
>> I think you want to use the from_unixtime UDF:
>>
>> hive> describe function extended from_unixtime;
>> describe function extended from_unixtime;
>> OK
>> from_unixtime(unix_time, format) - returns unix_time in the specified
>> format
>> Example:
>>   > SELECT from_unixtime(0, 'yyyy-MM-dd HH:mm:ss') FROM src LIMIT 1;
>>   '1970-01-01 00:00:00'
>> Time taken: 0.647 seconds
>> hive>
>>
>> Thanks.
>>
>> Carl
>>
>> On Thu, Apr 1, 2010 at 1:11 PM, tom kersnick <[email protected]> wrote:
>>
>>> hive> describe ut;
>>> OK
>>> time    bigint
>>> day    string
>>> Time taken: 0.128 seconds
>>> hive> select * from ut;
>>> OK
>>> 1270145333155    tuesday
>>> Time taken: 0.085 seconds
>>>
>>> When I run this simple query, I'm getting a NULL for the time column with
>>> data type bigint.
>>>
>>> hive> select unix_timestamp(time),day from ut;
>>> Total MapReduce jobs = 1
>>> Launching Job 1 out of 1
>>> Number of reduce tasks is set to 0 since there's no reduce operator
>>> Starting Job = job_201003031204_0080, Tracking URL =
>>> http://master:50030/jobdetails.jsp?jobid=job_201003031204_0080
>>> Kill Command = /usr/local/hadoop/bin/../bin/hadoop job
>>> -Dmapred.job.tracker=master:54311 -kill job_201003031204_0080
>>> 2010-04-01 16:03:54,024 Stage-1 map = 0%,  reduce = 0%
>>> 2010-04-01 16:04:06,128 Stage-1 map = 100%,  reduce = 0%
>>> 2010-04-01 16:04:09,150 Stage-1 map = 100%,  reduce = 100%
>>> Ended Job = job_201003031204_0080
>>> OK
>>> NULL    tuesday
>>> Time taken: 16.981 seconds
>>>
>>> Any ideas?
>>>
>>> Thanks!
>>>
>>> /tom
>>>
>>
>>
>

Re: unix_timestamp function

Reply via email to