Hi, I am using the following statement to load an Apache log into Hadoop via Hive (my version is 0.4.1).

CREATE TABLE apache_log (
  ...
  logdate STRING,
  ...
ROW FORMAT SERDE 'org.apache.hadoop.hive.contrib.serde2.RegexSerDe'
WITH SERDEPROPERTIES (
  "input.regex" = "([^ ]*) ([^ ]*) ([^ ]*) \\[(\\w+\/\\w+\/\\w+)\:(\\d+:\\d+:\\d+) ...
...

The date comes in this format: dd/mmm/yyyy. I would like to load the data using this date format instead: yyyy-mmm-dd.

1. Has anyone loaded the date in a different format before?
2. How do you specify in the CREATE TABLE statement above that the table is partitioned by logdate?
3. When I tried to convert the old date to Unix time with the following query, Hive complained:

hive> select from_unixtime( unix_timestamp( logdate, 'dd/MMM/yyyy')) from apache_log;
FAILED: Error in semantic analysis: line 1:7 Function Argument Type Mismatch from_unixtime: Looking for UDF "from_unixtime" with parameters [class org.apache.hadoop.io.LongWritable]

Has anyone encountered these issues before?

Thanks.
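
P.S. To make the conversion I am after concrete, here is a minimal sketch in plain Java, outside Hive. It assumes that the pattern letters Hive accepts in unix_timestamp are those of java.text.SimpleDateFormat, which is my understanding. The class and method names are hypothetical, and the target pattern yyyy-MMM-dd is my reading of the yyyy-mmm-dd format mentioned above (uppercase M for month, since lowercase m means minutes in SimpleDateFormat).

```java
import java.text.ParseException;
import java.text.SimpleDateFormat;
import java.util.Date;
import java.util.Locale;

// Hypothetical helper, not part of Hive: shows the dd/MMM/yyyy -> yyyy-MMM-dd
// rewrite using the same pattern letters passed to unix_timestamp above.
final class LogDateReformat {

    // Parses a date such as "25/Dec/2009" and re-renders it as "2009-Dec-25".
    static String reformat(String logDate) {
        // Locale.ENGLISH so that month abbreviations like "Dec" parse
        // regardless of the machine's default locale.
        SimpleDateFormat in = new SimpleDateFormat("dd/MMM/yyyy", Locale.ENGLISH);
        SimpleDateFormat out = new SimpleDateFormat("yyyy-MMM-dd", Locale.ENGLISH);
        in.setLenient(false); // reject impossible dates such as 32/Dec/2009
        try {
            Date parsed = in.parse(logDate);
            return out.format(parsed);
        } catch (ParseException e) {
            throw new IllegalArgumentException("Not a dd/MMM/yyyy date: " + logDate, e);
        }
    }

    public static void main(String[] args) {
        System.out.println(reformat("25/Dec/2009"));
    }
}
```

Running main with the sample value prints 2009-Dec-25. What I am looking for is the equivalent of this inside Hive, at load or query time.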