Hi, I am trying to use this:
add jar ../build/contrib/hive_contrib.jar; CREATE TABLE apachelog ( host STRING, identity STRING, user STRING, time STRING, request STRING, status STRING, size STRING, referer STRING, agent STRING) ROW FORMAT SERDE 'org.apache.hadoop.hive.contrib.serde2.RegexSerDe' WITH SERDEPROPERTIES ( "input.regex" = "([^ ]*) ([^ ]*) ([^ ]*) (-|\\[[^\\]]*\\]) ([^ \"]*|\"[^\"]*\") (-|[0-9]*) (-|[0-9]*)(?: ([^ \"]*|\".*\") ([^ \"]*|\".*\"))?", "output.format.string" = "%1$s %2$s %3$s %4$s %5$s %6$s %7$s %8$s %9$s" ) STORED AS TEXTFILE; And it works great. My problem is how query the table with respect to the time column since it still has this format: [01/Mar/2011:13:01:10 -0700] So, I do not know how to execute these kinds of queries: 1. select time, count(*) from weblog where time < '01/Mar/2010'; 2. select hours, count(*) from weblog group by hours;