Hi All, I am trying to load database listener logs into hive tables. I am using Regex Serde from https://repository.cloudera.com/artifactory/public/org/apache/hive/hive-contrib/0.10.0-cdh4.2.0-SNAPSHOT/hive-contrib-0.10.0-cdh4.2.0.jar
Below is my create table: CREATE external TABLE ListenerLog_Regex ( logdate STRING, time STRING, service_name STRING, action STRING, method STRING, cid_program STRING, cid_host STRING,k cid_user STRING, server STRING, instance_name STRING, address_protocol STRING, address_host STRING, address_port STRING, status STRING, established_from STRING, return_code STRING) ROW FORMAT SERDE 'org.apache.hadoop.hive.contrib.serde2.RegexSerDe' WITH SERDEPROPERTIES ( "input.regex" = "(\\w+)\\s(\\d+)\\s\\(CONNECT_DATA=\\(SERVICE_NAME=(\\w+\\))\\s\\(failover_mode=\\(type=(\\w+\\))\\s(\\(method=\\w+\\)\\))\\s\\(CID=\\(PROGRAM=(\\w+\\))\\s\\(HOST=(\\w+\\))\\s\\(USER=(\\w+\\)\\))\\s\\(SERVER=(\\w+\\))\s\\(INSTANCE_NAME=(\\w+\\)\\))\s\*\s\\(ADDRESS=\\(PROTOCOL=(\\w+\\))\s\\(HOST=(\\w+\\))\\s\\(PORT=(\\w+\\)\\))\\s\*\\s(\\w+)\\s\*\\s(\\w+)\\s\*\\s(\d)", "output.format.string" = "%1$s %2$s %3$s %4$s %5$s %6$s %7$s %8$s %9$s %10$s %11$s %12$s %13$s %14$s %15$s %16s" ) LOCATION '/user/hive/warehouse/ListenerLogs'; Attached is the sample listener log. I am able to create the table, however, when I query the table, I get all nulls as the output. Clearly its a parsing error. I tried tweaking the input.regex quite a lot but same result, but may be its still not right. Please help. FYR.. Outputs below: hive> select logdate from listenerlog_regex > ; Total MapReduce jobs = 1 Launching Job 1 out of 1 Number of reduce tasks is set to 0 since there's no reduce operator Starting Job = job_201311071517_0059, Tracking URL = http://cloudera1.tms.toyota.com:50030/jobdetails.jsp?jobid=job_201311071517_0059 Kill Command = /usr/lib/hadoop/bin/hadoop job -kill job_201311071517_0059 Hadoop job information for Stage-1: number of mappers: 1; number of reducers: 0 2013-11-14 12:31:45,587 Stage-1 map = 0%, reduce = 0% 2013-11-14 12:32:23,175 Stage-1 map = 100%, reduce = 0%, Cumulative CPU 9.49 sec 2013-11-14 12:32:24,247 Stage-1 map = 100%, reduce = 0%, Cumulative CPU 9.49 sec 2013-11-14 12:32:25,303 Stage-1 map = 100%, reduce = 0%, Cumulative CPU 9.49 sec 2013-11-14 12:32:26,376 Stage-1 map = 100%, reduce = 0%, Cumulative CPU 9.49 sec 2013-11-14 12:32:27,429 Stage-1 map = 100%, reduce = 0%, Cumulative CPU 9.49 sec 2013-11-14 12:32:28,483 Stage-1 map = 100%, reduce = 0%, Cumulative CPU 9.49 sec 2013-11-14 12:32:29,560 Stage-1 map = 100%, reduce = 0%, Cumulative CPU 9.49 sec 2013-11-14 12:32:30,762 Stage-1 map = 100%, reduce = 0%, Cumulative CPU 9.49 sec 2013-11-14 12:32:31,829 Stage-1 map = 100%, reduce = 0%, Cumulative CPU 9.49 sec 2013-11-14 12:32:32,896 Stage-1 map = 100%, reduce = 0%, Cumulative CPU 9.49 sec 2013-11-14 12:32:33,929 Stage-1 map = 100%, reduce = 0%, Cumulative CPU 9.49 sec 2013-11-14 12:32:34,970 Stage-1 map = 100%, reduce = 0%, Cumulative CPU 9.49 sec 2013-11-14 12:32:36,020 Stage-1 map = 100%, reduce = 0%, Cumulative CPU 9.49 sec 2013-11-14 12:32:37,087 Stage-1 map = 100%, reduce = 0%, Cumulative CPU 9.49 sec 2013-11-14 12:32:38,120 Stage-1 map = 100%, reduce = 0%, Cumulative CPU 9.49 sec 2013-11-14 12:32:39,185 Stage-1 map = 100%, reduce = 0%, Cumulative CPU 9.49 sec 2013-11-14 12:32:40,232 Stage-1 map = 100%, reduce = 0%, Cumulative CPU 9.49 sec 2013-11-14 12:32:41,267 Stage-1 map = 100%, reduce = 0%, Cumulative CPU 9.49 sec 2013-11-14 12:32:42,336 Stage-1 map = 100%, reduce = 0%, Cumulative CPU 9.49 sec 2013-11-14 12:32:43,395 Stage-1 map = 100%, reduce = 0%, Cumulative CPU 9.49 sec 2013-11-14 12:32:44,439 Stage-1 map = 100%, reduce = 0%, Cumulative CPU 9.49 sec 2013-11-14 12:32:45,490 Stage-1 map = 100%, reduce = 0%, Cumulative CPU 9.49 sec 2013-11-14 12:32:46,527 Stage-1 map = 100%, reduce = 0%, Cumulative CPU 9.49 sec 2013-11-14 12:32:47,580 Stage-1 map = 100%, reduce = 100%, Cumulative CPU 9.49 sec MapReduce Total cumulative CPU time: 9 seconds 490 msec Ended Job = job_201311071517_0059 MapReduce Jobs Launched: Job 0: Map: 1 Cumulative CPU: 9.49 sec HDFS Read: 1001 HDFS Write: 9 SUCCESS Total MapReduce CPU Time Spent: 9 seconds 490 msec OK NULL NULL NULL Time taken: 78.955 seconds hive> select * from listenerlog_regex; OK NULL NULL NULL NULL NULL NULL NULL NULL NULL NULL NULL NULL NULL NULL NULL NULL NULL NULL NULL NULL NULL NULL NULL NULL NULL NULL NULL NULL NULL NULL NULL NULL NULL NULL NULL NULL NULL NULL NULL NULL NULL NULL NULL NULL NULL NULL NULL NULL Time taken: 0.419 seconds
sample.log
Description: Binary data