It's the regex: it fails when it sees an invalid line such as (/***************). Any tips on what can be done to fix this?
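
To make the failure concrete, here is a minimal sketch in plain Java that runs the table's input.regex outside Hive. It assumes RegexSerDe requires the whole line to match and produces null otherwise, which is consistent with Parag's remark but has not been checked against the SerDe source. The class name RegexFallbackDemo, the parse helper and the LENIENT pattern are hypothetical names made up for this illustration. LENIENT is one possible workaround, not a confirmed fix: it makes the structured prefix optional so that banner lines still match and land entirely in the last column.

```java
import java.util.Arrays;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical demo class, not part of Hive.
class RegexFallbackDemo {
    // The input.regex from the table definition, copied as-is.
    static final Pattern STRICT = Pattern.compile(
        "^(\\d{4}(?>-\\d{2}){2})\\s((?>\\d{2}[:,]){3}\\d{3})\\s([A-Z]+)\\s([^:]+):\\s(.*)");

    // Assumed workaround: same five capture groups, but the date/time/level/class
    // prefix is wrapped in an optional non-capturing group, so any line matches.
    static final Pattern LENIENT = Pattern.compile(
        "^(?:(\\d{4}(?>-\\d{2}){2})\\s((?>\\d{2}[:,]){3}\\d{3})\\s([A-Z]+)\\s([^:]+):\\s)?(.*)$");

    // Returns the captured columns, or null when the whole line does not match
    // (the behaviour assumed for the SerDe in this sketch).
    static String[] parse(Pattern pattern, String line) {
        Matcher m = pattern.matcher(line);
        if (!m.matches()) {
            return null;
        }
        String[] cols = new String[m.groupCount()];
        for (int i = 0; i < cols.length; i++) {
            cols[i] = m.group(i + 1); // null for groups that did not participate
        }
        return cols;
    }

    public static void main(String[] args) {
        String good = "2010-02-25 14:27:18,000 INFO org.apache.hadoop.mapred.TaskTracker: SHUTDOWN_MSG:";
        String banner = "/************************************************************";

        System.out.println(Arrays.toString(parse(STRICT, good)));
        System.out.println(Arrays.toString(parse(STRICT, banner)));   // no match
        System.out.println(Arrays.toString(parse(LENIENT, good)));
        System.out.println(Arrays.toString(parse(LENIENT, banner)));  // only the last column is set
    }
}
```

If something like LENIENT were used as input.regex, the banner lines would come back with the first four columns NULL and the raw text in message. Whether this Hive version handles those NULL columns cleanly is untested here. Filtering the banner lines out before loading the data would be another option.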

----- Original Message -----
From: Parag Arora <[email protected]>
To: [email protected]
Sent: Fri, 30 Jul 2010 23:42:28 -0700 (PDT)
Subject: Re: Failed with exception java.io.IOException:java.lang.NullPointerException

It seems that your serde output must have been null.

On Sat, Jul 31, 2010 at 7:28 AM, Anurag Phadke <[email protected]> wrote:
> We are importing hadoop logs inside hive, but are running in some issues.
> Sample log lines:
> 2010-02-25 14:27:18,000 INFO org.apache.hadoop.mapred.TaskTracker:
> SHUTDOWN_MSG:
>
> Query: SELECT * FROM logs_temp;
> runs fine for the above statement.
>
> However, for the log lines:
> /************************************************************
> SHUTDOWN_MSG: Shutting down TaskTracker at
> cm-hadoop01.mozilla.org/10.2.72.53
> ************************************************************/
>
> Query: SELECT * FROM logs_temp;
> Failed with exception java.io.IOException:java.lang.NullPointerException
>
> However, SELECT count(1) FROM logs_temp;
> returns 3 rows, which is correct.
>
> Table structure given below:
> add jar /usr/lib/hive/lib/hive_contrib.jar;
> CREATE EXTERNAL TABLE logs_temp(
>   line_date STRING,
>   line_time STRING,
>   message_type STRING,
>   classname STRING,
>   message STRING
> )
>
> PARTITIONED BY (ds STRING, ts STRING, hn STRING)
>
> ROW FORMAT SERDE 'org.apache.hadoop.hive.contrib.serde2.RegexSerDe'
> WITH SERDEPROPERTIES (
>   "input.regex" =
>
> "^(\\d{4}(?>-\\d{2}){2})\\s((?>\\d{2}[:,]){3}\\d{3})\\s([A-Z]+)\\s([^:]+):\\s(.*)"
> )
> STORED AS TEXTFILE;
>
>
>
> Any idea on what might be going wrong here?
>
> -Anurag
>
>

--
Parag
http://www.paragarora.com
Phone: +91.8080350130
