Can someone please tell me whether this is doable? Is a generic UDF the
only option? What difference would a generic UDF make compared with a
simple UDF, since I would be returning the same object either way?

Thank you
Sunita

---------- Forwarded message ----------
From: *Sunita Arvind* <sunitarv...@gmail.com>
Date: Wednesday, January 29, 2014
Subject: Simple UDF to return array
To: "user@hive.apache.org" <user@hive.apache.org>


Hello Experts,

I am trying to write a UDF to parse a logline and provide the output in the
form of an array. Basically I want to be able to use LATERAL VIEW explode
subsequently to make it into columns.

This is how a typical log entry looks:

24-JUN-2012 05:00:42 *
(CONNECT_DATA=(SERVICE_NAME=abcd.efg.hij.com)(failover_mode=(type=select)(method=basic))(CID=(PROGRAM=sqlplus)(HOST=xyz)(USER=u1))(SERVER=dedicated)(INSTANCE_NAME=aaa))
* (ADDRESS=(PROTOCOL=tcp)(HOST=9.9.9.9)(PORT=60000)) * establish *
abcd.efg.hij.com * 0
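
As a rough sketch of the parsing step on its own (plain Java, no Hive
classes involved), assuming the top-level fields of an entry are separated
by an asterisk with whitespace on both sides, as in the entry above. The
class name LogLineSplitter and the method splitFields are hypothetical and
are not taken from the attached LogParser; the CONNECT_DATA field in main
is abbreviated for readability.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class LogLineSplitter {

    // Splits one log entry into its top-level fields.
    // Assumption: fields are delimited by "*" surrounded by whitespace
    // (spaces or line breaks), and no field contains such a delimiter.
    public static List<String> splitFields(String line) {
        if (line == null || line.trim().isEmpty()) {
            return new ArrayList<>();
        }
        // A fresh list is built on every call, so values from one row
        // never carry over into the next.
        return new ArrayList<>(Arrays.asList(line.trim().split("\\s+\\*\\s+")));
    }

    public static void main(String[] args) {
        String line = "24-JUN-2012 05:00:42 * "
                + "(CONNECT_DATA=(SERVICE_NAME=abcd.efg.hij.com)) * "
                + "(ADDRESS=(PROTOCOL=tcp)(HOST=9.9.9.9)(PORT=60000)) * "
                + "establish * abcd.efg.hij.com * 0";
        // Print one field per line.
        for (String field : splitFields(line)) {
            System.out.println(field);
        }
    }
}
```

The sketch only covers splitting the line; how the resulting list is
exposed to Hive as an array is a separate matter.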

Attached is my LogParser class which is basically the UDF. Excerpts below:

class LogParser extends UDF {
  int current_index = 0;

  ArrayList<String> record = new ArrayList<>();

  public ArrayList<String> evaluate(Text input) {
    ......
    String logdate = null;
    ...
    logdate = getDate(line);
    record.add(logdate);
    return record;


I've tried changing the return type to ArrayList<Text>, Object, etc. I just
get an error like this when I try to use the UDF:

select explode(strparse(record)) as newcols from logdump limit 1;

OK
converting to local hdfs://tlbd-ns/user/TestUser1/LogParserStrArr.jar
Added /tmp/3c583384-0592-41a3-ad9e-b12d2207df7b_resources/LogParserStrArr.jar to class path
Added resource: /tmp/3c583384-0592-41a3-ad9e-b12d2207df7b_resources/LogParserStrArr.jar
OK
FAILED: UDFArgumentException explode() takes an array or a map as a parameter

I tried casting the result to an array, and that fails as well.

Requesting help from the community. I am considering writing a generic UDF,
but this is a simple requirement and I would like to be able to use a
simple UDF if I can.

regards
Sunita

Attachment: LogParser.java