Hi,

I have complex log files (gzip-compressed ".gz", about 200 GB) on HDFS.

Log file format:
127.0.0.1 [2012Avg08] "a=abc&b=adf&c=aadfad"

I think the DDL would look something like this:
CREATE TABLE log_tb (ip STRING, dt STRING, kv Map<STRING, STRING>)
ROW FORMAT SERDE "??"
STORED AS SEQUENCEFILE;

I want to be able to run a query like this:
SELECT kv['b']
FROM log_tb
LIMIT 10;
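
To make the mapping I expect concrete, here is a minimal sketch in Python of how one line should split into ip, dt and kv. The regular expression and the name parse_line are only my own illustration, based on the single sample line above; they are not part of Hive or of any existing SerDe.

```python
import re

# Hypothetical pattern for the sample line: ip, [date], "key=value&key=value".
LINE_RE = re.compile(r'^(\S+) \[([^\]]*)\] "([^"]*)"$')


def parse_line(line):
    """Split one log line into (ip, dt, kv); return None if it does not match."""
    m = LINE_RE.match(line.strip())
    if m is None:
        return None
    ip, dt, query = m.groups()
    # Split "a=abc&b=adf" into {"a": "abc", "b": "adf"}; pairs without "=" are skipped.
    kv = dict(pair.split("=", 1) for pair in query.split("&") if "=" in pair)
    return ip, dt, kv


row = parse_line('127.0.0.1 [2012Avg08] "a=abc&b=adf&c=aadfad"')
print(row[2]["b"])  # the value I would expect SELECT kv['b'] to return
```

For the sample line this gives ip = 127.0.0.1, dt = 2012Avg08 and kv['b'] = adf, which is the behaviour I am looking for from the table.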


1) How do I parse a complex log file like this (gzip-compressed ".gz", 200 GB)?

2) If I need a SerDe, which SerDe should I use?

3) Is there an existing SerDe (input/output) written as a user-defined class that I could use?

4) If I want to use partitions with the log files, how do I write the DDL and
DML? Please give sample SQL (DDL, DML).


Thanks.
