I had a similar problem, though my logs were terminated with a carriage return. Many of the fields in my logs are delimited with a space. We tried using \s, but that basically removed every instance of the letter s (yeah, I thought that was amusing too). In some cases we were able to use \\t, but that didn't work very well with our logs. We are now using the regex SerDe with a hand-built regex as the delimiter to make it work. So far so good. Perhaps this is where you need to go. I'm still learning how it works myself. Exciting stuff!!
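
For what it's worth, here is a minimal sketch of what a RegexSerDe table for space-separated fields might look like. It is not our actual setup: the table name logdata_regex, the catch-all column rest, and the jar path are made up for illustration, and the SerDe class name shown is the one from the contrib jar, so it may differ in your Hive version. The regex uses the character class [^ ] instead of \s, which sidesteps the backslash escaping trouble described above.

```sql
-- Placeholder path: point this at the hive-contrib jar of your own install.
ADD JAR /path/to/hive-contrib.jar;

-- Hypothetical table; RegexSerDe expects every column to be STRING.
CREATE EXTERNAL TABLE logdata_regex (
    xxx  STRING,
    yyy  STRING,
    rest STRING   -- illustrative catch-all for the remainder of the line
)
ROW FORMAT SERDE 'org.apache.hadoop.hive.contrib.serde2.RegexSerDe'
WITH SERDEPROPERTIES (
    -- One capture group per column; the pattern has to match the whole line.
    "input.regex" = "([^ ]*) ([^ ]*) (.*)"
)
STORED AS TEXTFILE;
```

As far as I understand it, lines that do not match the regex come back as NULLs in every column, so a quick SELECT with a LIMIT is a good way to check the pattern before loading everything.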


On 04/04/2011 03:50 AM, Bjørn Remseth wrote:
Hi guys

I'm having a problem:  I'm reading a file where fields are terminated
by space (' ', ascii 32) into a table.  I'm not making these files
so I can't easily change this use of ' ' as field separator.

DROP TABLE logdata;

CREATE EXTERNAL TABLE logdata(
       xxx STRING,
       yyy STRING,
       ...
       z_t)
   ROW FORMAT DELIMITED
   FIELDS TERMINATED BY ' '
   STORED AS TEXTFILE;

LOAD DATA LOCAL INPATH '/somewhere/over/the/rainbow.dta' OVERWRITE INTO
TABLE logdata;


This fails: All the data is read into the first field (xxx).  If I
change the field separator to something else, e.g. "," things work
normally and I get to read the fields into their proper places
in the record, but then I have to edit the datafiles first and I don't
really want to do that.

Do you know how I can most easily read my logfiles?

Bjørn



