If there is any reason that the Apache log format cannot be changed
(for example, Hive is not the only consumer), you might want to try
the RegexSerDe that was added to Hive several days ago:
http://issues.apache.org/jira/browse/HIVE-662
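
To illustrate the general idea of regex-based parsing (one capture
group per column), here is a minimal sketch in Python. It is not the
RegexSerDe itself and says nothing about its configuration; the
pattern, the field names, and the sample line are assumptions made up
for illustration. It also shows the legacy combined format being
rewritten as tab-delimited fields, as discussed in the quoted message.

```python
import re

# Assumed pattern for the Apache "combined" log format.
# Each capture group corresponds to one output column.
COMBINED = re.compile(
    r'^(\S+) (\S+) (\S+) \[([^\]]+)\] "((?:[^"\\]|\\.)*)" (\d{3}) (\S+)'
    r' "((?:[^"\\]|\\.)*)" "((?:[^"\\]|\\.)*)"$'
)

# Hypothetical column names, chosen for this example only.
FIELDS = ("host", "identity", "user", "time", "request",
          "status", "size", "referer", "agent")


def parse_line(line):
    """Return a dict of fields, or None if the line does not match."""
    m = COMBINED.match(line.rstrip("\n"))
    return dict(zip(FIELDS, m.groups())) if m else None


def to_tsv(line):
    """Rewrite one combined-format line as tab-delimited fields."""
    rec = parse_line(line)
    if rec is None:
        return None
    return "\t".join(rec[f] for f in FIELDS)


# Made-up sample line in the combined format.
sample = ('127.0.0.1 - frank [10/Oct/2000:13:55:36 -0700] '
          '"GET /apache_pb.gif HTTP/1.0" 200 2326 '
          '"http://www.example.com/start.html" '
          '"Mozilla/4.08 [en] (Win98; I ;Nav)"')

if __name__ == "__main__":
    print(to_tsv(sample))
```

Lines that do not match the pattern come back as None, so they can be
counted or set aside rather than silently loaded as bad rows.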

Zheng

On Sat, Jul 25, 2009 at 6:37 AM, Saurabh Nanda<[email protected]> wrote:
>
>> Assuming you are only processing logs moving forward from such a config
>> change (and no other process is consuming them, and change mgmt not so hard,
>> ...) you can replace the legacy combined log format with something where
>> every field is well broken out and tab is the only necessary delimiter e.g.
>
> After I manage to load previous log files into the Hadoop cluster, that's
> precisely what I'm going to do -- change the log format to be tab delimited.
>
> Saurabh.
> --
> http://nandz.blogspot.com
> http://foodieforlife.blogspot.com
>



-- 
Yours,
Zheng
