[ 
https://issues.apache.org/jira/browse/PIG-4639?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Niels Basjes updated PIG-4639:
------------------------------
    Description: 
Currently there are two parsers for Apache HTTPD access log files in piggybank
that only allow parsing the 'combined' and 'common' logformats. These two also 
only parse the 'basics'.

This is a proposed patch to add the existing
https://github.com/nielsbasjes/logparser (Apache 2.0 license) as an 'out of the 
box' parser to piggybank. 
This parser parses the logfile using the LogFormat specification used to write
it. Almost all LogFormat specifiers are supported, which makes it easy to parse
(almost) all custom logformats used in production scenarios.
This parser also goes much deeper in the sense that it allows extracting things 
like the value of a cookie or the value of a query string parameter.

  was:
Currently there are two parsers for Apache Logfiles in piggybank that only 
allow parsing the 'combined' and 'common' logformats. These two also only parse 
the 'basics'.

This is a proposed patch to add the existing
https://github.com/nielsbasjes/logparser (Apache 2.0 license) as an 'out of the 
box' parser to piggybank that supports (almost) all LogFormat specifiers and as 
such adds parsing capabilities for (almost) all custom logformats used in 
production scenarios. 
This parser also goes much deeper in the sense that it allows extracting things 
like the value of a cookie or the value of a query string parameter.


> Add better parser for Apache HTTPD access log.
> ----------------------------------------------
>
>                 Key: PIG-4639
>                 URL: https://issues.apache.org/jira/browse/PIG-4639
>             Project: Pig
>          Issue Type: New Feature
>          Components: piggybank
>    Affects Versions: 0.15.0
>            Reporter: Niels Basjes
>            Assignee: Niels Basjes
>             Fix For: 0.16.0
>
>
> Currently there are two parsers for Apache HTTPD access log files in piggybank
> that only allow parsing the 'combined' and 'common' logformats. These two 
> also only parse the 'basics'.
> This is a proposed patch to add the existing
> https://github.com/nielsbasjes/logparser (Apache 2.0 license) as an 'out of 
> the box' parser to piggybank. 
> This parser parses the logfile using the LogFormat specification used to
> write it. Almost all LogFormat specifiers are supported, which makes it easy
> to parse (almost) all custom logformats used in production scenarios.
> This parser also goes much deeper in the sense that it allows extracting 
> things like the value of a cookie or the value of a query string parameter.
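
To illustrate the idea described above (deriving the parser from the LogFormat
string itself, then drilling into cookies and query string parameters), here is
a minimal Python sketch. It is NOT the code or API of the logparser project; the
names compile_logformat, parse_line and SIMPLE_TOKENS are hypothetical, only a
handful of specifiers are handled, and the LogFormat is written with plain
quotes rather than the escaped \" form used in an httpd configuration file.

```python
import re
from http.cookies import SimpleCookie
from urllib.parse import parse_qs, urlsplit

# Hypothetical mapping of a few LogFormat specifiers to (field name, regex).
# A real parser would cover far more specifiers than this sketch does.
SIMPLE_TOKENS = {
    "%h": ("host", r"\S+"),
    "%l": ("logname", r"\S+"),
    "%u": ("user", r"\S+"),
    "%t": ("time", r"\[[^\]]+\]"),
    "%r": ("request", r"[^\"]*"),
    "%>s": ("status", r"\d{3}"),
    "%b": ("bytes", r"\d+|-"),
}

# Matches %{Header}i as well as the simple specifiers listed above.
TOKEN_RE = re.compile(r"%\{([^}]+)\}i|%>s|%[hlutrb]")


def compile_logformat(logformat):
    """Turn a LogFormat string into a regex with one named group per field."""
    parts = []
    pos = 0
    for m in TOKEN_RE.finditer(logformat):
        # Literal text between specifiers must match exactly.
        parts.append(re.escape(logformat[pos:m.start()]))
        if m.group(1) is not None:
            name = "header_" + re.sub(r"\W", "_", m.group(1).lower())
            parts.append(f"(?P<{name}>[^\"]*)")
        else:
            name, rx = SIMPLE_TOKENS[m.group(0)]
            parts.append(f"(?P<{name}>{rx})")
        pos = m.end()
    parts.append(re.escape(logformat[pos:]))
    return re.compile("^" + "".join(parts) + "$")


def parse_line(regex, line):
    """Parse one log line; also split out query parameters and cookies."""
    m = regex.match(line)
    if m is None:
        return None
    fields = m.groupdict()

    # The request line is "METHOD URI PROTOCOL"; take the query from the URI.
    request_parts = (fields.get("request") or "").split(" ")
    uri = request_parts[1] if len(request_parts) == 3 else ""
    query = parse_qs(urlsplit(uri).query)
    fields["query"] = {k: v[0] for k, v in query.items()}

    # The Cookie request header holds "name=value; name=value" pairs.
    cookies = SimpleCookie()
    cookies.load(fields.get("header_cookie") or "")
    fields["cookies"] = {k: morsel.value for k, morsel in cookies.items()}
    return fields


LOGFORMAT = '%h %l %u %t "%r" %>s %b "%{Referer}i" "%{User-Agent}i" "%{Cookie}i"'
SAMPLE_LINE = (
    '127.0.0.1 - - [10/Oct/2000:13:55:36 -0700] '
    '"GET /search?q=pig&page=2 HTTP/1.1" 200 2326 '
    '"http://example.com/" "Mozilla/5.0" "session=abc123; lang=en"'
)

if __name__ == "__main__":
    record = parse_line(compile_logformat(LOGFORMAT), SAMPLE_LINE)
    print(record["host"], record["query"], record["cookies"])
```

Because the regex is generated from the format string, a custom LogFormat only
requires changing LOGFORMAT, not the parsing code. That is the property the
description above attributes to the proposed parser.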



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
