[ https://issues.apache.org/jira/browse/PIG-4639?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14639486#comment-14639486 ]
Niels Basjes commented on PIG-4639:
-----------------------------------
The unit tests in the patch I'm working on require the patch I attached to
PIG-4405 to be present.
> Add better parser for Apache HTTPD access log.
> ----------------------------------------------
>
> Key: PIG-4639
> URL: https://issues.apache.org/jira/browse/PIG-4639
> Project: Pig
> Issue Type: Improvement
> Components: piggybank
> Affects Versions: 0.15.0
> Reporter: Niels Basjes
> Assignee: Niels Basjes
> Fix For: 0.16.0
>
>
> Currently there are two parsers for Apache Logfiles in piggybank that only
> allow parsing the 'combined' and 'common' logformats. These two also only
> parse the 'basics'.
> This is a proposed patch to add the existing
> https://github.com/nielsbasjes/logparser (Apache 2.0 license) as an 'out of
> the box' parser to piggybank that supports (almost) all LogFormat specifiers
> and as such adds parsing capabilities for (almost) all custom logformats used
> in production scenarios.
> This parser also goes much deeper in the sense that it allows extracting
> things like the value of a cookie or the value of a query string parameter.
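
To illustrate the kind of "deeper" extraction described above, here is a
minimal sketch in Python using only the standard library. It is not the
logparser API and not the piggybank loader; the regex, the function name
query_param and the sample log line are assumptions made up for this
illustration. It handles only the standard 'combined' LogFormat and pulls one
query string parameter out of the request line.

```python
import re
from urllib.parse import urlsplit, parse_qs

# Regex for the standard Apache 'combined' LogFormat:
#   %h %l %u %t "%r" %>s %b "%{Referer}i" "%{User-Agent}i"
# Illustrative only: it does not handle escaped quotes or custom formats.
COMBINED = re.compile(
    r'(?P<host>\S+) (?P<logname>\S+) (?P<user>\S+) '
    r'\[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<uri>\S+) (?P<protocol>[^"]+)" '
    r'(?P<status>\d{3}) (?P<bytes>\S+) '
    r'"(?P<referer>[^"]*)" "(?P<agent>[^"]*)"'
)


def query_param(line, name):
    """Return the first value of query parameter `name` in the request URI.

    Returns None if the line does not match the combined format or the
    parameter is absent. (Hypothetical helper, not part of any library.)
    """
    match = COMBINED.match(line)
    if match is None:
        return None
    query = urlsplit(match.group("uri")).query
    values = parse_qs(query).get(name)
    return values[0] if values else None


# Made-up sample line, for illustration only.
SAMPLE = (
    '127.0.0.1 - - [24/Jul/2015:10:00:00 +0200] '
    '"GET /search?q=pig&page=2 HTTP/1.1" 200 512 '
    '"-" "ExampleAgent/1.0"'
)
```

Calling query_param(SAMPLE, "q") extracts the value of the "q" parameter from
the request URI. A fixed regex like this one covers a single logformat, which
is the limitation the proposed parser is meant to remove.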
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)