[ https://issues.apache.org/jira/browse/OOZIE-3249?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16470108#comment-16470108 ]
Hadoop QA commented on OOZIE-3249:
----------------------------------
PreCommit-OOZIE-Build started
> [tools] Instrumentation log parser
> ----------------------------------
>
> Key: OOZIE-3249
> URL: https://issues.apache.org/jira/browse/OOZIE-3249
> Project: Oozie
> Issue Type: Improvement
> Components: tools
> Affects Versions: 5.0.0
> Reporter: Andras Piros
> Assignee: Andras Piros
> Priority: Major
> Attachments: OOZIE-3249.001.patch,
> oozie-instrumentation-localhost.log.2018-05-09,
> oozie-instrumentation-localhost.log.2018-05-09.out
>
>
> Oozie instrumentation logs contain a lot of information but are difficult to
> parse: each instrumentation log entry consists of one header line in plain
> text format (containing the timestamp) followed by multiple lines in JSON
> format (without a timestamp). The header line and its JSON lines belong together.
> {noformat}
> 2018-05-02 02:48:13,426 INFO oozieinstrumentation:520 - USER[-] GROUP[-]
> TOKEN[-] APP[-] JOB[-] ACTION[-]
> {
> ...
> "counters" : {
> ...
> "callablequeue.executed" : {
> "count" : 5954144
> },
> ...
> "callablequeue.queued" : {
> "count" : 10596129
> },
> ...
> },
> ...
> }
> {noformat}
> There should be a simple script in {{tools/bin}} that takes as parameters:
> * input file name ({{-i}}), e.g. {{-i /path/to/oozie-instrumentation.log}}
> * output file name ({{-o}}), e.g.
> {{-o /path/to/oozie-instrumentation.log.out}}
> * parameters to extract ({{-p}}) in the format of
> {{path/to/json/value1,path/to/json/value2}}, in this case
> {{-p counters/callablequeue/executed/count,counters/callablequeue/queued/count}}
> The output file should be in CSV format and contain:
> * a header line containing column names for
> * one line per parsed input header / JSON lines, containing:
> ** first cell is the minutes part of the timestamp
> ** consecutive cells are parsed JSON values given each parameter to extract
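
For illustration only, the behaviour requested above could be sketched in Python roughly as follows. This is not the attached patch. The function names (`lookup`, `parse_entries`, `convert`, `main`) are hypothetical, and two points are assumptions: that a path such as counters/callablequeue/executed/count addresses the dotted JSON key "callablequeue.executed", and that "the minutes part of the timestamp" means the timestamp truncated to minute precision.

```python
import argparse
import csv
import json
import re

# Header lines start with a timestamp such as "2018-05-02 02:48:13,426".
# Group 1 keeps the timestamp up to the minute (assumed meaning of
# "the minutes part of the timestamp").
HEADER_RE = re.compile(r"^(\d{4}-\d{2}-\d{2} \d{2}:\d{2}):\d{2},\d{3} ")


def lookup(node, segments):
    """Resolve path segments against nested dicts; None if not found.

    Assumption: a JSON key like "callablequeue.executed" is addressed as
    callablequeue/executed, so adjacent segments may be joined with ".".
    """
    if not segments:
        return node
    if not isinstance(node, dict):
        return None
    for end in range(1, len(segments) + 1):
        key = ".".join(segments[:end])
        if key in node:
            found = lookup(node[key], segments[end:])
            if found is not None:
                return found
    return None


def parse_entries(lines):
    """Yield (minute_timestamp, parsed_json) for each log entry."""
    stamp, body = None, []
    for line in lines:
        match = HEADER_RE.match(line)
        if match:
            # A new header closes the previous entry, if there was one.
            if stamp is not None and body:
                yield stamp, json.loads("".join(body))
            stamp, body = match.group(1), []
        elif stamp is not None:
            # Skip wrapped header text until the JSON object opens.
            if body or line.lstrip().startswith("{"):
                body.append(line)
    if stamp is not None and body:
        yield stamp, json.loads("".join(body))


def convert(in_path, out_path, params):
    """Write one CSV row per log entry, one column per requested path."""
    names = params.split(",")
    paths = [name.split("/") for name in names]
    with open(in_path) as src, open(out_path, "w", newline="") as dst:
        writer = csv.writer(dst)
        writer.writerow(["timestamp"] + names)
        for stamp, data in parse_entries(src):
            # Missing values are written as empty cells.
            writer.writerow([stamp] + [lookup(data, p) for p in paths])


def main(argv=None):
    parser = argparse.ArgumentParser(description="Instrumentation log to CSV")
    parser.add_argument("-i", required=True, help="input file name")
    parser.add_argument("-o", required=True, help="output file name")
    parser.add_argument("-p", required=True, help="comma-separated JSON paths")
    args = parser.parse_args(argv)
    convert(args.i, args.o, args.p)
```

Calling `main()` from an `if __name__ == "__main__":` guard would turn the sketch into a command-line script taking the {{-i}}, {{-o}} and {{-p}} parameters described above. The sketch assumes each entry's JSON block is complete and valid; a truncated entry would raise an error from `json.loads`.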
--
This message was sent by Atlassian JIRA
(v7.6.3#76005)