[jira] [Updated] (KUDU-2434) Improve kudu-log-parser.pl

Grant Henke (Jira) Tue, 02 Jun 2020 19:52:47 -0700


     [ 
https://issues.apache.org/jira/browse/KUDU-2434?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]


Grant Henke updated KUDU-2434:
------------------------------
    Target Version/s:   (was: 1.8.0)

> Improve kudu-log-parser.pl
> --------------------------
>
>                 Key: KUDU-2434
>                 URL: https://issues.apache.org/jira/browse/KUDU-2434
>             Project: Kudu
>          Issue Type: Improvement
>          Components: supportability
>    Affects Versions: 1.7.0
>            Reporter: William Berkeley
>            Assignee: William Berkeley
>            Priority: Major
>
> cc4e3957ba29bb42112dc21bfa8242e3f7afeac6 introduced the kudu-log-parser.pl 
> script, which takes a collection of possibly-gzipped Kudu logs, categorizes 
> and extracts information from some events in the logs using regexes, and then 
> sorted-merges all the logs together. It can be pretty useful for looking at 
> problems in a Kudu cluster ex post facto, especially when the exact timeframe 
> or cause is not known.
> There are a number of things that can be done to make the script better, 
> including:
> 1. Eliminating or disambiguating some false matches, e.g. the script matches 
> on the prefix "Time spent", which appears in both slow execution logging and 
> LBM (log block manager) startup messages.
> 2. Parallelizing the processing. In my experience, the script can take 30 
> minutes to munch a 12-node cluster's logs if the logs are 100-200MB in size.
> 3. Mike wrote the script to look at a cluster with consensus issues, so most 
> of the categorization is focused on those types of logs. We could generalize 
> it to more types, and also allow filtering based on types.
> 4. The script is written in Perl. While that language is dear to Mike, most 
> Kudu developers would be more comfortable using and tweaking the script if it 
> were written in a more widely-known language like Python. Of course, CPython's 
> global interpreter lock limits thread-level parallelism, so maybe something 
> like Scala? That has more unusual prerequisites, but it's Java-like and can be 
> run as a script.
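
The ideas in the list above (disambiguated categories, per-file parallelism,
and a sorted merge) could be sketched in Python roughly as follows. This is a
minimal sketch, not the existing script's logic. It assumes glog-style line
prefixes ("Lmmdd hh:mm:ss.uuuuuu threadid file:line] msg"), that all logs fall
within one calendar year (glog prefixes carry no year), and it uses
hypothetical category names and message patterns chosen only for illustration.
Processes rather than threads are used for the per-file parsing.

```python
import gzip
import heapq
import re
from concurrent.futures import ProcessPoolExecutor

# glog-style prefix: level letter, mmdd, time with microseconds, thread id, file:line]
GLOG_RE = re.compile(
    r"^([IWEF])(\d{4} \d{2}:\d{2}:\d{2}\.\d{6})\s+(\d+) ([^\]]+)\] (.*)$"
)

# Hypothetical categories; these patterns are illustrative assumptions, not
# taken from kudu-log-parser.pl. More specific patterns are listed first so
# the generic "Time spent" prefix does not swallow the startup messages.
CATEGORIES = [
    ("lbm_startup", re.compile(r"^Time spent opening block manager")),
    ("slow_execution", re.compile(r"^Time spent ")),
]


def categorize(message):
    """Return the name of the first matching category, or "other"."""
    for name, pattern in CATEGORIES:
        if pattern.search(message):
            return name
    return "other"


def parse_lines(lines, source):
    """Parse an iterable of log lines into sorted event tuples.

    Each event is (timestamp, source, category, message). Lines that do not
    match the glog prefix (e.g. stack trace continuations) are skipped.
    """
    events = []
    for line in lines:
        m = GLOG_RE.match(line.rstrip("\n"))
        if m:
            message = m.group(5)
            events.append((m.group(2), source, categorize(message), message))
    # Threads can log slightly out of order, so sort before merging.
    events.sort()
    return events


def parse_file(path):
    """Parse one log file, transparently handling gzip compression."""
    opener = gzip.open if path.endswith(".gz") else open
    with opener(path, "rt", errors="replace") as f:
        return parse_lines(f, path)


def merge_events(per_file_events):
    """K-way merge of already-sorted per-file event lists."""
    return list(heapq.merge(*per_file_events))


def merge_logs(paths, workers=4):
    """Parse files in parallel worker processes, then merge by timestamp."""
    with ProcessPoolExecutor(max_workers=workers) as pool:
        per_file = list(pool.map(parse_file, paths))
    return merge_events(per_file)
```

Calling merge_logs() on a list of log paths (from under an
`if __name__ == "__main__":` guard, since it starts worker processes) returns
a single timestamp-ordered list of events. Filtering by type then reduces to
keeping the tuples whose third field is one of the wanted category names.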



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
