I may be jumping the gun, because I haven't yet tried out your PR.
The reason I mentioned LogStash is that the primary challenge (IMO) of
creating a log file reader is that formats vary wildly and there is no
standard. So what is needed is a good mechanism for consuming the logs with
the right regex support. LogStash comes with a Grok parser that does (IMHO) a
fantastic job of parsing and tokenizing logs.
The logback XML that I have for Drill defines this format:
<appender name="FILE" class="ch.qos.logback.core.rolling.RollingFileAppender">
  <encoder>
    <pattern>%date{ISO8601} %property{HOSTNAME} [%thread] %-5level %logger{36} - %msg%n</pattern>
  </encoder>
</appender>
The ones that ship with Drill by default are
<appender name="STDOUT" class="ch.qos.logback.core.ConsoleAppender">
  <encoder>
    <pattern>%d{HH:mm:ss.SSS} [%thread] %-5level %logger{36} - %msg%n</pattern>
  </encoder>
</appender>
And
<appender name="FILE" class="ch.qos.logback.core.rolling.RollingFileAppender">
  <encoder>
    <pattern>%date{ISO8601} [%thread] %-5level %logger{36} - %msg%n</pattern>
  </encoder>
</appender>
Notice how all three patterns are different.
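To make the difference concrete, here is a rough Python sketch (not taken from the PR) of one regex per pattern, in the spirit of Grok's named captures. The format names, the sample field names, and the assumption that %date{ISO8601} renders as "yyyy-MM-dd HH:mm:ss,SSS" are mine, so treat it as an illustration rather than a reference.

```python
import re

# Assumption: logback's %date{ISO8601} renders as "2018-02-07 13:08:00,123".
ISO = r"\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2},\d{3}"
# %d{HH:mm:ss.SSS} renders as "13:08:00.123".
TIME = r"\d{2}:\d{2}:\d{2}\.\d{3}"
# Shared tail: [%thread] %-5level %logger{36} - %msg
# %-5level pads short levels (INFO, WARN) with trailing spaces, hence " +".
TAIL = r"\[(?P<thread>[^\]]*)\] (?P<level>[A-Z]+) +(?P<logger>\S+) - (?P<msg>.*)"

# Hypothetical names for the three layouts shown above.
FORMATS = [
    ("file_with_host", re.compile(rf"(?P<ts>{ISO}) (?P<host>\S+) {TAIL}")),
    ("file_default", re.compile(rf"(?P<ts>{ISO}) {TAIL}")),
    ("console_default", re.compile(rf"(?P<ts>{TIME}) {TAIL}")),
]


def parse_line(line):
    """Return (format_name, fields) for the first matching layout, else None."""
    for name, regex in FORMATS:
        match = regex.fullmatch(line.rstrip("\n"))
        if match:
            return name, match.groupdict()
    return None


if __name__ == "__main__":
    samples = [
        "2018-02-07 13:08:00,123 node1 [main] INFO  o.a.d.Foo - started",
        "2018-02-07 13:08:00,123 [main] ERROR o.a.d.Foo - boom",
        "13:08:00.123 [main] WARN  o.a.d.Foo - careful",
        "\tat java.lang.Thread.run(Thread.java:748)",
    ]
    for sample in samples:
        print(parse_line(sample))
```

A line that matches none of the layouts, such as a stack-trace continuation line, comes back as None; a real reader would have to decide whether to attach it to the previous entry.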
A quick glance at the PR suggests that it can process only a fairly limited
range of log files (though I could be wrong).
A good way to test the log reader would be to simply take the logs from the
web UI's http://<hostname>:8047/logs link and run them through the reader.
I did stitch something together using ELK (ElasticSearch+LogStash+Kibana) to
process Drill logs, but that was back in 2015. If we can get something like
that into a storage plugin for Drill, it would probably go much further. I
could share what I did back then, and we could figure out how to reuse that
approach and its libraries here.
-----Original Message-----
From: Charles Givre [mailto:[email protected]]
Sent: Wednesday, February 07, 2018 1:08 PM
To: [email protected]
Subject: Re: Test cases for Drill-6104: Added Logfile Reader
Hi Kunal,
I just don’t know how to craft one with all the Drill internals. Is there an
example that you can point me to?
> On Feb 7, 2018, at 18:38, Kunal Khatua <[email protected]> wrote:
>
> How about using the Drill logs as a use case?
>
> You have drillbit.out and drillbit_hostname.log to consume. It would be
> interesting to see how multiline log entries are handled.
>
> Logstash does an excellent job IMO, but that's more for parsing.
>
> -----Original Message-----
> From: Charles Givre [mailto:[email protected]]
> Sent: Wednesday, February 07, 2018 2:32 AM
> To: [email protected]
> Subject: Test cases for Drill-6104: Added Logfile Reader
>
> Hello all,
> I submitted this PR for a logfile parser for Drill
> (https://github.com/apache/drill/pull/1114). I need to write unit tests for
> it, but I really have no idea how to do so. Could someone point me to an
> example so that the PR will pass the CI tests?
> TIA,
> - C
>
>
>