Thanks Charles, I got it working now.

Thanks again for your great help.
Idoor
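
For the archives, here is a quick Python check that a regex along the lines Charles suggested below actually matches the sample log line from this thread (backslashes un-doubled, since the doubling is only needed inside the JSON config; the group-to-field mapping is my reading of the thread, not something verified against Drill itself):

```python
import re

# The five capturing groups map, in order, to the five schema fields
# discussed below: eventDate, eventTime, PID, action, query.
pattern = re.compile(
    r"(\d{2} \w{3} \d{4}) (\d{2}:\d{2}:\d{2})\s(\[.+?\])\s(.+?)\s(\[.+?\])"
)

line = "01 Oct 2018 09:30:32 [ID# ] - Query Request [ Datasource : tydy ]"
m = pattern.match(line)
print(m.groups())
# -> ('01 Oct 2018', '09:30:32', '[ID# ]', '- Query Request', '[ Datasource : tydy ]')
```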

On Wed, Nov 14, 2018 at 11:49 AM Charles Givre <cgi...@gmail.com> wrote:

> Hi Idoor,
> The regex in the example is the pattern that matches MySQL logs. What you
> have to do is write a regex that matches your data and extracts the fields.
> Just eyeballing it, you might have something like:
> (\d{2} \w{3} \d{4}) (\d{2}:\d{2}:\d{2})\s(\[.+?\])\s(.+?)\s(\[.+?\])
>
> (Note: you’ll have to double every backslash when you put this in the JSON
> config.)
>
> This regex has 5 capturing groups, so you’ll need to define 5 fields in
> the schema section of the format plugin.  I would test this out on
> regexpal.com or regex101.com and see if it works.
> —C
>
>
> > On Nov 14, 2018, at 11:40, idoor do <idoorla...@gmail.com> wrote:
> >
> > Hi Charles,
> >
> > Thanks for your help, I got the MySQL log file working now, but I have
> > issues with a different log file format, like this:
> >
> > 01 Oct 2018 09:30:32 [ID# ] - Query Request [ Datasource : tydy ]
> >
> > So the eventDate is      01 Oct 2018
> > the eventTime is         09:30:32
> > the PID as string is     [ID# ]
> > the action as string is  - Query Request
> > the query as string is   [ Datasource : tydy ]
> >
> > So how does the log plugin know the boundaries between all the neighboring
> > fields? Right now I get
> > *Error: PARSE ERROR: Too many errors.  Max error threshold exceeded.*
> > and in sqlline.log it says: Unmatched line: 01 Oct 2018 09:30:33 [ID# ] -
> > Query Request  [ Datasource : tydy ]
> >
> > The config I am using is as follows:
> >
> > "log": {
> >      "type": "logRegex",
> >      "regex":
> > "(\\d{6})\\s(\\d{2}:\\d{2}:\\d{2})\\s+(\\d+)\\s(\\w+)\\s+(.+)",
> >      "extension": "log",
> >      "maxErrors": 10,
> >      "schema": [
> >        {
> >          "fieldName": "eventDate",
> >          "fieldType": "DATE",
> >          "format": "dd MMM yyyy"
> >        },
> >        {
> >          "fieldName": "eventTime",
> >          "fieldType": "TIME",
> >          "format": "HH:mm:ss"
> >        },
> >        {
> >          "fieldName": "PID"
> >        },
> >        {
> >          "fieldName": "action"
> >        },
> >        {
> >          "fieldName": "query"
> >        }
> >      ]
> >    }
> >
> > Thanks very much for your help.
> > Idoor
> >
> > On Wed, Nov 14, 2018 at 11:01 AM Charles Givre <cgi...@gmail.com> wrote:
> >
> >> Hi idoor,
> >> For some reason the documentation for this is an old and incorrect
> >> version.  Here is a link to the correct documentation:
> >>
> https://github.com/cgivre/drill/blob/24556d857cbbe7aa2baa1fc6cbd85fb614b5d975/exec/java-exec/src/main/java/org/apache/drill/exec/store/log/README.md
> >>
> >> It’s actually a lot easier…
> >> — C
> >>
> >>> On Nov 14, 2018, at 10:53, idoor do <idoorla...@gmail.com> wrote:
> >>>
> >>> Could somebody help me with this issue ? I have been stuck on this
> issue
> >>> for a couple of days.
> >>>
> >>> Thanks
> >>>
> >>> I installed drill-logfile-plugin-1.0.0 JAR file to
> >>> <drill_install>/jars/3rdParty/ directory, and configured dfs as the
> >>> following, but I got error: "Please retry: error (invalid JSON
> mapping)",
> >>> in the sqlline.log file, it shows an error: Unable to find constructor
> >>> for storage config named 'log' of type
> >>> 'org.apache.drill.exec.store.log.LogFormatPlugin$LogFormatConfig', but I
> >>> double-checked that the drill-logfile-plugin-1.0.0.jar file is in the
> >>> jars/3rdParty folder:
> >>>
> >>> My config for dfs with log plugin support is:
> >>> {
> >>>   "type": "file",
> >>>   "connection": "file:///",
> >>>   "config": null,
> >>>   "workspaces": {
> >>>     "root": {
> >>>       "location": "/",
> >>>       "writable": false,
> >>>       "defaultInputFormat": null,
> >>>       "allowAccessOutsideWorkspace": false
> >>>     },
> >>>     "test": {
> >>>       "location": "/Users/tsd",
> >>>       "writable": false,
> >>>       "defaultInputFormat": null,
> >>>       "allowAccessOutsideWorkspace": false
> >>>     },
> >>>     "tmp": {
> >>>       "location": "/tmp",
> >>>       "writable": true,
> >>>       "defaultInputFormat": null,
> >>>       "allowAccessOutsideWorkspace": false
> >>>     }
> >>>   },
> >>>   "formats": {
> >>>     "log": {
> >>>       "type": "log",
> >>>       "extensions": [ "log" ],
> >>>       "fieldNames": [ "date", "time", "pid", "action", "query" ],
> >>>       "dataTypes": [ "DATE", "TIME", "INT", "VARCHAR", "VARCHAR" ],
> >>>       "dateFormat": "yyMMdd",
> >>>       "timeFormat": "HH:mm:ss",
> >>>       "pattern": "(\\d{6})\\s(\\d{2}:\\d{2}:\\d{2})\\s+(\\d+)\\s(\\w+)\\s+(.+)",
> >>>       "errorOnMismatch": false
> >>>     }
> >>>   },
> >>>   "enabled": true
> >>> }
> >>>
> >>> If I configure the log section as below to remove some fields, the error
> >>> disappears, but some fields are missing, and the query
> >>> select * from `mysql.log` limit 10; returns the error: ERROR
> >>> o.a.calcite.runtime.CalciteException -
> >>> org.apache.calcite.sql.validate.SqlValidatorException: Object 'mysql.log'
> >>> not found
> >>>
> >>>
> >>> and when I run show files; it shows that the mysql.log file is in the
> >>> /Users/tsd directory:
> >>>
> >>>
> >>> "log": {
> >>>   "type": "log",
> >>>   "extensions": [ "log" ],
> >>>   "fieldNames": [ "date", "time", "pid", "action", "query" ],
> >>>   "pattern": "(\\d{6})\\s(\\d{2}:\\d{2}:\\d{2})\\s+(\\d+)\\s(\\w+)\\s+(.+)"
> >>> }
> >>
> >>
>
>
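
For anyone who finds this thread later: an untested sketch of what the "log" format section might look like with the new-format regex swapped in, reusing the option names and field names already shown in the thread (double-check them against the README Charles linked; the regex backslashes are doubled for JSON):

```json
"log": {
  "type": "logRegex",
  "regex": "(\\d{2} \\w{3} \\d{4}) (\\d{2}:\\d{2}:\\d{2})\\s(\\[.+?\\])\\s(.+?)\\s(\\[.+?\\])",
  "extension": "log",
  "maxErrors": 10,
  "schema": [
    { "fieldName": "eventDate", "fieldType": "DATE", "format": "dd MMM yyyy" },
    { "fieldName": "eventTime", "fieldType": "TIME", "format": "HH:mm:ss" },
    { "fieldName": "PID" },
    { "fieldName": "action" },
    { "fieldName": "query" }
  ]
}
```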
