[ https://issues.apache.org/jira/browse/DRILL-4955?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Charles Givre resolved DRILL-4955.
----------------------------------
    Resolution: Resolved

> Log Parser for Drill
> --------------------
>
> Key: DRILL-4955
> URL: https://issues.apache.org/jira/browse/DRILL-4955
> Project: Apache Drill
> Issue Type: New Feature
> Components: Storage - Text & CSV
> Affects Versions: 1.9.0
> Reporter: Charles Givre
> Priority: Major
> Labels: features
> Fix For: Future
>
>
> I've been experimenting with a generic log parser for Drill. The basic
> concept is that you might want Drill to ingest log files such as this MySQL
> log:
> {code}
> 070823 21:00:32 1 Connect root@localhost on test1
> 070823 21:00:48 1 Query show tables
> 070823 21:00:56 1 Query select * from category
> 070917 16:29:01 21 Query select * from location
> 070917 16:29:12 21 Query select * from location where id = 1 LIMIT 1
> {code}
> You could probably do it with string manipulation functions such as
> split and substring, but you'd end up with ugly and very complex
> queries.
> The extension I've built allows you to supply Drill with a regex describing
> the line format and a list of field names, one per capture group, as shown
> below.
> {code}
> "log": {
>   "type": "log",
>   "extensions": [
>     "log"
>   ],
>   "fieldNames": [
>     "date",
>     "time",
>     "pid",
>     "action",
>     "query"
>   ],
>   "pattern": "(\\d{6})\\s(\\d{2}:\\d{2}:\\d{2})\\s+(\\d+)\\s(\\w+)\\s+(.+)"
> }
> {code}
> You can then query log files in this format in Drill. I'd like to
> submit this for inclusion in Drill if there is interest.

--
This message was sent by Atlassian Jira (v8.3.4#803005)
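
To illustrate how the regex and field list in the issue fit together, here is a minimal Python sketch. It is not Drill's implementation; it only assumes that each capture group in the pattern is paired, in order, with an entry in `fieldNames`. The helper name `parse_line` is hypothetical, and the sample line is the first entry of the MySQL log quoted in the issue.

```python
import re

# Pattern and field names copied from the config in the issue.
# In JSON the backslashes are doubled ("\\d"); the regex itself uses "\d".
PATTERN = re.compile(r"(\d{6})\s(\d{2}:\d{2}:\d{2})\s+(\d+)\s(\w+)\s+(.+)")
FIELD_NAMES = ["date", "time", "pid", "action", "query"]


def parse_line(line):
    """Hypothetical helper: map capture groups to field names.

    Returns a dict of field name -> captured text, or None when the
    line does not match the pattern.
    """
    match = PATTERN.match(line)
    if match is None:
        return None
    return dict(zip(FIELD_NAMES, match.groups()))


sample = "070823 21:00:32 1 Connect root@localhost on test1"
row = parse_line(sample)
print(row)
```

Every captured value comes back as a string in this sketch; how the actual plugin types its columns is not stated in the issue.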