[
https://issues.apache.org/jira/browse/DRILL-1154?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14065784#comment-14065784
]
Aditya Kishore commented on DRILL-1154:
---------------------------------------
Glad to hear that you were able to run the select query against the log files.
Please note that the default implementation of the txt reader uses delimited
(CSV/TSV/xSV) parsing, which may not be very meaningful for free-form logs. It
reads each line into a single column of type REPEATED VARCHAR, named
{{columns}}, and individual values (cells) in that column can be accessed by
appending a numeric index to the column name.
For example, you can select the first two values in this column using the
following query:
{{SELECT columns\[0\], columns\[1\] FROM
dfs.`/drill/apache-drill-1.0.0-m2-incubating-SNAPSHOT/log/sqlline.log` limit
1;}}
or
{{SELECT cast(columns\[0\] as date) as logdate, cast(columns\[1\] as time) as
logtime FROM
dfs.`/drill/apache-drill-1.0.0-m2-incubating-SNAPSHOT/log/sqlline.log` limit
1;}}
However, you should be able to build a specialized file format plugin by
extending
[EasyFormatPlugin|https://github.com/apache/incubator-drill/blob/master/exec/java-exec/src/main/java/org/apache/drill/exec/store/dfs/easy/EasyFormatPlugin.java]
and some of its auxiliary classes. Such a plugin could parse each log line more
intelligently by correctly demarcating the individual log fields.
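To illustrate the kind of per-line field demarcation such a plugin's reader
might perform, here is a minimal sketch in plain Java. It does not use any
Drill API: the class name LogLineParser, the field names, and the assumed line
layout ("<date> <time> [<thread>] <LEVEL> <logger> - <message>") are all
hypothetical and would need to be adapted to your actual log format.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not part of Drill: splits one log line into named fields.
final class LogLineParser {

    // Assumed layout: "<date> <time> [<thread>] <LEVEL> <logger> - <message>".
    private static final Pattern LINE = Pattern.compile(
        "^(?<logdate>\\d{4}-\\d{2}-\\d{2}) "
        + "(?<logtime>\\d{2}:\\d{2}:\\d{2}(?:,\\d{3})?) "
        + "\\[(?<thread>[^\\]]+)\\] "
        + "(?<level>[A-Z]+)\\s+"
        + "(?<logger>\\S+) - "
        + "(?<message>.*)$");

    // Field names in the order they should appear in the result.
    private static final String[] FIELDS =
        {"logdate", "logtime", "thread", "level", "logger", "message"};

    // Returns the parsed fields, or an empty Optional if the line does not
    // match the assumed layout (for example, a stack trace continuation line).
    static Optional<Map<String, String>> parse(String line) {
        Matcher m = LINE.matcher(line);
        if (!m.matches()) {
            return Optional.empty();
        }
        Map<String, String> fields = new LinkedHashMap<>();
        for (String name : FIELDS) {
            fields.put(name, m.group(name));
        }
        return Optional.of(fields);
    }

    public static void main(String[] args) {
        // Made-up sample line in the assumed layout.
        String sample =
            "2014-07-16 21:08:08,123 [main] INFO  o.a.d.exec.server.Drillbit - Startup completed";
        parse(sample).ifPresent(
            f -> f.forEach((k, v) -> System.out.println(k + " = " + v)));
    }
}
```

A real format plugin would write these values into typed columns instead of a
map, and would have to decide what to do with lines that do not match, such as
appending them to the previous record's message.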
> Query log file failed
> ---------------------
>
> Key: DRILL-1154
> URL: https://issues.apache.org/jira/browse/DRILL-1154
> Project: Apache Drill
> Issue Type: Bug
> Components: Storage - Text & CSV
> Affects Versions: 1.0.0
> Environment: ubuntu
> Reporter: Jiaojiao Song
> Priority: Critical
>
> The query fails when I run it against a text file (.txt/.log), but when I
> change the suffix of the file to '.tsv' it works. I heard a talk claiming
> that Drill can support log files, so I tried this and found that it fails on
> both .txt and .log. Is the version I'm using too old?
> (apache-drill-1.0.0-m2-incubating-SNAPSHOT)
> Error messages:
> 0: jdbc:drill:zk=local> SELECT * FROM
> dfs.logs.`/drill/apache-drill-1.0.0-m2-incubating-SNAPSHOT/log/sqlline.log`;
> Jul 16, 2014 9:08:08 PM org.eigenbase.sql.validate.SqlValidatorException
> <init>
> SEVERE: org.eigenbase.sql.validate.SqlValidatorException: Table
> 'dfs.logs./drill/apache-drill-1.0.0-m2-incubating-SNAPSHOT/log/sqlline.log'
> not found
> Jul 16, 2014 9:08:08 PM org.eigenbase.util.EigenbaseException <init>
> SEVERE: org.eigenbase.util.EigenbaseContextException: From line 1, column 15
> to line 1, column 17
> Query failed: org.apache.drill.exec.rpc.RpcException: Remote failure while
> running query.[error_id: "fe18b830-2ed7-447d-b4be-5e340e4aa488"
> endpoint {
> address: "building-1s-dhcp69.eng.haha.com"
> user_port: 31010
> control_port: 31011
> data_port: 31012
> }
> error_type: 0
> message: "Failure while parsing sql. < ValidationException:[
> org.eigenbase.util.EigenbaseContextException: From line 1, column 15 to line
> 1, column 17 ] < EigenbaseContextException:[ From line 1, column 15 to line
> 1, column 17 ] < SqlValidatorException:[ Table
> 'dfs.logs./drill/apache-drill-1.0.0-m2-incubating-SNAPSHOT/log/sqlline.log'
> not found ]"
> ]
> Error: exception while executing query (state=,code=0)