[ 
https://issues.apache.org/jira/browse/DRILL-1154?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14065784#comment-14065784
 ] 

Aditya Kishore commented on DRILL-1154:
---------------------------------------

Glad to hear that you were able to run the select query against the log files.

Please note that the default implementation of the text reader uses delimited 
(CSV/TSV/xSV) parsing, which may not be very meaningful for free-form logs. 
It exposes each line as a single column, named columns, of type REPEATED 
VARCHAR, and individual values (cells) in that column can be accessed by 
applying a numeric index to the column name.

For example, you can select the first two values in this column using the 
following query:

{{SELECT columns\[0\], columns\[1\] FROM 
dfs.`/drill/apache-drill-1.0.0-m2-incubating-SNAPSHOT/log/sqlline.log` limit 
1;}}
or
{{SELECT cast(columns\[0\] as date) as logdate, cast(columns\[1\] as time) as 
logtime FROM 
dfs.`/drill/apache-drill-1.0.0-m2-incubating-SNAPSHOT/log/sqlline.log` limit 
1;}}
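
To make the indexing in the queries above concrete, here is a minimal Python 
sketch of what delimited parsing does to a single line. It is only an 
illustration, not Drill's reader: the function name split_line, the sample 
line, and the choice of a space as delimiter are all assumptions.

```python
def split_line(line, delimiter=","):
    """Split one text line into a list of string cells.

    Mimics the idea of a single repeated-string column per line.
    Quoting and escaping are deliberately not handled.
    """
    return line.rstrip("\n").split(delimiter)


# Hypothetical log line; the real layout of sqlline.log may differ.
sample = "2014-07-16 21:08:08,123 [main] INFO starting"

# Assumed delimiter: a single space.
columns = split_line(sample, delimiter=" ")

# columns[0] and columns[1] play the role of columns[0] and columns[1]
# in the SQL queries: plain strings that still need a cast to be typed.
print(columns[0], columns[1])
```

Note that the message text is also split on every delimiter, which is why 
this kind of parsing is a poor fit for free-form log lines.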

However, you should be able to build a specialized file format plugin by 
extending 
[EasyFormatPlugin|https://github.com/apache/incubator-drill/blob/master/exec/java-exec/src/main/java/org/apache/drill/exec/store/dfs/easy/EasyFormatPlugin.java]
 and some of its auxiliary classes. Such a plugin could parse each log line 
more intelligently by correctly demarcating individual log fields.
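
As a rough idea of what "demarcating individual log fields" could mean, the 
following Python sketch splits a line into named fields with a regular 
expression. It does not use or represent the EasyFormatPlugin API; the log 
layout, the pattern, and the field names (logdate, logtime, millis, thread, 
level, message) are assumptions chosen for illustration.

```python
import re

# Assumed layout: "<date> <time>,<millis> [<thread>] <LEVEL> <message>"
LOG_PATTERN = re.compile(
    r"(?P<logdate>\d{4}-\d{2}-\d{2}) "
    r"(?P<logtime>\d{2}:\d{2}:\d{2}),(?P<millis>\d{3}) "
    r"\[(?P<thread>[^\]]+)\] "
    r"(?P<level>[A-Z]+)\s+"
    r"(?P<message>.*)"
)


def parse_log_line(line):
    """Return a dict of named fields, or None if the line does not match."""
    match = LOG_PATTERN.match(line)
    return match.groupdict() if match else None


# Hypothetical log line used only to exercise the pattern.
record = parse_log_line("2014-07-16 21:08:08,123 [main] INFO  Drillbit started")
print(record)
```

Unlike delimiter splitting, the free-form message stays in one field. Lines 
that do not match (for example stack-trace continuation lines) return None 
and would need separate handling.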

> Query log file failed
> ---------------------
>
>                 Key: DRILL-1154
>                 URL: https://issues.apache.org/jira/browse/DRILL-1154
>             Project: Apache Drill
>          Issue Type: Bug
>          Components: Storage - Text & CSV
>    Affects Versions: 1.0.0
>         Environment: ubuntu 
>            Reporter: Jiaojiao Song
>            Priority: Critical
>
> Failed when I query a text file (.txt/ .log). But when I change the suffix of 
> this file to '.tsv' It works.  I hear a talk and Drill claim can support log 
> files. So I tried this and find It failed on both .txt and .log.  Is the 
> version I'm using too old?  (apache-drill-1.0.0-m2-incubating-SNAPSHOT)
> Error messages:
> 0: jdbc:drill:zk=local> SELECT * FROM 
> dfs.logs.`/drill/apache-drill-1.0.0-m2-incubating-SNAPSHOT/log/sqlline.log`;
> Jul 16, 2014 9:08:08 PM org.eigenbase.sql.validate.SqlValidatorException 
> <init>
> SEVERE: org.eigenbase.sql.validate.SqlValidatorException: Table 
> 'dfs.logs./drill/apache-drill-1.0.0-m2-incubating-SNAPSHOT/log/sqlline.log' 
> not found
> Jul 16, 2014 9:08:08 PM org.eigenbase.util.EigenbaseException <init>
> SEVERE: org.eigenbase.util.EigenbaseContextException: From line 1, column 15 
> to line 1, column 17
> Query failed: org.apache.drill.exec.rpc.RpcException: Remote failure while 
> running query.[error_id: "fe18b830-2ed7-447d-b4be-5e340e4aa488"
> endpoint {
>   address: "building-1s-dhcp69.eng.haha.com"
>   user_port: 31010
>   control_port: 31011
>   data_port: 31012
> }
> error_type: 0
> message: "Failure while parsing sql. < ValidationException:[ 
> org.eigenbase.util.EigenbaseContextException: From line 1, column 15 to line 
> 1, column 17 ] < EigenbaseContextException:[ From line 1, column 15 to line 
> 1, column 17 ] < SqlValidatorException:[ Table 
> 'dfs.logs./drill/apache-drill-1.0.0-m2-incubating-SNAPSHOT/log/sqlline.log' 
> not found ]"
> ]
> Error: exception while executing query (state=,code=0)



--
This message was sent by Atlassian JIRA
(v6.2#6252)
