[ https://issues.apache.org/jira/browse/DRILL-1871?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Paul Pearcy updated DRILL-1871:
-------------------------------
Attachment: browserlog.json.gz
Do a simple select all from this file:
select * from dfs.`/opt/drill/data/rawgzjson/browserlog`;
This produces the following exception in the logs:
org.apache.drill.common.exceptions.DrillRuntimeException: Error parsing JSON. - Parser was at record: 1 column: 8
    at org.apache.drill.exec.store.easy.json.JSONRecordReader.handleAndRaise(JSONRecordReader.java:101) ~[drill-java-exec-0.8.0-SNAPSHOT-rebuffed.jar:0.8.0-SNAPSHOT]
    at org.apache.drill.exec.store.easy.json.JSONRecordReader.next(JSONRecordReader.java:148) ~[drill-java-exec-0.8.0-SNAPSHOT-rebuffed.jar:0.8.0-SNAPSHOT]
If the file is gunzipped first, its whole contents can be queried via select *.
> JSON reader cannot read compressed files
> -----------------------------------------
>
> Key: DRILL-1871
> URL: https://issues.apache.org/jira/browse/DRILL-1871
> Project: Apache Drill
> Issue Type: Bug
> Components: Storage - JSON
> Reporter: Parth Chandra
> Assignee: Jason Altekruse
> Fix For: 0.8.0
>
> Attachments: 1871-17-dec-14.patch, DRILL-1871-compressed-json.patch,
> browserlog.json.gz
>
>
> Gzip a JSON file and then try to query it. The query fails with the following error:
> ERROR [HY000] [MapR][Drill] (1040) Drill failed to execute the query: SELECT
> * FROM `dfs`.`part-m-0001.json.gz`
> [30024]Query execution error. Details:[
> Query stopped., Illegal character ((CTRL-CHAR, code 31)): only regular white
> space (\r, \n, \t) is allowed between tokens
> at [Source: org.apache.drill.exec.vector.complex.fn.JsonReader@6a375274;
> line: 0, column: 2] [ d83909cd-89b7-43a2-aebc-5ebba74570db on vmx0754:31010 ]
> ]
> The JSON reader is supposed to be able to handle compressed files.
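
For context on the "CTRL-CHAR, code 31" in the quoted error: 31 is 0x1f, which is the first of the two gzip magic bytes (0x1f 0x8b). That is consistent with the JSON parser being handed the raw compressed bytes instead of a decompressed stream. The following is only a minimal sketch of the general idea, not Drill's code and not the attached patch. The class GzipSniffer and its methods are hypothetical names made up for illustration; only the java.util.zip and java.io classes are real.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.PushbackInputStream;
import java.nio.charset.StandardCharsets;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

// Hypothetical helper for illustration only; not part of Drill.
public class GzipSniffer {

    /** Wraps the stream in a GZIPInputStream if it starts with the gzip magic bytes. */
    static InputStream maybeDecompress(InputStream raw) throws IOException {
        PushbackInputStream in = new PushbackInputStream(raw, 2);
        byte[] header = new byte[2];
        int n = 0;
        // Read up to two bytes; a single read() call may return fewer.
        while (n < 2) {
            int r = in.read(header, n, 2 - n);
            if (r < 0) {
                break;
            }
            n += r;
        }
        // Push the peeked bytes back so the consumer sees the full stream.
        if (n > 0) {
            in.unread(header, 0, n);
        }
        boolean gzipped = n == 2
                && (header[0] & 0xff) == 0x1f
                && (header[1] & 0xff) == 0x8b;
        return gzipped ? new GZIPInputStream(in) : in;
    }

    /** Compresses a byte array with gzip (used here only to build test input). */
    static byte[] gzip(byte[] data) throws IOException {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        try (GZIPOutputStream gz = new GZIPOutputStream(out)) {
            gz.write(data);
        }
        return out.toByteArray();
    }

    /** Reads a stream to the end and returns its bytes. */
    static byte[] readAll(InputStream in) throws IOException {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] buf = new byte[4096];
        int r;
        while ((r = in.read(buf)) != -1) {
            out.write(buf, 0, r);
        }
        return out.toByteArray();
    }

    public static void main(String[] args) throws IOException {
        byte[] json = "{\"browser\": \"firefox\"}".getBytes(StandardCharsets.UTF_8);
        byte[] compressed = gzip(json);

        // The first compressed byte is the control character from the error message.
        System.out.println("first gzip byte: " + (compressed[0] & 0xff)); // prints 31

        // Both the compressed and the plain input come back as the same JSON text.
        InputStream fromGz = maybeDecompress(new ByteArrayInputStream(compressed));
        InputStream fromPlain = maybeDecompress(new ByteArrayInputStream(json));
        System.out.println(new String(readAll(fromGz), StandardCharsets.UTF_8));
        System.out.println(new String(readAll(fromPlain), StandardCharsets.UTF_8));
    }
}
```

Sniffing the magic bytes rather than trusting the .gz extension is just one possible design; a reader could equally pick a codec from the file name. Which approach Drill takes is not stated in this message.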