[ 
https://issues.apache.org/jira/browse/DRILL-1750?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14264200#comment-14264200
 ] 

Abhishek Girish commented on DRILL-1750:
----------------------------------------

Drill 0.7 (Git.Commit.ID: 1608e43)

The attached files were created by hand, with one record each. All of them 
share a common set of fields.

Querying a directory of JSON files with varying schemas has several serious 
problems:

- In some cases, a java.lang.IndexOutOfBoundsException is thrown, along with 
the error "ERROR Client fs/client/fileclient/cc/client.cc:1007 Thread: 30398 
Open failed for file /data/json/tmp, attempt to open a directory".

- Previously, JVM crashes were observed (as shown in the issue description).

- When there is no failure, the returned fields are incomplete. Lately I have 
also observed that they are inconsistent: which fields are returned depends on 
the number of JSON files, the number of records within them (which affects the 
limit clause applied to the query on the directory), and the schema of each 
file.
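
The incomplete-results behavior can be illustrated outside Drill. The following 
is a minimal Python sketch, not Drill code: it uses the four sample records from 
the issue description and assumes that the expected result of a directory query 
is the union of all fields, with missing values returned as null. The function 
names (common_fields, all_fields, pad_to_union) are hypothetical and chosen for 
illustration only.

```python
# Sample records mirroring 1.json .. 4.json from the issue description.
RECORDS = [
    {"artist": "Jonathan King", "track_id": "TRAAAEA128F935A30D",
     "title": "I'll Slap Your Face (Entertainment USA Theme)"},
    {"artist": "Supersuckers", "timestamp": "2011-08-01 20:30:17.991134",
     "track_id": "TRAAAQN128F9353BA0", "title": "Double Wide"},
    {"timestamp": "2011-08-01 20:30:17.991134",
     "track_id": "TRAAAQN128F9353BA0", "title": "Double Wide"},
    {"track_id": "TRAAAQN128F9353BA0", "title": "Double Wide"},
]


def common_fields(records):
    """Fields present in every record (what the directory query showed)."""
    return set.intersection(*(set(r) for r in records))


def all_fields(records):
    """Fields present in at least one record (assumed expected columns)."""
    return set.union(*(set(r) for r in records))


def pad_to_union(records):
    """Pad each record to the union schema, using None for missing fields."""
    columns = sorted(all_fields(records))
    return [{c: r.get(c) for c in columns} for r in records]


if __name__ == "__main__":
    print("common:", sorted(common_fields(RECORDS)))
    print("union: ", sorted(all_fields(RECORDS)))
    for row in pad_to_union(RECORDS):
        print(row)
```

With these records, common_fields yields only track_id and title, which matches 
the columns shown by the directory query in the description, while all_fields 
yields artist, timestamp, track_id and title.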

 



> Querying directories with JSON files returns incomplete results
> ---------------------------------------------------------------
>
>                 Key: DRILL-1750
>                 URL: https://issues.apache.org/jira/browse/DRILL-1750
>             Project: Apache Drill
>          Issue Type: Bug
>          Components: Storage - JSON
>            Reporter: Abhishek Girish
>            Assignee: Abhishek Girish
>            Priority: Critical
>             Fix For: 0.8.0
>
>         Attachments: 1.json, 2.json, 3.json, 4.json
>
>
> I observed that querying (select *) a directory of JSON files displays only 
> the fields common to all of the files. Querying each JSON file individually 
> displays all of its fields. In some scenarios, querying the directory 
> crashes sqlline.
> The example below illustrates the issue:
> > select * from dfs.`/data/json/tmp/1.json`;
> +------------+------------+------------+
> |   artist   |  track_id  |   title    |
> +------------+------------+------------+
> | Jonathan King | TRAAAEA128F935A30D | I'll Slap Your Face (Entertainment USA Theme) |
> +------------+------------+------------+
> 1 row selected (1.305 seconds)
> > select * from dfs.`/data/json/tmp/2.json`;
> +------------+------------+------------+------------+
> |   artist   | timestamp  |  track_id  |   title    |
> +------------+------------+------------+------------+
> | Supersuckers | 2011-08-01 20:30:17.991134 | TRAAAQN128F9353BA0 | Double Wide |
> +------------+------------+------------+------------+
> 1 row selected (0.105 seconds)
> > select * from dfs.`/data/json/tmp/3.json`;
> +------------+------------+------------+
> | timestamp  |  track_id  |   title    |
> +------------+------------+------------+
> | 2011-08-01 20:30:17.991134 | TRAAAQN128F9353BA0 | Double Wide |
> +------------+------------+------------+
> 1 row selected (0.083 seconds)
> > select * from dfs.`/data/json/tmp/4.json`;
> +------------+------------+
> |  track_id  |   title    |
> +------------+------------+
> | TRAAAQN128F9353BA0 | Double Wide |
> +------------+------------+
> 1 row selected (0.076 seconds)
> > select * from dfs.`/data/json/tmp`;
> +------------+------------+
> |  track_id  |   title    |
> +------------+------------+
> | TRAAAQN128F9353BA0 | Double Wide |
> | TRAAAQN128F9353BA0 | Double Wide |
> | TRAAAEA128F935A30D | I'll Slap Your Face (Entertainment USA Theme) |
> | TRAAAQN128F9353BA0 | Double Wide |
> +------------+------------+
> 4 rows selected (0.121 seconds)
> A JVM crash occurs at times:
> > select * from dfs.`/data/json/tmp`;
> +------------+------------+------------+
> | timestamp  |  track_id  |   title    |
> +------------+------------+------------+
> | 2011-08-01 20:30:17.991134 | TRAAAQN128F9353BA0 | Double Wide |
> #
> # A fatal error has been detected by the Java Runtime Environment:
> #
> #  SIGSEGV (0xb) at pc=0x00007f3cb99be053, pid=13943, tid=139898808436480
> #
> # JRE version: OpenJDK Runtime Environment (7.0_65-b17) (build 1.7.0_65-mockbuild_2014_07_16_06_06-b00)
> # Java VM: OpenJDK 64-Bit Server VM (24.65-b04 mixed mode linux-amd64 compressed oops)
> # Problematic frame:
> # V  [libjvm.so+0x932053]
> #
> # Failed to write core dump. Core dumps have been disabled. To enable core dumping, try "ulimit -c unlimited" before starting Java again
> #
> # An error report file with more information is saved as:
> # /tmp/jvm-13943/hs_error.log
> #
> # If you would like to submit a bug report, please include
> # instructions on how to reproduce the bug and visit:
> #   http://icedtea.classpath.org/bugzilla
> #
> Aborted



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
