[ https://issues.apache.org/jira/browse/DRILL-5105?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Paul Rogers closed DRILL-5105.
------------------------------

> Query time increases exponentially with increasing nested levels
> ----------------------------------------------------------------
>
>                 Key: DRILL-5105
>                 URL: https://issues.apache.org/jira/browse/DRILL-5105
>             Project: Apache Drill
>          Issue Type: Bug
>          Components: Storage - JSON
>    Affects Versions: 1.9.0
>         Environment: 3 Node Cluster with default memory and configurations.
>            Reporter: Abhishek Girish
>            Assignee: Chunhui Shi
>              Labels: ready-to-commit
>
> The time taken to query a JSON dataset depends on the number of nested levels
> within the dataset. Increasing the complexity of the dataset further
> increases the execution time.
> The table below lists cached query execution times for a simple select * query
> over two simple forms of JSON datasets:
> || No. Levels || Time (s) Dataset 1 || Time (s) Dataset 2 ||
> | 1 | 0.22 | 0.27 |
> | 2 | 0.23 | 0.25 |
> | 4 | 0.24 | 0.22 |
> | 8 | 0.22 | 0.23 |
> | 16 | 0.34 | 0.48 |
> | 24 | 25.76 | 72.51 |
> | 26 | 103.48 | 289.6 |
> | 28 | 336.12 | 1151.94 |
> | 30 | 1342.22 | 4586.79 |
> | 32 | 5360.2 | Expected: ~20k |
> The above table lists query times for 20 different JSON files, 10 belonging
> to dataset 1 and 10 belonging to dataset 2. Each file has 1 record, but the
> number of nested levels within it varies as given in the "No. Levels" column.
> It appears that the query time almost doubles with each additional nested
> level. Because the table steps by two levels from 24 onward, this shows up as
> an increase of almost 4x between consecutive rows.
> The two structures below are representative of the datasets, showing simple
> JSON structures with nested levels.
> Structure of Dataset 1:
> {code}
> {
>   "level1": {
>     "field1": "a",
>     "level2": {
>       "field1": "b",
>       ...
>     }
>   }
> }
> {code}
> Structure of Dataset 2:
> {code}
> {
>   "level1": {
>     "field1": "a",
>     "field2": {
>       "nfield1": true,
>       "nfield2": 1.1
>     },
>     "level2": {
>       "field1": "b",
>       "field2": {
>         "nfield1": false,
>         "nfield2": 2.2
>       },
>       ...
>     }
>   }
> }
> {code}

--
This message was sent by Atlassian JIRA
(v6.3.15#6346)
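
The report shows only the first two levels of each dataset and does not include the files themselves. The following is a minimal sketch of how single-record test files with a chosen number of nested levels could be generated, and how the growth factor between consecutive table rows could be checked. The function names `make_record` and `step_ratios`, the placeholder field values (`value1`, `value2`, ...), and the pattern used for `nfield1` and `nfield2` below level 2 are assumptions made for illustration; they are not taken from the original files.

```python
import json


def make_record(levels, with_nested_fields=False):
    """Return one record with `levels` nested "levelN" objects.

    with_nested_fields=False mimics the shape of Dataset 1,
    with_nested_fields=True mimics the shape of Dataset 2.
    All field values are placeholders (assumption, not from the report).
    """
    if levels < 1:
        raise ValueError("levels must be >= 1")
    inner = None
    # Build from the deepest level outward so no recursion is needed.
    for depth in range(levels, 0, -1):
        node = {"field1": f"value{depth}"}
        if with_nested_fields:
            # Assumed pattern: alternate the boolean, scale the number.
            node["field2"] = {
                "nfield1": depth % 2 == 1,
                "nfield2": round(depth * 1.1, 1),
            }
        if inner is not None:
            node[f"level{depth + 1}"] = inner
        inner = node
    return {"level1": inner}


def step_ratios(times):
    """Return the ratio of each value to the one before it."""
    return [later / earlier for earlier, later in zip(times, times[1:])]


# Times in seconds copied from the table for 24, 26, 28, 30 and 32 levels.
DATASET1_TIMES = [25.76, 103.48, 336.12, 1342.22, 5360.2]
# Dataset 2 has no measured value at 32 levels, so only 24 to 30 are listed.
DATASET2_TIMES = [72.51, 289.6, 1151.94, 4586.79]
```

Under these assumptions, `json.dumps(make_record(24))` gives the body of a 24-level file shaped like Dataset 1, and `json.dumps(make_record(24, with_nested_fields=True))` gives one shaped like Dataset 2. Calling `step_ratios(DATASET1_TIMES)` or `step_ratios(DATASET2_TIMES)` returns the factor between consecutive rows, which can be compared against the roughly 4x per two levels described in the report.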