[ https://issues.apache.org/jira/browse/DRILL-5822?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Prasad Nagaraj Subramanya updated DRILL-5822:
---------------------------------------------
Description:
Repro steps
1) Place multiple JSON files that share the same schema in one directory
2) Add one or more empty files to the same directory
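The directory described in the steps above could be prepared with a short script along these lines. This is only a sketch: the file names (`part_1.json`, `empty.json`), the one-record-per-file layout, and the JSON value types are assumptions, since the report does not show the original input files. The field names and values are copied from the result tables in this report.

```python
import json
import tempfile
from pathlib import Path

# Sample records. Field names and values come from the result tables in this
# report; the JSON types (number vs. string) are assumed.
ROWS = [
    {"row_key": 1, "p_partkey": 1,
     "p_name": "goldenrod lace spring peru powder",
     "p_mfgr": "Manufacturer#1", "p_brand": "Brand#13",
     "p_type": "PROMO BURNISHED COPPER", "p_size": 7,
     "p_container": "JUMBO PKG", "p_retailprice": 901.0,
     "p_comment": "ly. slyly ironi"},
    {"row_key": 2, "p_partkey": 2,
     "p_name": "blush rosy metallic lemon navajo",
     "p_mfgr": "Manufacturer#1", "p_brand": "Brand#13",
     "p_type": "LARGE BRUSHED BRASS", "p_size": 1,
     "p_container": "LG CASE", "p_retailprice": 902.0,
     "p_comment": "lar accounts amo"},
]


def build_repro_dir(target):
    """Write one JSON file per record plus one empty file; return all paths."""
    target.mkdir(parents=True, exist_ok=True)
    paths = []
    for i, row in enumerate(ROWS, start=1):
        path = target / f"part_{i}.json"  # hypothetical file name
        path.write_text(json.dumps(row) + "\n")
        paths.append(path)
    empty = target / "empty.json"  # hypothetical name for the empty file
    empty.write_text("")
    paths.append(empty)
    return paths


if __name__ == "__main__":
    out = Path(tempfile.mkdtemp()) / "json_dir"
    for p in build_repro_dir(out):
        print(p, p.stat().st_size, "bytes")
```

The resulting directory would then need to be copied to a location visible to the `dfs` storage plugin (for example `/json_dir`) before running the queries in the scenarios.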
Scenarios
1) Only one minor fragment
{code}select * from dfs.`/json_dir`;{code}
Result:
{code}
+----------+------------+------------------------------------------+-----------------+-----------+----------------------------+---------+--------------+----------------+------------------------+
| row_key  | p_partkey  | p_name                                   | p_mfgr          | p_brand   | p_type                     | p_size  | p_container  | p_retailprice  | p_comment              |
+----------+------------+------------------------------------------+-----------------+-----------+----------------------------+---------+--------------+----------------+------------------------+
| 1        | 1          | goldenrod lace spring peru powder        | Manufacturer#1  | Brand#13  | PROMO BURNISHED COPPER     | 7       | JUMBO PKG    | 901.0          | ly. slyly ironi        |
| 2        | 2          | blush rosy metallic lemon navajo         | Manufacturer#1  | Brand#13  | LARGE BRUSHED BRASS        | 1       | LG CASE      | 902.0          | lar accounts amo       |
{code}
2) One minor fragment per file
{code}alter session set `planner.slice_target`=1;
select * from dfs.`/json_dir`;{code}
Result:
{code}
+-----------+------------------------+--------------+-----------------+------------------------------------------+------------+----------------+---------+----------------------------+----------+
| p_brand   | p_comment              | p_container  | p_mfgr          | p_name                                   | p_partkey  | p_retailprice  | p_size  | p_type                     | row_key  |
+-----------+------------------------+--------------+-----------------+------------------------------------------+------------+----------------+---------+----------------------------+----------+
| Brand#13  | ly. slyly ironi        | JUMBO PKG    | Manufacturer#1  | goldenrod lace spring peru powder        | 1          | 901.0          | 7       | PROMO BURNISHED COPPER     | 1        |
| Brand#13  | lar accounts amo       | LG CASE      | Manufacturer#1  | blush rosy metallic lemon navajo         | 2          | 902.0          | 1       | LARGE BRUSHED BRASS        | 2        |
{code}
was:
Repro steps
1) Place multiple JSON files that share the same schema in one directory
2) Add one or more empty files to the same directory
Scenarios
1) Only one minor fragment
{code}select * from dfs.`/json_dir`;{code}
Result:
{code}
+----------+------------+------------------------------------------+-----------------+-----------+----------------------------+---------+--------------+----------------+------------------------+
| row_key  | p_partkey  | p_name                                   | p_mfgr          | p_brand   | p_type                     | p_size  | p_container  | p_retailprice  | p_comment              |
+----------+------------+------------------------------------------+-----------------+-----------+----------------------------+---------+--------------+----------------+------------------------+
| 1        | 1          | goldenrod lace spring peru powder        | Manufacturer#1  | Brand#13  | PROMO BURNISHED COPPER     | 7       | JUMBO PKG    | 901.0          | ly. slyly ironi        |
| 2        | 2          | blush rosy metallic lemon navajo         | Manufacturer#1  | Brand#13  | LARGE BRUSHED BRASS        | 1       | LG CASE      | 902.0          | lar accounts amo       |
{code}
2) One minor fragment per file
{code}alter session set `planner.slice_target`=1;
select * from dfs.`/json_dir`;{code}
Result:
{code}
+-----------+------------------------+--------------+-----------------+------------------------------------------+------------+----------------+---------+----------------------------+----------+
| p_brand   | p_comment              | p_container  | p_mfgr          | p_name                                   | p_partkey  | p_retailprice  | p_size  | p_type                     | row_key  |
+-----------+------------------------+--------------+-----------------+------------------------------------------+------------+----------------+---------+----------------------------+----------+
| Brand#13  | ly. slyly ironi        | JUMBO PKG    | Manufacturer#1  | goldenrod lace spring peru powder        | 1          | 901.0          | 7       | PROMO BURNISHED COPPER     | 1        |
| Brand#13  | lar accounts amo       | LG CASE      | Manufacturer#1  | blush rosy metallic lemon navajo         | 2          | 902.0          | 1       | LARGE BRUSHED BRASS        | 2        |
| Brand#42  | egular deposits hag    | WRAP CASE    | Manufacturer#4  | dark green antique puff wheat            | 3          | 903.0          | 21      | STANDARD POLISHED BRASS    | 3        |
{code}
> Select * on directory containing multiple json files (one or more empty) with
> same schema doesn't preserve column order
> -----------------------------------------------------------------------------------------------------------------------
>
> Key: DRILL-5822
> URL: https://issues.apache.org/jira/browse/DRILL-5822
> Project: Apache Drill
> Issue Type: Bug
> Components: Storage - JSON
> Affects Versions: 1.11.0
> Reporter: Prasad Nagaraj Subramanya
> Fix For: 1.12.0
>
--
This message was sent by Atlassian JIRA
(v6.4.14#64029)