Stefán Baxter created DRILL-3533:
------------------------------------

             Summary: null values in a sub-structure in Parquet returns 
unexpected/misleading results
                 Key: DRILL-3533
                 URL: https://issues.apache.org/jira/browse/DRILL-3533
             Project: Apache Drill
          Issue Type: Bug
          Components: Query Planning & Optimization
    Affects Versions: 1.1.0
            Reporter: Stefán Baxter
            Assignee: Jinfeng Ni
            Priority: Critical


With this minimal dataset as /tmp/test.json:
{"dimensions":{"adults":"A"}}
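
To reproduce, the dataset file can be created from a shell. This is only a
minimal sketch, assuming a Unix-like system and that the dfs.tmp workspace
points at /tmp (so that `/test.json` in the queries resolves to /tmp/test.json):

```shell
# Write the single-record dataset from this report to /tmp/test.json.
# Assumption: dfs.tmp is configured with /tmp as its location.
printf '%s\n' '{"dimensions":{"adults":"A"}}' > /tmp/test.json

# Show the file so the contents can be checked before querying it.
cat /tmp/test.json
```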

select lower(p.dimensions.budgetLevel) as `field1`, lower(p.dimensions.adults) 
as `field2` from dfs.tmp.`/test.json` as p;

Returns this:
+---------+---------+
| field1  | field2  |
+---------+---------+
| null    | a       |
+---------+---------+

With the same data written as a Parquet file:
CREATE TABLE dfs.tmp.`/test` AS SELECT * FROM dfs.tmp.`/test.json`;

The same query:
select lower(p.dimensions.budgetLevel) as `field1`, lower(p.dimensions.adults) 
as `field2` from dfs.tmp.`/test/0_0_0.parquet` as p;

Returns this:
+---------+---------+
| field1  | field2  |
+---------+---------+
| a       | null    |
+---------+---------+

After some more testing, it appears that this has nothing to do with the
function applied (trim, lower). Any non-existent nested value is pushed aside,
and the value of the existing field shows up in its column instead:

select p.dimensions.budgetLevel as `field1`, lower(p.dimensions.adults) as 
`field2` from dfs.tmp.`/test/0_0_0.parquet` as p;

also returns:
+---------+---------+
| field1  | field2  |
+---------+---------+
| a       | null    |
+---------+---------+




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
