[ 
https://issues.apache.org/jira/browse/DRILL-3533?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14637178#comment-14637178
 ] 

ASF GitHub Bot commented on DRILL-3533:
---------------------------------------

GitHub user jinfengni opened a pull request:

    https://github.com/apache/drill/pull/97

    DRILL-3533: Fix incorrect query result when querying parquet for fiel…

    …ds that do not exists.

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/jinfengni/incubator-drill DRILL-3533

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/drill/pull/97.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #97
    
----
commit 6bd5de9993d30e06341038c971a02287ddf06827
Author: Jinfeng Ni <[email protected]>
Date:   2015-07-22T05:00:40Z

    DRILL-3533: Fix incorrect query result when querying parquet for fields
    that do not exists.

----


> null values in a sub-structure in Parquet returns unexpected/misleading 
> results
> -------------------------------------------------------------------------------
>
>                 Key: DRILL-3533
>                 URL: https://issues.apache.org/jira/browse/DRILL-3533
>             Project: Apache Drill
>          Issue Type: Bug
>          Components: Query Planning & Optimization
>    Affects Versions: 1.1.0
>            Reporter: Stefán Baxter
>            Assignee: Jinfeng Ni
>            Priority: Critical
>
> With this minimal dataset as /tmp/test.json:
> {"dimensions":{"adults":"A"}}
> select lower(p.dimensions.budgetLevel) as `field1`, 
> lower(p.dimensions.adults) as `field2` from dfs.tmp.`/test.json` as p;
> Returns this:
> +---------+---------+
> | field1  | field2  |
> +---------+---------+
> | null    | a       |
> +---------+---------+
> With the same data written as a Parquet file:
> CREATE TABLE dfs.tmp.`/test` AS SELECT * FROM dfs.tmp.`/test.json`;
> The same query:
> select lower(p.dimensions.budgetLevel) as `field1`, 
> lower(p.dimensions.adults) as `field2` from dfs.tmp.`/test/0_0_0.parquet` as 
> p;
> Returns this:
> +---------+---------+
> | field1  | field2  |
> +---------+---------+
> | a       | null    |
> +---------+---------+
> After some more testing, it appears that this has nothing to do with trim
> (any non-existent nested value will be pushed aside):
> select p.dimensions.budgetLevel as `field1`, lower(p.dimensions.adults) as 
> `field2` from dfs.tmp.`/test/0_0_0.parquet` as p;
> also returns:
> +---------+---------+
> | field1  | field2  |
> +---------+---------+
> | a       | null    |
> +---------+---------+
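
For reference, the result the report treats as correct (the JSON case) can be modelled with a small sketch. This is not Drill code and says nothing about how the Parquet reader or the fix in the pull request works; the helper names `get_path`, `lower_or_null`, and `project` are hypothetical and exist only to illustrate the expected semantics: a nested field that does not exist projects to null in its own column, and the existing field keeps its own column.

```python
from typing import Any, Optional


def get_path(record: dict, path: str) -> Optional[Any]:
    """Walk a dotted path; return None if any segment is missing."""
    node: Any = record
    for key in path.split("."):
        if not isinstance(node, dict) or key not in node:
            return None
        node = node[key]
    return node


def lower_or_null(value: Optional[Any]) -> Optional[str]:
    """Model SQL lower(): null in, null out."""
    return value.lower() if isinstance(value, str) else None


def project(record: dict) -> dict:
    """Model the two-column projection from the report."""
    return {
        "field1": lower_or_null(get_path(record, "dimensions.budgetLevel")),
        "field2": lower_or_null(get_path(record, "dimensions.adults")),
    }


# The minimal dataset from the report.
row = {"dimensions": {"adults": "A"}}
result = project(row)
```

With the dataset from the report, `result` holds null for `field1` and `a` for `field2`, which matches the JSON output above and is the opposite of what the Parquet query returned.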



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
