[jira] [Updated] (DRILL-6171) Querying avro files returns null for not null values

Engin Sozer (JIRA) Tue, 20 Feb 2018 06:35:47 -0800

     [ 
https://issues.apache.org/jira/browse/DRILL-6171?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]


Engin Sozer updated DRILL-6171:
-------------------------------
    Description: 
Querying an Avro file returns incorrect results. (Observed with the JDBC/ODBC
drivers from MapR, but I believe the standard Drill drivers behave the same
way.) For example:

We have a file (test.avro) with columns col1, col2, col3 and col4 and a total
of 5000 rows. In the first 4000 rows, col3 is null. In the last 1000 rows, col3
is not null. The issue is that when we run either of these queries:
{code:sql}
select * from dfs.tmp.`test.avro`;
select col1, col2, col3 from dfs.tmp.`test.avro`;
{code}
col3 is returned as null for all 5000 rows. If we instead run:
{code:sql}
select col1, col2, COALESCE(col3, null) from dfs.tmp.`test.avro`;
{code}
the results are correct: col3 is null for the first 4000 rows and not null
for the last 1000.
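
For anyone trying to reproduce this, the data layout described above can be sketched roughly as follows. This is only an illustration: the nullable string column types, the placeholder values, the record name and the helper build_rows are assumptions and are not taken from the actual file. Writing the records out as test.avro would need an Avro library, which is deliberately left out here.

```python
import json

# Assumed schema: the report does not give column types, so every column is
# modelled as a nullable string (an Avro union of "null" and "string").
SCHEMA = {
    "type": "record",
    "name": "TestRecord",  # hypothetical record name
    "fields": [
        {"name": name, "type": ["null", "string"], "default": None}
        for name in ("col1", "col2", "col3", "col4")
    ],
}

TOTAL_ROWS = 5000
NULL_ROWS = 4000  # leading rows in which col3 is null


def build_rows():
    """Return 5000 records; col3 is None in the first 4000 and set in the rest."""
    rows = []
    for i in range(TOTAL_ROWS):
        rows.append({
            "col1": f"a{i}",  # placeholder values, not from the real file
            "col2": f"b{i}",
            "col3": None if i < NULL_ROWS else f"c{i}",
            "col4": f"d{i}",
        })
    return rows


if __name__ == "__main__":
    rows = build_rows()
    non_null = sum(1 for r in rows if r["col3"] is not None)
    print(json.dumps(SCHEMA, indent=2))
    print(f"{len(rows)} rows, {non_null} with non-null col3")
```

With data shaped like this, a correct read should report 1000 non-null col3 values, whereas the behaviour described above would report none.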

  was:
Querying an Avro file with a select * statement returns incorrect results.
For example:

We have a file (test.avro) with columns col1, col2, col3 and col4 and a total
of 5000 rows. In the first 4000 rows, col3 is null. In the last 1000 rows, col3
is not null. The issue is that when we run either of these queries:
{code:sql}
select * from dfs.tmp.`test.avro`;
select col1, col2, col3 from dfs.tmp.`test.avro`;
{code}
col3 is returned as null for all 5000 rows. If we instead run:
{code:sql}
select col1, col2, COALESCE(col3, null) from dfs.tmp.`test.avro`;
{code}
the results are correct: col3 is null for the first 4000 rows and not null
for the last 1000.

    Component/s: Client - JDBC

> Querying avro files returns null for not null values
> ----------------------------------------------------
>
>                 Key: DRILL-6171
>                 URL: https://issues.apache.org/jira/browse/DRILL-6171
>             Project: Apache Drill
>          Issue Type: Bug
>          Components: Client - JDBC, Storage - Avro
>    Affects Versions: 1.10.0
>            Reporter: Engin Sozer
>            Priority: Major
>
> Querying an Avro file returns incorrect results. (Observed with the JDBC/ODBC
> drivers from MapR, but I believe the standard Drill drivers behave the same
> way.) For example:
> We have a file (test.avro) with columns col1, col2, col3 and col4 and a
> total of 5000 rows. In the first 4000 rows, col3 is null. In the last 1000
> rows, col3 is not null. The issue is that when we run either of these queries:
> {code:sql}
> select * from dfs.tmp.`test.avro`;
> select col1, col2, col3 from dfs.tmp.`test.avro`;
> {code}
> col3 is returned as null for all 5000 rows. If we instead run:
> {code:sql}
> select col1, col2, COALESCE(col3, null) from dfs.tmp.`test.avro`;
> {code}
> the results are correct: col3 is null for the first 4000 rows and not null
> for the last 1000.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
