[ 
https://issues.apache.org/jira/browse/IMPALA-283?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16760780#comment-16760780
 ] 

Tim Armstrong commented on IMPALA-283:
--------------------------------------

Here's another example from [~alanj_impala_5a78]:

We have an HBase-backed Impala table with a string column (for the purposes 
of this JIRA, {{sCol}}).

There are records where that column is null, which we can observe with queries 
like {{select * from table where sCol is null limit 1}}.

However, the following queries return incorrect results:
{code:sql}
-- Returns 0
select count(*) from table where sCol is null;
-- Returns only rows for string values (we only have a few options in this
-- case), no row for null
select sCol, count(*) from table group by sCol;
{code}

These queries work as expected on Parquet-backed tables. They also return 
incorrect results in Hive, for which I will file a JIRA shortly.
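
For illustration, here is a toy Python model of how this could happen. It is only a sketch, not Impala or HBase code. It assumes that a null value is an absent HBase cell, and that a scan restricted to certain columns skips any row with no cell in those columns (one plausible reading of the "NULL projection" in the issue title). The {{scan}} helper, the row keys, and the data are all hypothetical.

```python
# Toy model only: not Impala or HBase code. A NULL column value is modeled
# as a missing cell, on the assumption that HBase stores no cell for it.
from collections import Counter


def scan(table, columns=None):
    """Yield (row_key, cells) pairs, mimicking a column-projected scan.

    With a projection, rows that have no cell in any requested column are
    skipped entirely (the assumed cause of the wrong counts).
    """
    for row_key, cells in table.items():
        if columns is None:
            yield row_key, dict(cells)
            continue
        projected = {c: cells[c] for c in columns if c in cells}
        if projected:
            yield row_key, projected


# Hypothetical data: 10 rows, one of which has no cell for sCol (NULL).
table = {f"row{i}": {"id": i, "sCol": "a" if i % 2 else "b"} for i in range(10)}
del table["row3"]["sCol"]

# select * ... where sCol is null: the unprojected scan still sees the row.
null_rows_full = [k for k, cells in scan(table) if "sCol" not in cells]

# select count(*) ... where sCol is null: only sCol is projected, so the
# row with the missing cell is never returned and cannot be counted.
null_count_projected = sum(
    1 for _, cells in scan(table, ["sCol"]) if "sCol" not in cells
)

# select sCol, count(*) ... group by sCol: no group for NULL appears.
groups = Counter(cells.get("sCol") for _, cells in scan(table, ["sCol"]))
```

In this model the unprojected scan finds the null row, while the projected scans never see it. That matches the symptoms above: a null count of 0 and no null group.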
 

> select count(*) produces inconsistent results for hbase tables because of 
> NULL projection
> -----------------------------------------------------------------------------------------
>
>                 Key: IMPALA-283
>                 URL: https://issues.apache.org/jira/browse/IMPALA-283
>             Project: IMPALA
>          Issue Type: Bug
>          Components: Frontend
>    Affects Versions: Impala 0.7
>            Reporter: Ishaan Joshi
>            Priority: Minor
>
> When count(*) is combined with another aggregate column, it produces 
> different results.
> {code}
> [localhost:21000] > select count(*) from functional.alltypesagg;
> Query: select count(*) from functional.alltypesagg
> Query finished, fetching results ...
> +----------+
> | count(*) |
> +----------+
> | 10000    |
> +----------+
> Returned 1 row(s) in 0.26s
> [localhost:21000] > select count(*) from functional_hbase.alltypesagg;
> Query: select count(*) from functional_hbase.alltypesagg
> Query finished, fetching results ...
> +----------+
> | count(*) |
> +----------+
> | 10000    |
> +----------+
> Returned 1 row(s) in 0.65s
> [localhost:21000] > select count(*), count(int_col) from 
> functional_hbase.alltypesagg;
> Query: select count(*), count(int_col) from functional_hbase.alltypesagg
> Query finished, fetching results ...
> +----------+----------------+
> | count(*) | count(int_col) |
> +----------+----------------+
> | 9990     | 9990           |
> +----------+----------------+
> Returned 1 row(s) in 0.91s
> [localhost:21000] > select count(*), count(int_col) from 
> functional.alltypesagg;
> Query: select count(*), count(int_col) from functional.alltypesagg
> Query finished, fetching results ...
> +----------+----------------+
> | count(*) | count(int_col) |
> +----------+----------------+
> | 10000    | 9990           |
> +----------+----------------+
> Returned 1 row(s) in 0.26s
> {code}



