[
https://issues.apache.org/jira/browse/IMPALA-283?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16760780#comment-16760780
]
Tim Armstrong commented on IMPALA-283:
--------------------------------------
Here's another example from [~alanj_impala_5a78]:
We have an HBase-backed Impala table with a string column (for the
purposes of this Jira, {{sCol}}).
Some records have a null value in that column, which we can observe with queries
like {{select * from table where sCol is null limit 1}}.
However, when we run these commands, we get incorrect results:
{code:sql}
-- Returns 0
select count(*) from table where sCol is null;
-- Returns only rows for string values (we only have a few options in this
-- case), no row for null
select sCol, count(*) from table group by sCol;
{code}
These commands work as expected on Parquet-backed tables. They also fail
in Hive, for which I will file a Jira shortly.
> select count(*) produces inconsistent results for hbase tables because of
> NULL projection
> -----------------------------------------------------------------------------------------
>
> Key: IMPALA-283
> URL: https://issues.apache.org/jira/browse/IMPALA-283
> Project: IMPALA
> Issue Type: Bug
> Components: Frontend
> Affects Versions: Impala 0.7
> Reporter: Ishaan Joshi
> Priority: Minor
>
> When count(*) is combined with another aggregate in the same query, it
> produces a different result than count(*) alone.
> {code}
> [localhost:21000] > select count(*) from functional.alltypesagg;
> Query: select count(*) from functional.alltypesagg
> Query finished, fetching results ...
> +----------+
> | count(*) |
> +----------+
> | 10000    |
> +----------+
> Returned 1 row(s) in 0.26s
> [localhost:21000] > select count(*) from functional_hbase.alltypesagg;
> Query: select count(*) from functional_hbase.alltypesagg
> Query finished, fetching results ...
> +----------+
> | count(*) |
> +----------+
> | 10000    |
> +----------+
> Returned 1 row(s) in 0.65s
> [localhost:21000] > select count(*), count(int_col) from functional_hbase.alltypesagg;
> Query: select count(*), count(int_col) from functional_hbase.alltypesagg
> Query finished, fetching results ...
> +----------+----------------+
> | count(*) | count(int_col) |
> +----------+----------------+
> | 9990     | 9990           |
> +----------+----------------+
> Returned 1 row(s) in 0.91s
> [localhost:21000] > select count(*), count(int_col) from functional.alltypesagg;
> Query: select count(*), count(int_col) from functional.alltypesagg
> Query finished, fetching results ...
> +----------+----------------+
> | count(*) | count(int_col) |
> +----------+----------------+
> | 10000    | 9990           |
> +----------+----------------+
> Returned 1 row(s) in 0.26s
> {code}
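The behaviour above can be illustrated with a toy model. This is only a sketch of one plausible reading of the "NULL projection" in the issue title. It assumes that a NULL in an HBase-backed table is simply an absent cell, and that a scan restricted to one column skips rows with no cell in that column. The names {{ROWS}}, {{scan}}, {{count_star}} and {{count_nulls}} are hypothetical and are not Impala or HBase code.

```python
# Toy model of a sparse, HBase-like store: a NULL is a missing cell.
# Everything here is illustrative; none of it is Impala or HBase code.

ROWS = {
    "row1": {"sCol": "a", "other": "x"},
    "row2": {"other": "y"},  # sCol is NULL: the cell does not exist
    "row3": {"sCol": "b", "other": "z"},
}


def scan(rows, columns=None):
    """Yield (key, cells) pairs, restricted to `columns` when given.

    Rows that have no cell in any requested column are skipped, which
    mimics a column-restricted scan over a sparse store (an assumption).
    """
    for key, cells in rows.items():
        if columns is None:
            yield key, dict(cells)
            continue
        projected = {c: v for c, v in cells.items() if c in columns}
        if projected:
            yield key, projected


def count_star(rows, columns=None):
    """Count the rows the scan returns, like count(*)."""
    return sum(1 for _ in scan(rows, columns))


def count_nulls(rows, column, columns=None):
    """Count returned rows where `column` is absent, like `col is null`."""
    return sum(1 for _, cells in scan(rows, columns) if column not in cells)


if __name__ == "__main__":
    print(count_star(ROWS))                            # unrestricted scan
    print(count_star(ROWS, columns={"sCol"}))          # scan projecting sCol only
    print(count_nulls(ROWS, "sCol"))                   # unrestricted scan
    print(count_nulls(ROWS, "sCol", columns={"sCol"})) # scan projecting sCol only
```

In this model the unrestricted scan sees every row, while the scan that projects only {{sCol}} never returns the row whose {{sCol}} cell is missing, so both the row count and the null count shrink. That matches the shape of the results reported above, but it does not establish that this is the actual code path in Impala.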