[
https://issues.apache.org/jira/browse/HIVE-13377?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15216620#comment-15216620
]
Gabriel C Balan commented on HIVE-13377:
----------------------------------------
Gently pinging [~Ferd], [~dongc], [~spena], [~ashutoshc].
> Lost rows when using compact index on parquet table
> ---------------------------------------------------
>
> Key: HIVE-13377
> URL: https://issues.apache.org/jira/browse/HIVE-13377
> Project: Hive
> Issue Type: Bug
> Components: Indexing
> Affects Versions: 1.1.0
> Environment: linux, cdh 5.5.0
> Reporter: Gabriel C Balan
> Priority: Minor
>
> A query with a WHERE clause on a Parquet table loses rows when using a compact
> index. The same query produces the correct results without the index.
> {code}
> create table small_parq(i int) stored as parquet;
> insert into table small_parq values (1), (2), (3), (4), (5), (6), (7), (8),
> (9), (10), (11);
> set hive.optimize.index.filter=true;
> set hive.optimize.index.filter.compact.minsize=50;
> create index comp_idx on table small_parq (i) as 'compact' WITH DEFERRED
> REBUILD;
> alter index comp_idx on small_parq rebuild;
> select * from small_parq where i=3;
> --this correctly produces 1 row (value 3).
> select * from small_parq where i=11;
> --this incorrectly produces 0 rows.
> --I see correct results when looking for a row in [1,6];
> --I see bad results when looking for a row in [7,11].
> --All is well once I disable the compact index
> set hive.optimize.index.filter.compact.minsize=50000000;
> select * from small_parq where i=11;
> --now it correctly produces 1 row (value 11).
> {code}
> I can't seem to reproduce this issue when the base table is stored as ORC,
> SEQUENCEFILE, AVRO, or TEXTFILE.