[ https://issues.apache.org/jira/browse/HIVE-24162?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
ASF GitHub Bot updated HIVE-24162:
----------------------------------
Labels: pull-request-available (was: )
> Query based compaction loses bloom filter
> -----------------------------------------
>
> Key: HIVE-24162
> URL: https://issues.apache.org/jira/browse/HIVE-24162
> Project: Hive
> Issue Type: Bug
> Reporter: Peter Varga
> Assignee: Peter Varga
> Priority: Major
> Labels: pull-request-available
> Time Spent: 10m
> Remaining Estimate: 0h
>
> *Steps to reproduce:*
>
> {noformat}
> +----------------------------------------------------+
> | createtab_stmt |
> +----------------------------------------------------+
> | CREATE TABLE `bloomTest`( |
> | `msisdn` string, |
> | `imsi` varchar(20), |
> | `imei` bigint, |
> | `cell_id` bigint) |
> | ROW FORMAT SERDE |
> | 'org.apache.hadoop.hive.ql.io.orc.OrcSerde' |
> | STORED AS INPUTFORMAT |
> | 'org.apache.hadoop.hive.ql.io.orc.OrcInputFormat' |
> | OUTPUTFORMAT |
> | 'org.apache.hadoop.hive.ql.io.orc.OrcOutputFormat' |
> | LOCATION |
> |
> 's3a://dwxtpcds30-wwgq-dwx-managed/clusters/env-6cwwgq/warehouse-1580338415-7dph/warehouse/tablespace/managed/hive/del_db.db/bloomtest'
> |
> | TBLPROPERTIES ( |
> | 'bucketing_version'='2', |
> | 'orc.bloom.filter.columns'='msisdn,cell_id,imsi', |
> | 'orc.bloom.filter.fpp'='0.02', |
> | 'transactional'='true', |
> | 'transactional_properties'='default', |
> | 'transient_lastDdlTime'='1597222946') |
> +----------------------------------------------------+
> insert into bloomTest values ("a", "b", 10, 20);
> insert into bloomTest values ("aa", "bb", 100, 200);
> insert into bloomTest values ("aaa", "bbb", 1000, 2000);
> select * from bloomTest;
> +-------------------+-----------------+-----------------+--------------------+
> | bloomtest.msisdn | bloomtest.imsi | bloomtest.imei | bloomtest.cell_id |
> +-------------------+-----------------+-----------------+--------------------+
> | a | b | 10 | 20 |
> | aa | bb | 100 | 200 |
> | aaa | bbb | 1000 | 2000 |
> +-------------------+-----------------+-----------------+--------------------+
> {noformat}
> - Compact the table
> {code:java}
> alter table bloomTest compact 'MAJOR';
> {code}
> - Wait for the compaction to finish, then check the ORC files of the table for bloom filters.
>
> - The delta files contain the bloom filters, but the base files written by the compaction do not.
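>
> One possible way to carry out the last two steps is sketched below. It is not part of the original report and assumes a Hive version that supports {{SHOW COMPACTIONS}} and the {{INPUT__FILE__NAME}} virtual column. The first statement shows whether the compaction has finished; the second lists the files that currently back the table, which should sit under a {{base_...}} directory after a major compaction.
> {code:sql}
> -- Illustrative follow-up queries (assumed, not taken from the report).
> -- Re-run until the entry for bloomtest reports a finished state.
> SHOW COMPACTIONS;
> -- List the ORC files the table is currently read from.
> SELECT DISTINCT INPUT__FILE__NAME FROM bloomTest;
> {code}
> Each listed path can then be passed to the ORC file dump utility ({{hive --orcfiledump <path>}}). If bloom filters are present, the stream listing of the dump is expected to contain BLOOM_FILTER entries for the configured columns.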
--
This message was sent by Atlassian Jira
(v8.3.4#803005)