Csaba Ringhofer created HIVE-21931:
--------------------------------------

             Summary: Slow compaction for tiny tables
                 Key: HIVE-21931
                 URL: https://issues.apache.org/jira/browse/HIVE-21931
             Project: Hive
          Issue Type: Bug
    Affects Versions: 3.1.0
            Reporter: Csaba Ringhofer

I observed the issue in an Impala development environment when running (major) compaction on insert_only transactional tables in Hive. The compaction could take ~10 minutes even when it only had to merge 2 rows from 2 inserts. The actual work finished much earlier: the new base file was already correctly written to HDFS, and Hive then seemed to wait without doing any work.

The compactions are started manually; hive.compactor.initiator.on=false is set to avoid "surprise compactions" during tests.

{code}
hive.compactor.abortedtxn.threshold=1000
hive.compactor.check.interval=300s
hive.compactor.cleaner.run.interval=5000ms
hive.compactor.compact.insert.only=true
hive.compactor.crud.query.based=false
hive.compactor.delta.num.threshold=10
hive.compactor.delta.pct.threshold=0.1
hive.compactor.history.reaper.interval=2m
hive.compactor.history.retention.attempted=2
hive.compactor.history.retention.failed=3
hive.compactor.history.retention.succeeded=3
hive.compactor.initiator.failed.compacts.threshold=2
hive.compactor.initiator.on=false
hive.compactor.max.num.delta=500
hive.compactor.worker.threads=4
hive.compactor.worker.timeout=86400s
{code}

--
This message was sent by Atlassian JIRA
(v7.6.3#76005)