Csaba Ringhofer created HIVE-21931:
--------------------------------------

             Summary: Slow compaction for tiny tables
                 Key: HIVE-21931
                 URL: https://issues.apache.org/jira/browse/HIVE-21931
             Project: Hive
          Issue Type: Bug
    Affects Versions: 3.1.0
            Reporter: Csaba Ringhofer


I observed the issue in the Impala development environment when running major 
compactions on insert_only transactional tables in Hive. A compaction could take 
~10 minutes even when it only had to merge 2 rows from 2 inserts. The actual work 
was done much earlier: the new base file was correctly written to HDFS, and Hive 
then seemed to wait without doing any work.

The compactions are started manually; hive.compactor.initiator.on=false is set to 
avoid "surprise compactions" during tests. The compactor-related settings are:

{code}
hive.compactor.abortedtxn.threshold=1000
hive.compactor.check.interval=300s
hive.compactor.cleaner.run.interval=5000ms
hive.compactor.compact.insert.only=true
hive.compactor.crud.query.based=false
hive.compactor.delta.num.threshold=10
hive.compactor.delta.pct.threshold=0.1
hive.compactor.history.reaper.interval=2m
hive.compactor.history.retention.attempted=2
hive.compactor.history.retention.failed=3
hive.compactor.history.retention.succeeded=3
hive.compactor.initiator.failed.compacts.threshold=2
hive.compactor.initiator.on=false
hive.compactor.max.num.delta=500
hive.compactor.worker.threads=4
hive.compactor.worker.timeout=86400s
{code}
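
The scenario described above can be sketched roughly as follows. This is only an 
illustrative reproduction, not the exact statements used in the Impala tests: the 
table name tiny_insert_only, its single INT column, and the inserted values are 
hypothetical, and the default file format is assumed.

{code:sql}
-- Hypothetical insert-only transactional table (name and schema are assumptions).
CREATE TABLE tiny_insert_only (id INT)
TBLPROPERTIES ('transactional'='true',
               'transactional_properties'='insert_only');

-- Two separate inserts, each producing its own delta directory.
INSERT INTO tiny_insert_only VALUES (1);
INSERT INTO tiny_insert_only VALUES (2);

-- Request the major compaction manually, since the initiator is turned off.
ALTER TABLE tiny_insert_only COMPACT 'major';

-- Poll the compaction queue to see how long the request stays unfinished.
SHOW COMPACTIONS;
{code}

Comparing the state reported by SHOW COMPACTIONS with the time at which the new 
base appears on HDFS should show how much of the ~10 minutes is spent after the 
data has already been written.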



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)