[ https://issues.apache.org/jira/browse/HIVE-24302?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Karen Coppage reassigned HIVE-24302:
------------------------------------

> Cleaner should not mark compaction queue entry as cleaned if it doesn't
> remove obsolete files
> ---------------------------------------------------------------------------------------------
>
>                 Key: HIVE-24302
>                 URL: https://issues.apache.org/jira/browse/HIVE-24302
>             Project: Hive
>          Issue Type: Bug
>            Reporter: Karen Coppage
>            Assignee: Karen Coppage
>            Priority: Major
>
> Example:
> # open txn 5, leave it open (maybe it's a long-running compaction)
> # insert into table t in txns 6, 7 with writeIds 1, 2
> # compactor.Worker runs on table t and compacts writeIds 1, 2
> # compactor.Cleaner picks up the compaction queue entry, but doesn't delete
> any files because the min global open txnid is 5, which cannot see writeIds
> 1, 2.
> # Cleaner marks the compaction queue entry as cleaned and removes the entry
> from the queue.
> delta_1 and delta_2 will remain in the file system until another compaction
> is run on table t.
> Step 5 should not happen: we should skip calling markCleaned() and leave the
> entry in the queue in the "ready to clean" state. markCleaned() should be
> called only after txn 5 is closed and, following that, the Cleaner runs
> successfully.
> This may slow down the Cleaner, but on the other hand it won't silently
> "fail", i.e. not do its job.
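
The proposed behaviour can be illustrated with a small model. This is only a sketch in Python, not Hive code: `State`, `Delta`, `QueueEntry` and `clean` are hypothetical names, and the visibility rule (a delta may be removed only if no open txn is older than the txn that wrote it) is a simplifying assumption standing in for the real check against the min global open txnid.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class State(Enum):
    READY_TO_CLEAN = "ready to clean"
    CLEANED = "cleaned"


@dataclass
class Delta:
    name: str
    txn_id: int  # transaction that wrote this delta


@dataclass
class QueueEntry:
    table: str
    obsolete: list  # deltas made obsolete by the compaction
    state: State = State.READY_TO_CLEAN


def clean(entry: QueueEntry, min_open_txn: Optional[int]) -> list:
    """Run one Cleaner pass and return the names of the deltas removed."""
    # Simplifying assumption: a delta is removable only if no open txn
    # is older than the txn that wrote it.
    removable = [d for d in entry.obsolete
                 if min_open_txn is None or d.txn_id < min_open_txn]
    entry.obsolete = [d for d in entry.obsolete if d not in removable]
    # Mark the entry as cleaned only once nothing obsolete is left;
    # otherwise it stays "ready to clean" and is retried on a later pass.
    if not entry.obsolete:
        entry.state = State.CLEANED
    return [d.name for d in removable]


entry = QueueEntry("t", [Delta("delta_1", 6), Delta("delta_2", 7)])
print(clean(entry, min_open_txn=5), entry.state.value)     # txn 5 still open
print(clean(entry, min_open_txn=None), entry.state.value)  # txn 5 closed
# prints:
# [] ready to clean
# ['delta_1', 'delta_2'] cleaned
```

While txn 5 is open, the first pass removes nothing and the entry stays in the queue. Once txn 5 is closed, the next pass removes delta_1 and delta_2, and only then is the entry marked as cleaned.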