[ 
https://issues.apache.org/jira/browse/HBASE-9648?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13777081#comment-13777081
 ] 

Sergey Shelukhin commented on HBASE-9648:
-----------------------------------------

If you look at the code now (at least in 94, where the archeological record
survives), it has provisions for creating the writer for compaction output
lazily (there is even a comment about it, iirc), so that if there's no data,
there's no file. But further down it says we need to create the writer
anyway, because blah-blah, see that jira.
So in the original problem from the thread, when selection picks one expired
file to remove, this code says "oh no, I need to create the empty file
anyway", and creates it. The expired-file-selection code then picks the new
file, as a single file, to remove, which creates another empty file, and so
on forever. Expired file compaction is supposed to be fast, so it pre-empts
any real compaction.
What I'm saying is that this empty-file creation might not always be
necessary, judging by that jira. All we want to avoid is losing the
latest-used seqNum in the store, which means we can have a no-output
compaction as long as we are not dropping the file with the last seqNum. If
that were done, the original problem from the thread would go away: deleting
some random expired file would just nuke it. It would also be generally
useful not to create these blank files as often.
The only edge case remaining is when the expired file is the file with the
last seqNum (I incorrectly said above that it is when it's the only file), in
which case the expired-file selection should not pick it.
                
> collection one expired storefile causes it to be replaced by another expired 
> storefile
> --------------------------------------------------------------------------------------
>
>                 Key: HBASE-9648
>                 URL: https://issues.apache.org/jira/browse/HBASE-9648
>             Project: HBase
>          Issue Type: Bug
>          Components: Compaction
>            Reporter: Sergey Shelukhin
>            Assignee: Jean-Marc Spaggiari
>         Attachments: HBASE-9648-v0-0.94.patch, HBASE-9648-v0-trunk.patch
>
>
> There's a shortcut in compaction selection that causes the selection of 
> expired store files to quickly delete.
> However, there's also the code that ensures we write at least one file to 
> preserve seqnum. This new empty file is "expired", because it has no data, 
> presumably.
> So it's collected again, etc.
> This affects 94, probably also 96.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
