[ 
https://issues.apache.org/jira/browse/HBASE-2231?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Enis Soztutar updated HBASE-2231:
---------------------------------

    Attachment: hbase-2231_v5.patch

I have rebased Stack's v4 patch, and made some changes. 

From my testing, it seems that IO fencing for the WAL works (tested with 
HBASE-7878, not HBASE-8389). [~stack], do you remember how you tested and 
concluded that it does not work? This is pretty important, so I can dedicate 
some more time to testing it. 

Some of the changes from patch v4 to v5 are: 
 - Renamed Compaction -> CompactionDescriptor
 - Compaction output can be a list of files, not a single file. 
 - Added compaction WAL edit replay. Now we recognize and replay the compaction 
edit on region open. This is needed because if we fail after we sync() the WAL, 
we might still try to delete the files. 
 - Added 2 more tests for the above condition. 
 - HLog does not know about the compaction WAL edit. 
 - The compaction WAL edit goes through the normal append code path. 

I've tried to capture the Fencing / Idempotency semantics in the following 
javadoc excerpt: 
    Compaction events should be idempotent, since there is no IO fencing for
    the region directory in HDFS. A region server might still try to complete
    the compaction after it lost the region. That is why the following events
    are carefully ordered for a compaction:
     1. Compaction writes new files under the region/.tmp directory
        (compaction output).
     2. Compaction atomically moves the temporary file under the region
        directory.
     3. Compaction appends a WAL edit containing the compaction input and
        output files, and forces a sync on the WAL.
     4. Compaction deletes the input files from the region directory.

    Failure conditions are handled like this:
     - If the RS fails before 2, the compaction won't complete. Even if the RS
       lives on and finishes the compaction later, it will only write the new
       data file to the region directory. Since we already have this data,
       this will be idempotent, but we will have a redundant copy of the data.
     - If the RS fails between 2 and 3, the region will have a redundant copy
       of the data. The RS that failed won't be able to finish sync() for the
       WAL because of lease recovery in the WAL.
     - If the RS fails after 3, the region server that opens the region will
       pick up the compaction marker from the WAL and replay it by removing
       the compaction input files. The failed RS can also attempt to delete
       those files, but the operation will be idempotent.
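
As a rough illustration of the ordering above, here is a minimal Java sketch
against the local filesystem. It is not the code from the patch:
CompactionSketch, Marker, compact() and replay() are hypothetical names,
java.nio.file stands in for HDFS, and a plain in-memory list stands in for the
WAL (so there is no real sync or lease recovery here).

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;
import java.util.List;

// Hypothetical sketch of the compaction ordering; not HBase code.
final class CompactionSketch {

    /** Hypothetical stand-in for the compaction WAL edit (input and output file names). */
    static final class Marker {
        final List<String> inputs;
        final List<String> outputs;

        Marker(List<String> inputs, List<String> outputs) {
            this.inputs = List.copyOf(inputs);
            this.outputs = List.copyOf(outputs);
        }
    }

    /** Runs steps 1-4 in order; 'wal' is an in-memory list standing in for the real WAL. */
    static void compact(Path regionDir, List<String> inputs, String output,
                        byte[] data, List<Marker> wal) throws IOException {
        // 1. Write the compaction output under region/.tmp.
        Path tmpDir = Files.createDirectories(regionDir.resolve(".tmp"));
        Path tmpFile = Files.write(tmpDir.resolve(output), data);

        // 2. Atomically move the temporary file into the region directory.
        Files.move(tmpFile, regionDir.resolve(output), StandardCopyOption.ATOMIC_MOVE);

        // 3. Append the compaction marker (a real WAL would force a sync here).
        Marker marker = new Marker(inputs, List.of(output));
        wal.add(marker);

        // 4. Delete the input files; this is the same action replay performs.
        replay(regionDir, marker);
    }

    /** Replays a marker on region open: removing inputs is safe to repeat. */
    static void replay(Path regionDir, Marker marker) throws IOException {
        for (String input : marker.inputs) {
            // deleteIfExists makes a second delete a no-op instead of an error.
            Files.deleteIfExists(regionDir.resolve(input));
        }
    }
}
```

Because replay() only removes files that still exist, running it again (from
the new region server, or from the old one that lost the region) leaves the
region directory in the same state.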
   
> Compaction events should be written to HLog
> -------------------------------------------
>
>                 Key: HBASE-2231
>                 URL: https://issues.apache.org/jira/browse/HBASE-2231
>             Project: HBase
>          Issue Type: Improvement
>          Components: regionserver
>            Reporter: Todd Lipcon
>            Assignee: stack
>            Priority: Blocker
>              Labels: moved_from_0_20_5
>             Fix For: 0.95.1
>
>         Attachments: 2231-testcase-0.94.txt, 2231-testcase_v2.txt, 
> 2231-testcase_v3.txt, 2231v2.txt, 2231v3.txt, 2231v4.txt, 
> hbase-2231-testcase.txt, hbase-2231.txt, hbase-2231_v5.patch
>
>
> The sequence for a compaction should look like this:
> # Compact region to "new" files
> # Write a "Compacted Region" entry to the HLog
> # Delete "old" files
> This deals with a case where the RS has paused between step 1 and 2 and the 
> regions have since been reassigned.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
