[ 
https://issues.apache.org/jira/browse/HBASE-2729?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12881104#action_12881104
 ] 

HBase Review Board commented on HBASE-2729:
-------------------------------------------

Message from: "Todd Lipcon" <[email protected]>

-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
http://review.hbase.org/r/216/
-----------------------------------------------------------

Review request for hbase, stack and Ryan Rawson.


Summary
-------

Fixes bugs where an exception in the middle of flushing a file leaves a 
half-written StoreFile in the region dir, preventing that region from 
recovering, or, in the case of transient errors, causing silent loss of half a 
file's worth of data.

I also got rid of the compaction dir here, and am just using one region-wide 
tmp dir. Is there some reason this is a bad idea?
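
For illustration only, here is a minimal sketch of the write-to-tmp-then-move 
pattern described above. It is not code from the patch: it uses plain 
java.nio.file rather than the Hadoop FileSystem API that Store.java works with, 
and the class name, method name, and the null-entry failure trigger are all 
hypothetical.

```java
import java.io.IOException;
import java.io.OutputStream;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;
import java.util.Arrays;
import java.util.List;

public class TmpThenMoveFlush {

    /**
     * Writes all entries to tmpDir/name, closes the file, and only then
     * renames it into storeDir. If an IOException occurs while writing,
     * the partial tmp file is deleted and nothing appears in storeDir.
     */
    static Path flush(Path tmpDir, Path storeDir, String name, List<String> entries)
            throws IOException {
        Files.createDirectories(tmpDir);
        Files.createDirectories(storeDir);
        Path tmpFile = tmpDir.resolve(name);
        try {
            try (OutputStream out = Files.newOutputStream(tmpFile)) {
                for (String entry : entries) {
                    if (entry == null) {
                        // Hypothetical stand-in for an I/O error in the middle of a flush.
                        throw new IOException("simulated failure mid-flush");
                    }
                    out.write((entry + "\n").getBytes(StandardCharsets.UTF_8));
                }
            }
        } catch (IOException ioe) {
            // Never leave a half-written file behind, not even in the tmp dir.
            Files.deleteIfExists(tmpFile);
            throw ioe;
        }
        // Reached only when the file was fully written and closed.
        return Files.move(tmpFile, storeDir.resolve(name), StandardCopyOption.ATOMIC_MOVE);
    }

    public static void main(String[] args) throws IOException {
        Path root = Files.createTempDirectory("flush-demo");
        Path tmpDir = root.resolve(".tmp");
        Path storeDir = root.resolve("family");

        Path good = flush(tmpDir, storeDir, "file1", Arrays.asList("row1", "row2"));
        System.out.println("flushed to store dir: " + Files.exists(good));

        try {
            flush(tmpDir, storeDir, "file2", Arrays.asList("row3", null, "row4"));
        } catch (IOException expected) {
            System.out.println("failed flush left a store file: "
                    + Files.exists(storeDir.resolve("file2")));
        }
    }
}
```

The point of the sketch is that readers of the store dir only ever see files 
that were completely written, so a failed flush leaves the edits to be 
recovered from the log instead of hiding them behind a truncated file.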


This addresses bug HBASE-2729.
    http://issues.apache.org/jira/browse/HBASE-2729


Diffs
-----

  src/main/java/org/apache/hadoop/hbase/regionserver/HRegion.java 1794df8 
  src/main/java/org/apache/hadoop/hbase/regionserver/Store.java 04b7522 
  src/test/java/org/apache/hadoop/hbase/regionserver/TestCompaction.java 9e5ca46 
  src/test/java/org/apache/hadoop/hbase/regionserver/TestStore.java a65e947 

Diff: http://review.hbase.org/r/216/diff


Testing
-------

Ran TestCompaction and TestStore. Will start a cluster test running before I go 
to bed.


Thanks,

Todd




> flushCache should write to a tmp directory and then move into the store 
> directory
> ---------------------------------------------------------------------------------
>
>                 Key: HBASE-2729
>                 URL: https://issues.apache.org/jira/browse/HBASE-2729
>             Project: HBase
>          Issue Type: Bug
>          Components: regionserver
>    Affects Versions: 0.21.0
>            Reporter: Todd Lipcon
>            Assignee: stack
>            Priority: Blocker
>
> Currently it appears that internalFlushCache writes directly to the target 
> location of the flushed data. If an exception occurs, the finally block still 
> appends the metadata and closes the file as if nothing had gone wrong. This is 
> really bad, since it means that an IOE in the middle of flushing the cache 
> could easily write a valid-looking file with only half the data, which would 
> then prevent us from recovering those edits during log replay.
> Instead, it should flush to a tmp location and move the file into the region 
> dir only after it has been successfully written.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
