[ 
https://issues.apache.org/jira/browse/HBASE-2729?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12881437#action_12881437
 ] 

HBase Review Board commented on HBASE-2729:
-------------------------------------------

Message from: "Todd Lipcon" <[email protected]>

-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
http://review.hbase.org/r/216/
-----------------------------------------------------------

(Updated 2010-06-22 15:55:16.514925)


Review request for hbase, stack and Ryan Rawson.


Changes
-------

TestMasterTransitions exposed a clunker of a bug here. In the previous 
iteration, I had put the tmp directory in /hbase/table/.tmp instead of 
/hbase/table/region/.tmp - I thought region.basedir was the region dir, not the 
table dir. So, when a new region was opened, it removed the tmp files being 
written by other region servers and made the world explode.

This iteration renames those variables to make the distinction clearer, and 
fixes the bug so the tmp dir is within the region, not the table.
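
To illustrate why the location matters, here is a minimal sketch of a 
region-scoped tmp dir. It uses java.nio.file on a local filesystem rather than 
HBase's FileSystem API, and the names (RegionTmpDirs, regionTmpDir, 
cleanTmpOnOpen) are made up for this example; they are not the names used in 
the patch.

```java
import java.io.IOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;

class RegionTmpDirs {
    // Hypothetical helper: tmp dir scoped to a single region,
    // i.e. <tableDir>/<regionName>/.tmp rather than <tableDir>/.tmp.
    static Path regionTmpDir(Path tableDir, String regionName) {
        return tableDir.resolve(regionName).resolve(".tmp");
    }

    // Hypothetical helper: when a region opens, delete leftover files
    // in that region's own tmp dir and nothing else.
    static void cleanTmpOnOpen(Path tableDir, String regionName) throws IOException {
        Path tmp = regionTmpDir(tableDir, regionName);
        if (!Files.isDirectory(tmp)) {
            return; // nothing to clean
        }
        try (DirectoryStream<Path> leftovers = Files.newDirectoryStream(tmp)) {
            for (Path p : leftovers) {
                Files.deleteIfExists(p);
            }
        }
    }
}
```

With a table-level tmp dir, every region of the table would resolve to the 
same directory, so the cleanup for one region would also delete files that 
another region is still writing.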


Summary
-------

Fixes bugs where an exception in the middle of flushing a file leaves a 
half-written StoreFile in the region dir, preventing that region from 
recovering, or, in the case of transient errors, causing silent loss of half a 
file's worth of data.
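
The general pattern can be sketched as follows. This is only an illustration 
under stated assumptions: it uses java.nio.file on a local filesystem instead 
of HBase's FileSystem and StoreFile classes, and TmpThenMoveFlush and flush 
are hypothetical names, not code from the patch.

```java
import java.io.IOException;
import java.io.OutputStream;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;
import java.util.List;

class TmpThenMoveFlush {
    // Hypothetical flush: write everything to <regionDir>/.tmp/<name> first,
    // and move the file into storeDir only after the write fully succeeds.
    static Path flush(Path regionDir, Path storeDir, String name, List<String> entries)
            throws IOException {
        Path tmpDir = regionDir.resolve(".tmp");
        Files.createDirectories(tmpDir);
        Files.createDirectories(storeDir);
        Path tmpFile = tmpDir.resolve(name);

        try (OutputStream out = Files.newOutputStream(tmpFile)) {
            for (String e : entries) {
                out.write(e.getBytes(StandardCharsets.UTF_8));
                out.write('\n');
            }
        } catch (IOException | RuntimeException ex) {
            // Any failure mid-write: drop the partial file and propagate.
            // The store dir never sees a half-written file.
            Files.deleteIfExists(tmpFile);
            throw ex;
        }

        // Only a completely written file becomes visible in the store dir.
        return Files.move(tmpFile, storeDir.resolve(name), StandardCopyOption.ATOMIC_MOVE);
    }
}
```

The point of the design is that readers of the store dir see either the whole 
file or no file, so a failed flush leaves the edits to be recovered from the 
log instead of being masked by a valid-looking partial file.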

I also got rid of the compaction dir here, and am just using one region-wide 
tmp dir. Is there some reason this is a bad idea?


This addresses bug HBASE-2729.
    http://issues.apache.org/jira/browse/HBASE-2729


Diffs (updated)
-----

  src/main/java/org/apache/hadoop/hbase/regionserver/HRegion.java 1794df8 
  src/main/java/org/apache/hadoop/hbase/regionserver/Store.java 04b7522 
  src/test/java/org/apache/hadoop/hbase/HBaseTestCase.java dc38b3b 
  src/test/java/org/apache/hadoop/hbase/regionserver/TestCompaction.java 9e5ca46 
  src/test/java/org/apache/hadoop/hbase/regionserver/TestStore.java a65e947 

Diff: http://review.hbase.org/r/216/diff


Testing
-------

Ran TestCompaction and TestStore. Will start a cluster test running before I go 
to bed.


Thanks,

Todd




> flushCache should write to a tmp directory and then move into the store 
> directory
> ---------------------------------------------------------------------------------
>
>                 Key: HBASE-2729
>                 URL: https://issues.apache.org/jira/browse/HBASE-2729
>             Project: HBase
>          Issue Type: Bug
>          Components: regionserver
>    Affects Versions: 0.21.0
>            Reporter: Todd Lipcon
>            Assignee: stack
>            Priority: Blocker
>
> Currently it appears that internalFlushCache writes directly to the target 
> location of the flushed data. If an exception is thrown, the finally block 
> still appends the metadata and closes the file as if nothing had gone wrong. 
> This is really bad, since it means that an IOE in the middle of flushing the 
> cache could easily write a valid-looking file with only half the data, which 
> would then prevent us from recovering those edits during log replay.
> Instead, it should flush to a tmp location and move it into the region dir 
> only after it's successfully written.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
