[ https://issues.apache.org/jira/browse/HBASE-2729?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12881437#action_12881437 ]
HBase Review Board commented on HBASE-2729:
-------------------------------------------
Message from: "Todd Lipcon" <[email protected]>
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
http://review.hbase.org/r/216/
-----------------------------------------------------------
(Updated 2010-06-22 15:55:16.514925)
Review request for hbase, stack and Ryan Rawson.
Changes
-------
TestMasterTransitions exposed a clunker of a bug here. In the previous
iteration, I had put the tmp directory in /hbase/table/.tmp instead of
/hbase/table/region/.tmp - I thought region.basedir was the region dir, not the
table dir. So, when a new region was opened, it removed the tmp files being
written by other region servers and made the world explode.
This iteration renames those variables to be clearer, and fixes the bug so
the tmp dir is within the region, not the table.
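To make the failure mode concrete, here is a minimal sketch of the two layouts. It is not the HBase code: it uses java.nio.file on a local filesystem as a stand-in for the Hadoop FileSystem API, and the class and method names (TmpDirLayoutSketch, tableScopedTmp, regionScopedTmp, cleanTmpOnOpen) are hypothetical. It assumes, as described above, that opening a region clears out whatever it finds in its tmp dir.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

// Illustrative only: hypothetical names, local filesystem instead of HDFS.
final class TmpDirLayoutSketch {

    /** Buggy layout: the region name is ignored, so every region of the table shares one tmp dir. */
    static Path tableScopedTmp(Path tableDir, String regionName) {
        return tableDir.resolve(".tmp");
    }

    /** Fixed layout: each region owns a tmp dir inside its own region dir. */
    static Path regionScopedTmp(Path tableDir, String regionName) {
        return tableDir.resolve(regionName).resolve(".tmp");
    }

    /** Stand-in for the cleanup done when a region opens: delete every file in its tmp dir. */
    static void cleanTmpOnOpen(Path tmpDir) throws IOException {
        if (!Files.isDirectory(tmpDir)) {
            return;
        }
        List<Path> leftovers;
        try (Stream<Path> files = Files.list(tmpDir)) {
            // Collect first so we are not deleting while the directory stream is open.
            leftovers = files.collect(Collectors.toList());
        }
        for (Path f : leftovers) {
            Files.delete(f);
        }
    }
}
```

With tableScopedTmp, two regions of the same table resolve to the same directory, so one region's cleanup deletes the other's in-flight file. With regionScopedTmp the directories are disjoint and the cleanup only touches the opening region's own leftovers.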
Summary
-------
Fixes bugs where an exception in the middle of flushing a file leaves a
half-written StoreFile in the region dir, preventing that region from
recovering, or, in the case of transient errors, causing silent loss of half a
file's worth of data.
I also got rid of the compaction dir here, and am just using one region-wide
tmp dir. Is there some reason this is a bad idea?
This addresses bug HBASE-2729.
http://issues.apache.org/jira/browse/HBASE-2729
Diffs (updated)
-----
src/main/java/org/apache/hadoop/hbase/regionserver/HRegion.java 1794df8
src/main/java/org/apache/hadoop/hbase/regionserver/Store.java 04b7522
src/test/java/org/apache/hadoop/hbase/HBaseTestCase.java dc38b3b
src/test/java/org/apache/hadoop/hbase/regionserver/TestCompaction.java 9e5ca46
src/test/java/org/apache/hadoop/hbase/regionserver/TestStore.java a65e947
Diff: http://review.hbase.org/r/216/diff
Testing
-------
Ran TestCompaction and TestStore. Will start a cluster test running before I go
to bed.
Thanks,
Todd
> flushCache should write to a tmp directory and then move into the store
> directory
> ---------------------------------------------------------------------------------
>
> Key: HBASE-2729
> URL: https://issues.apache.org/jira/browse/HBASE-2729
> Project: HBase
> Issue Type: Bug
> Components: regionserver
> Affects Versions: 0.21.0
> Reporter: Todd Lipcon
> Assignee: stack
> Priority: Blocker
>
> Currently it appears that internalFlushCache writes directly to the target
> spot of the flushed data. The finally block appends the metadata and closes
> the file as if nothing went wrong in case of an exception. This is really
> bad, since it means that an IOE in the middle of flushing cache could easily
> write a valid-looking file with only half the data, which would then prevent
> us from recovering those edits during log replay.
> Instead, it should flush to a tmp location and move it into the region dir
> only after it's successfully written.
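The write-to-tmp-then-move approach described in the issue can be sketched roughly as follows. This is an illustration under stated assumptions, not the patch itself: it uses java.nio.file on a local filesystem rather than the Hadoop FileSystem API, and FlushToTmpSketch, FlushWriter and flush are hypothetical names. The point is only that a failed write never leaves anything in the store directory.

```java
import java.io.IOException;
import java.io.OutputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

// Illustrative only: hypothetical names, local filesystem instead of HDFS.
final class FlushToTmpSketch {

    /** Hypothetical callback that writes the flushed data; it may fail partway through. */
    interface FlushWriter {
        void writeTo(OutputStream out) throws IOException;
    }

    /**
     * Writes the file under regionDir/.tmp and moves it into regionDir/storeName
     * only after the write and close have both succeeded.
     */
    static Path flush(Path regionDir, String storeName, String fileName, FlushWriter writer)
            throws IOException {
        Path tmpDir = Files.createDirectories(regionDir.resolve(".tmp"));
        Path storeDir = Files.createDirectories(regionDir.resolve(storeName));
        Path tmpFile = tmpDir.resolve(fileName);

        try (OutputStream out = Files.newOutputStream(tmpFile)) {
            writer.writeTo(out);
        } catch (IOException e) {
            // The stream is already closed here; discard the partial file and
            // propagate, so the edits can still be recovered from the log.
            Files.deleteIfExists(tmpFile);
            throw e;
        }

        // Same filesystem, so the move can be atomic: readers of the store dir
        // see either no file or the complete file.
        return Files.move(tmpFile, storeDir.resolve(fileName), StandardCopyOption.ATOMIC_MOVE);
    }
}
```

Unlike a finally block that finalizes the file regardless of outcome, the move here is only reached on the success path, so a half-written file cannot be mistaken for a valid one.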