[ https://issues.apache.org/jira/browse/HBASE-2461?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
stack updated HBASE-2461:
-------------------------
Attachment: 2461-v7.txt
Posting this to review board.
This patch keeps a journal during the split transaction. If the split fails, a
call to rollback restores the parent to its original open state by undoing
whichever transaction steps completed.
The transaction spans the split checks, the closing of the parent region, and
the creation of the daughters, up to the edit that offlines the parent in
.META. Once the .META. edit has been made, we cannot roll back; we have to go
forward. This means that the basescanner fixup, which adds missing daughter
regions should the regionserver crash after the parent region edit but before
it has added the daughters, is still required, in some form at least.
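To make the journal-and-rollback idea concrete, here is a minimal sketch of one way such a transaction could be structured. It is not the code in the attached patch: the class name, the step names, and the fault-injection parameter are all made up for illustration. Each completed step is recorded in a journal, rollback undoes the journaled steps newest first, and rollback is refused once the (simulated) .META. edit has been journaled.

```java
import java.io.IOException;
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

// Hypothetical sketch; none of these names are taken from the patch.
class SplitJournalSketch {
  /** Steps of the transaction, in the order they are attempted (assumed names). */
  enum Step { CREATE_SPLIT_DIR, CLOSE_PARENT, CREATE_DAUGHTERS, OFFLINE_PARENT_IN_META }

  /** Completed steps, most recent on top. */
  private final Deque<Step> journal = new ArrayDeque<>();

  /** Log of undo actions, kept only so the example can show what rollback did. */
  final List<String> undone = new ArrayList<>();

  /** Runs the steps in order; failAt simulates an IOException at that step (null = no fault). */
  void execute(Step failAt) throws IOException {
    for (Step step : Step.values()) {
      if (step == failAt) {
        throw new IOException("simulated failure at " + step);
      }
      journal.push(step); // journal the step only after it has completed
    }
  }

  /** Undoes completed steps newest first; returns false once past the point of no return. */
  boolean rollback() {
    if (journal.contains(Step.OFFLINE_PARENT_IN_META)) {
      return false; // the .META. edit is in; the only option is to go forward
    }
    while (!journal.isEmpty()) {
      undone.add("undo " + journal.pop());
    }
    return true;
  }

  /** Runs a split with a fault injected at failAt and returns the undo log. */
  static List<String> splitWithFault(Step failAt) {
    SplitJournalSketch tx = new SplitJournalSketch();
    try {
      tx.execute(failAt);
    } catch (IOException e) {
      tx.rollback();
    }
    return tx.undone;
  }

  public static void main(String[] args) {
    // prints [undo CLOSE_PARENT, undo CREATE_SPLIT_DIR]
    System.out.println(splitWithFault(Step.CREATE_DAUGHTERS));
  }
}
```

Undoing in reverse order matters because later steps depend on earlier ones: the parent has to be reopened before the split directory it relied on is cleaned up.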
This patch includes a test of the new split code, but it only runs against an
HRegion, not in server context. The split code is buried in the heart of the
regionserver and created on startup. I stared at it for a while, and injecting
a fault was just forbidding. It's like bramble; there are so many spikes in the
way of getting your finger down into the running split that I ended up passing
on it.
+ M src/main/java/org/apache/hadoop/hbase/regionserver/CompactSplitThread.java
(split): Most of the split code has been moved out to the new SplitTransaction
class.
Now this method prepares the split transaction, executes it, and on failure
rolls back.
+ M src/main/java/org/apache/hadoop/hbase/regionserver/HRegion.java
(splitLock) Removed. Doesn't seem necessary. Just made close method
synchronized.
(SPLITDIR) Moved to new SplitTransaction
Moved cleanup of half-done splits into SplitTransaction. It'll know better how
to do this.
Moved split code into SplitTransaction class.
+ M src/main/java/org/apache/hadoop/hbase/regionserver/HRegionServer.java
Made this class implement new OnlineRegions interface
+ A src/main/java/org/apache/hadoop/hbase/regionserver/OnlineRegions.java
New interface that lets you add/remove regions from the online regions. This
interface adds little. I was just trying to avoid needing server context in
the tests, but in the end I just passed null for the case of no server context.
Could remove this.
+ A src/main/java/org/apache/hadoop/hbase/regionserver/SplitTransaction.java
New class that encapsulates everything to do w/ the split "transaction".
+ A src/main/java/org/apache/hadoop/hbase/util/PairOfSameType.java
Minor utility class
+ M src/test/java/org/apache/hadoop/hbase/HBaseTestingUtility.java
(loadRegion) Added loading a region
+ M src/test/java/org/apache/hadoop/hbase/io/TestImmutableBytesWritable.java
(testSpecificCompare) Unrelated change
+ M src/test/java/org/apache/hadoop/hbase/regionserver/TestHRegion.java
Change because of new manner in which splits are run. Added a splitRegions
method.
+ A src/test/java/org/apache/hadoop/hbase/regionserver/TestSplitTransaction.java
Test of region splitting code in region context. Testing in server context
would take a bunch of work to make it possible to insert a mock instance of
SplitTransaction.
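For the OnlineRegions item in the list above, the following is a rough sketch of what an add/remove interface of that kind might look like, and of how split code could accept null when there is no server context. The method names, the use of String region names, and the FakeServer class are assumptions for illustration, not the signatures in the patch.

```java
import java.util.Set;
import java.util.TreeSet;

// Hypothetical sketch; method names and types are assumed, not taken from the patch.
class OnlineRegionsSketch {
  /** Minimal view of a server's set of online regions. */
  interface OnlineRegions {
    void addToOnlineRegions(String regionName);
    void removeFromOnlineRegions(String regionName);
  }

  /** Stand-in for a regionserver, usable from a test. */
  static class FakeServer implements OnlineRegions {
    final Set<String> online = new TreeSet<>();
    public void addToOnlineRegions(String regionName) { online.add(regionName); }
    public void removeFromOnlineRegions(String regionName) { online.remove(regionName); }
  }

  /** Swaps the parent for its daughters; returns false when there is no server context. */
  static boolean swapInDaughters(OnlineRegions server, String parent, String a, String b) {
    if (server == null) {
      return false; // region-only test: nothing to update
    }
    server.removeFromOnlineRegions(parent);
    server.addToOnlineRegions(a);
    server.addToOnlineRegions(b);
    return true;
  }

  public static void main(String[] args) {
    FakeServer server = new FakeServer();
    server.addToOnlineRegions("parent");
    swapInDaughters(server, "parent", "daughterA", "daughterB");
    System.out.println(server.online); // prints [daughterA, daughterB]
    System.out.println(swapInDaughters(null, "parent", "daughterA", "daughterB")); // prints false
  }
}
```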
> Split doesn't handle IOExceptions when creating new region reference files
> --------------------------------------------------------------------------
>
> Key: HBASE-2461
> URL: https://issues.apache.org/jira/browse/HBASE-2461
> Project: HBase
> Issue Type: Bug
> Components: regionserver
> Reporter: Todd Lipcon
> Assignee: stack
> Priority: Blocker
> Fix For: 0.90.0
>
> Attachments: 2461-v2.txt, 2461-v3.txt, 2461-v4.txt, 2461-v6.txt,
> 2461-v7.txt, 2461.txt
>
>
> I was testing an HDFS patch which had a bug in it, so it happened to throw an
> NPE during a split with the following trace:
> 2010-04-16 19:18:20,727 ERROR org.apache.hadoop.hbase.regionserver.CompactSplitThread: Compaction failed for region TestTable,-1945465867<1271449232310>,1271453785648
> java.lang.NullPointerException
> at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.enqueueCurrentPacket(DFSClient.java:3124)
> at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.flushInternal(DFSClient.java:3220)
> at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.closeInternal(DFSClient.java:3306)
> at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.close(DFSClient.java:3255)
> at org.apache.hadoop.fs.FSDataOutputStream$PositionCache.close(FSDataOutputStream.java:61)
> at org.apache.hadoop.fs.FSDataOutputStream.close(FSDataOutputStream.java:86)
> at org.apache.hadoop.fs.FileSystem.createNewFile(FileSystem.java:560)
> at org.apache.hadoop.hbase.util.FSUtils.create(FSUtils.java:95)
> at org.apache.hadoop.hbase.io.Reference.write(Reference.java:129)
> at org.apache.hadoop.hbase.regionserver.StoreFile.split(StoreFile.java:498)
> at org.apache.hadoop.hbase.regionserver.HRegion.splitRegion(HRegion.java:682)
> at org.apache.hadoop.hbase.regionserver.CompactSplitThread.split(CompactSplitThread.java:162)
> at org.apache.hadoop.hbase.regionserver.CompactSplitThread.run(CompactSplitThread.java:95)
> After that, my region was gone; any further writes to it would fail.