[ 
https://issues.apache.org/jira/browse/HBASE-2461?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

stack updated HBASE-2461:
-------------------------

    Attachment: 2461-v7.txt

Posting this to review board.

Patch that keeps a journal during the split transaction.  If the split fails, a 
call to rollback restores the parent to its original open condition by backing 
out whatever transaction steps completed.
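
A minimal sketch of the journal idea, for illustration only.  The names here 
(SplitJournal, completed, rollback) are hypothetical and are not taken from the 
attached patch; the point is just that each finished step records how to undo 
itself, and rollback replays those records newest-first.

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Hypothetical sketch; not the code in 2461-v7.txt.
final class SplitJournal {
  // Undo actions for the steps that completed, most recent on top.
  private final Deque<Runnable> undoActions = new ArrayDeque<>();

  /** Record a completed step by remembering how to undo it. */
  void completed(Runnable undo) {
    undoActions.push(undo);
  }

  /** Undo the completed steps in reverse order of completion. */
  void rollback() {
    while (!undoActions.isEmpty()) {
      undoActions.pop().run();
    }
  }
}
```

A caller would invoke completed(...) after each step succeeds (for example, 
after closing the parent, with an undo action that reopens it) and call 
rollback() from its failure handler.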

The transaction spans split checks, closing of the parent region and creation of 
daughters, up to the edit that offlines the parent in .META.  Once the .META. 
edit has been made, we cannot roll back -- we have to go forward.  This means 
that the basescanner fixup that adds missing daughter regions, should the 
regionserver crash after the parent region edit but before it has added the 
daughters, is still required, in some form at least.
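
To make the point of no return concrete, here is a rough sketch under assumed 
names.  The Phase values paraphrase the steps listed above and are not 
identifiers from the patch; onFailure is a hypothetical helper that decides 
between rolling back and rolling forward based on the last step that completed.

```java
// Hypothetical sketch; phase names are illustrative, not taken from the patch.
final class SplitRecovery {
  /** Steps in the order the split performs them. */
  enum Phase {
    SPLIT_CHECKS,
    CLOSED_PARENT,
    CREATED_DAUGHTERS,
    OFFLINED_PARENT_IN_META, // point of no return
    ADDED_DAUGHTERS
  }

  enum Action { ROLL_BACK, ROLL_FORWARD }

  /** Before the .META. edit we can undo; at or after it we must go forward. */
  static Action onFailure(Phase lastCompleted) {
    return lastCompleted.compareTo(Phase.OFFLINED_PARENT_IN_META) < 0
        ? Action.ROLL_BACK
        : Action.ROLL_FORWARD;
  }

  private SplitRecovery() {}
}
```

In this sketch, ROLL_FORWARD stands in for whatever fixup adds the missing 
daughters after a crash.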

This patch includes a test of the new split code, but it only runs against an 
HRegion, not in server context.  The split code is buried in the heart of the 
regionserver and created on startup.  I stared at it for a while and injecting 
a fault was just forbidding.  It's like bramble; there are so many spikes in the 
way of getting your finger down into the running split that I ended up passing 
on it.

+ M src/main/java/org/apache/hadoop/hbase/regionserver/CompactSplitThread.java
(split): Most of the split code has been moved out to the new SplitTransaction 
class.
Now this method prepares the split transaction, executes it, and on failure 
rolls back.
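
The prepare/execute/rollback flow can be sketched roughly as below.  This is an 
illustration under assumptions: the Transaction interface and its method 
signatures are hypothetical stand-ins, not the actual SplitTransaction API, and 
the sketch only catches IOException (unchecked exceptions propagate).

```java
import java.io.IOException;

// Hypothetical sketch of the driver flow; not the real CompactSplitThread.
final class SplitDriver {
  /** Assumed shape of a split transaction, for illustration only. */
  interface Transaction {
    boolean prepare();                  // false means "not splittable, skip"
    void execute() throws IOException;  // performs the journaled steps
    void rollback();                    // backs out whatever steps completed
  }

  enum Outcome { SKIPPED, SPLIT, ROLLED_BACK }

  static Outcome split(Transaction tx) {
    if (!tx.prepare()) {
      return Outcome.SKIPPED;
    }
    try {
      tx.execute();
      return Outcome.SPLIT;
    } catch (IOException e) {
      // Restore the parent region rather than leaving it closed.
      tx.rollback();
      return Outcome.ROLLED_BACK;
    }
  }

  private SplitDriver() {}
}
```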

+ M src/main/java/org/apache/hadoop/hbase/regionserver/HRegion.java
(splitLock) Removed.  Doesn't seem necessary.  Just made the close method 
synchronized.
(SPLITDIR) Moved to new SplitTransaction
Moved cleanup of half-done splits into SplitTransaction.  It'll know better how 
to do this.
Moved split code into SplitTransaction class.

+ M src/main/java/org/apache/hadoop/hbase/regionserver/HRegionServer.java
Made this class implement new OnlineRegions interface

+ A src/main/java/org/apache/hadoop/hbase/regionserver/OnlineRegions.java
New Interface that allows you to add/remove regions from online regions.  This 
Interface adds little.  Was just trying to make it so I didn't have to have 
server context when doing tests, but in the end I just passed null for the case 
of no server context.  Could remove this.

+ A src/main/java/org/apache/hadoop/hbase/regionserver/SplitTransaction.java
New class that encapsulates everything to do with the split "transaction".

+ A src/main/java/org/apache/hadoop/hbase/util/PairOfSameType.java
Minor utility class

+ M src/test/java/org/apache/hadoop/hbase/HBaseTestingUtility.java
(loadRegion) Added loading a region

+ M src/test/java/org/apache/hadoop/hbase/io/TestImmutableBytesWritable.java
(testSpecificCompare) Unrelated change

+ M src/test/java/org/apache/hadoop/hbase/regionserver/TestHRegion.java
Changed because of the new manner in which splits are run.  Added a 
splitRegions method.

+ A src/test/java/org/apache/hadoop/hbase/regionserver/TestSplitTransaction.java
Test of region splitting code in region context.  Testing in server context 
would take a bunch of work making it so we could insert a mock instance of 
SplitTransaction.








> Split doesn't handle IOExceptions when creating new region reference files
> --------------------------------------------------------------------------
>
>                 Key: HBASE-2461
>                 URL: https://issues.apache.org/jira/browse/HBASE-2461
>             Project: HBase
>          Issue Type: Bug
>          Components: regionserver
>            Reporter: Todd Lipcon
>            Assignee: stack
>            Priority: Blocker
>             Fix For: 0.90.0
>
>         Attachments: 2461-v2.txt, 2461-v3.txt, 2461-v4.txt, 2461-v6.txt, 
> 2461-v7.txt, 2461.txt
>
>
> I was testing an HDFS patch which had a bug in it, so it happened to throw an 
> NPE during a split with the following trace:
> 2010-04-16 19:18:20,727 ERROR org.apache.hadoop.hbase.regionserver.CompactSplitThread: Compaction failed for region TestTable,-1945465867<1271449232310>,1271453785648
> java.lang.NullPointerException
>         at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.enqueueCurrentPacket(DFSClient.java:3124)
>         at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.flushInternal(DFSClient.java:3220)
>         at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.closeInternal(DFSClient.java:3306)
>         at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.close(DFSClient.java:3255)
>         at org.apache.hadoop.fs.FSDataOutputStream$PositionCache.close(FSDataOutputStream.java:61)
>         at org.apache.hadoop.fs.FSDataOutputStream.close(FSDataOutputStream.java:86)
>         at org.apache.hadoop.fs.FileSystem.createNewFile(FileSystem.java:560)
>         at org.apache.hadoop.hbase.util.FSUtils.create(FSUtils.java:95)
>         at org.apache.hadoop.hbase.io.Reference.write(Reference.java:129)
>         at org.apache.hadoop.hbase.regionserver.StoreFile.split(StoreFile.java:498)
>         at org.apache.hadoop.hbase.regionserver.HRegion.splitRegion(HRegion.java:682)
>         at org.apache.hadoop.hbase.regionserver.CompactSplitThread.split(CompactSplitThread.java:162)
>         at org.apache.hadoop.hbase.regionserver.CompactSplitThread.run(CompactSplitThread.java:95)
> After that, my region was gone; any further writes to it would fail.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
