[ 
https://issues.apache.org/jira/browse/HBASE-2461?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

stack updated HBASE-2461:
-------------------------

    Attachment: 2461.txt

This issue highlights how an exception thrown after the close of the 
region-to-be-split -- a necessary step if the split is to come out clean -- can 
poke a hole in an online table.

This patch starts down the road of treating the split operation inside the 
regionserver as a 'transaction'.  There is a prepare step and an execute step.  
Should the execute step fail -- it includes things like closing the region and 
updating the meta table with new split codes -- we'll call rollback.  The 
rollback will try to fix up the failed split by doing things like reopening the 
region if appropriate and fixing up meta if necessary.

If the rollback fails, we'll kill the regionserver so that the processing of 
the server shutdown gets the affected regions back online again.
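
To make the prepare/execute/rollback idea concrete, here is a rough sketch in 
Java.  It is not the code in 2461.txt; the class SplitTransactionSketch, the 
RegionOps interface and the journal entries are all made up for illustration.  
It assumes each step is journaled before it runs, so that a rollback also 
cleans up a step that failed halfway, and that a failed rollback ends in an 
abort of the server.

```java
import java.io.IOException;
import java.util.ArrayDeque;
import java.util.Deque;

/** Hypothetical sketch of a split run as a transaction; not HBase's actual API. */
class SplitTransactionSketch {

  /** Steps that were started, recorded so rollback knows what to undo. */
  enum JournalEntry { CLOSED_PARENT, WROTE_REFERENCES, UPDATED_META }

  /** Assumed stand-in for the region, filesystem and meta operations. */
  interface RegionOps {
    boolean isSplittable();
    void closeParent() throws IOException;
    void writeReferenceFiles() throws IOException;
    void updateMeta() throws IOException;
    // Undo operations; each is assumed safe to call after a partial step.
    void reopenParent() throws IOException;
    void deleteReferenceFiles() throws IOException;
    void restoreMeta() throws IOException;
    void abortServer(String why);
  }

  private final RegionOps ops;
  private final Deque<JournalEntry> journal = new ArrayDeque<>();

  SplitTransactionSketch(RegionOps ops) {
    this.ops = ops;
  }

  /** Prepare: check preconditions without changing any state. */
  boolean prepare() {
    return ops.isSplittable();
  }

  /** Execute: journal each step before running it. */
  void execute() throws IOException {
    journal.push(JournalEntry.CLOSED_PARENT);
    ops.closeParent();
    journal.push(JournalEntry.WROTE_REFERENCES);
    ops.writeReferenceFiles();
    journal.push(JournalEntry.UPDATED_META);
    ops.updateMeta();
  }

  /** Rollback: undo the journaled steps, most recent first. */
  void rollback() throws IOException {
    while (!journal.isEmpty()) {
      switch (journal.pop()) {
        case UPDATED_META:
          ops.restoreMeta();
          break;
        case WROTE_REFERENCES:
          ops.deleteReferenceFiles();
          break;
        case CLOSED_PARENT:
          ops.reopenParent();
          break;
      }
    }
  }

  /** Runs the split; returns true only if every step succeeded. */
  boolean run() {
    if (!prepare()) {
      return false;
    }
    try {
      execute();
      return true;
    } catch (IOException | RuntimeException e) {
      // RuntimeException is caught too: the failure reported in this issue was an NPE.
      try {
        rollback();
      } catch (IOException | RuntimeException re) {
        // Rollback failed: give up and let server-shutdown processing recover the regions.
        ops.abortServer("split rollback failed: " + re);
      }
      return false;
    }
  }
}
```

With a RegionOps whose writeReferenceFiles() throws, run() returns false after 
deleting any reference files and reopening the parent, rather than leaving the 
region closed.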

Patch is not ready yet.

> Split doesn't handle IOExceptions when creating new region reference files
> --------------------------------------------------------------------------
>
>                 Key: HBASE-2461
>                 URL: https://issues.apache.org/jira/browse/HBASE-2461
>             Project: HBase
>          Issue Type: Bug
>          Components: regionserver
>            Reporter: Todd Lipcon
>            Assignee: stack
>            Priority: Blocker
>             Fix For: 0.21.0
>
>         Attachments: 2461.txt
>
>
> I was testing an HDFS patch which had a bug in it, so it happened to throw an 
> NPE during a split with the following trace:
> 2010-04-16 19:18:20,727 ERROR org.apache.hadoop.hbase.regionserver.CompactSplitThread: Compaction failed for region TestTable,-1945465867<1271449232310>,1271453785648
> java.lang.NullPointerException
>         at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.enqueueCurrentPacket(DFSClient.java:3124)
>         at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.flushInternal(DFSClient.java:3220)
>         at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.closeInternal(DFSClient.java:3306)
>         at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.close(DFSClient.java:3255)
>         at org.apache.hadoop.fs.FSDataOutputStream$PositionCache.close(FSDataOutputStream.java:61)
>         at org.apache.hadoop.fs.FSDataOutputStream.close(FSDataOutputStream.java:86)
>         at org.apache.hadoop.fs.FileSystem.createNewFile(FileSystem.java:560)
>         at org.apache.hadoop.hbase.util.FSUtils.create(FSUtils.java:95)
>         at org.apache.hadoop.hbase.io.Reference.write(Reference.java:129)
>         at org.apache.hadoop.hbase.regionserver.StoreFile.split(StoreFile.java:498)
>         at org.apache.hadoop.hbase.regionserver.HRegion.splitRegion(HRegion.java:682)
>         at org.apache.hadoop.hbase.regionserver.CompactSplitThread.split(CompactSplitThread.java:162)
>         at org.apache.hadoop.hbase.regionserver.CompactSplitThread.run(CompactSplitThread.java:95)
> After that, my region was gone, and any further writes to it would fail.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
