[jira] [Created] (HBASE-5665) Repeated split causes HRegionServer failures and breaks table

2012-03-28 Thread Cosmin Lehene (Created) (JIRA)
Repeated split causes HRegionServer failures and breaks table 
--

 Key: HBASE-5665
 URL: https://issues.apache.org/jira/browse/HBASE-5665
 Project: HBase
  Issue Type: Bug
  Components: regionserver
Affects Versions: 0.92.1, 0.92.0
Reporter: Cosmin Lehene
Priority: Blocker


Repeated splits on large tables (two consecutive splits suffice) essentially 
break the table (and the cluster) beyond recovery.
The regionserver doing the split dies, and the master gets into an infinite 
loop trying to assign regions whose files appear to be missing from HDFS.

The table can be disabled once. Upon trying to re-enable it, it remains in 
an intermediate state forever.

I was able to reproduce this on a smaller table consistently.

{code}
hbase(main):030:0> (0..1).each{|x| put 't1', "#{x}", 'f1:t', 'dd'}
hbase(main):030:0> (0..1000).each{|x| split 't1', "#{x*10}"}
{code}

Running overlapping splits in parallel (e.g. #{x*10+1}, #{x*10+2}... ) will 
reproduce the issue almost instantly and consistently. 
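
The overlapping split points mentioned above could be generated along these lines. This is only a sketch in plain Ruby: `overlapping_split_keys` is a hypothetical helper (not an HBase shell command), and the offsets 0..2 and the small range are assumptions chosen for illustration. It prints `split` commands rather than executing them, so how they are run in parallel is left open.

```ruby
# Hypothetical helper (not part of the HBase shell): builds the split
# points x*10, x*10+1, x*10+2 as strings for every x in the given range.
def overlapping_split_keys(range, offsets = [0, 1, 2])
  range.flat_map { |x| offsets.map { |o| (x * 10 + o).to_s } }
end

# Print the shell commands instead of running them, so this works
# outside the HBase shell; the table name 't1' is taken from the report.
overlapping_split_keys(0..2).each do |key|
  puts "split 't1', '#{key}'"
end
```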

{code}
2012-03-28 10:57:16,320 INFO org.apache.hadoop.hbase.catalog.MetaEditor: Offlined parent region t1,,1332957435767.2fb0473f4e71339e88dab0ee0d4dffa1. in META
2012-03-28 10:57:16,321 DEBUG org.apache.hadoop.hbase.regionserver.CompactSplitThread: Split requested for t1,5,1332957435767.648d30de55a5cec6fc2f56dcb3c7eee1..  compaction_queue=(0:1), split_queue=10
2012-03-28 10:57:16,343 INFO org.apache.hadoop.hbase.regionserver.SplitRequest: Running rollback/cleanup of failed split of t1,,1332957435767.2fb0473f4e71339e88dab0ee0d4dffa1.; Failed ld2,60020,1332957343833-daughterOpener=2469c5650ea2aeed631eb85d3cdc3124
java.io.IOException: Failed ld2,60020,1332957343833-daughterOpener=2469c5650ea2aeed631eb85d3cdc3124
    at org.apache.hadoop.hbase.regionserver.SplitTransaction.openDaughters(SplitTransaction.java:363)
    at org.apache.hadoop.hbase.regionserver.SplitTransaction.execute(SplitTransaction.java:451)
    at org.apache.hadoop.hbase.regionserver.SplitRequest.run(SplitRequest.java:67)
    at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
    at java.lang.Thread.run(Thread.java:662)
Caused by: java.io.FileNotFoundException: File does not exist: /hbase/t1/589c44cabba419c6ad8c9b427e5894e3.2fb0473f4e71339e88dab0ee0d4dffa1/f1/d62a852c25ad44e09518e102ca557237
    at org.apache.hadoop.hdfs.DFSClient$DFSInputStream.openInfo(DFSClient.java:1822)
    at org.apache.hadoop.hdfs.DFSClient$DFSInputStream.<init>(DFSClient.java:1813)
    at org.apache.hadoop.hdfs.DFSClient.open(DFSClient.java:544)
    at org.apache.hadoop.hdfs.DistributedFileSystem.open(DistributedFileSystem.java:187)
    at org.apache.hadoop.fs.FileSystem.open(FileSystem.java:456)
    at org.apache.hadoop.hbase.io.hfile.HFile.createReader(HFile.java:341)
    at org.apache.hadoop.hbase.regionserver.StoreFile$Reader.<init>(StoreFile.java:1008)
    at org.apache.hadoop.hbase.io.HalfStoreFileReader.<init>(HalfStoreFileReader.java:65)
    at org.apache.hadoop.hbase.regionserver.StoreFile.open(StoreFile.java:467)
    at org.apache.hadoop.hbase.regionserver.StoreFile.createReader(StoreFile.java:548)
    at org.apache.hadoop.hbase.regionserver.Store.loadStoreFiles(Store.java:284)
    at org.apache.hadoop.hbase.regionserver.Store.<init>(Store.java:221)
    at org.apache.hadoop.hbase.regionserver.HRegion.instantiateHStore(HRegion.java:2511)
    at org.apache.hadoop.hbase.regionserver.HRegion.initialize(HRegion.java:450)
    at org.apache.hadoop.hbase.regionserver.HRegion.openHRegion(HRegion.java:3229)
    at org.apache.hadoop.hbase.regionserver.SplitTransaction.openDaughterRegion(SplitTransaction.java:504)
    at org.apache.hadoop.hbase.regionserver.SplitTransaction$DaughterOpener.run(SplitTransaction.java:484)
    ... 1 more
2012-03-28 10:57:16,345 FATAL org.apache.hadoop.hbase.regionserver.HRegionServer: ABORTING region server ld2,60020,1332957343833: Abort; we got an error after point-of-no-return
{code}


http://hastebin.com/diqinibajo.avrasm

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Created] (HBASE-5656) LoadIncrementalHFiles should detect compression algorithm

2012-03-27 Thread Cosmin Lehene (Created) (JIRA)
LoadIncrementalHFiles should detect compression algorithm
-

 Key: HBASE-5656
 URL: https://issues.apache.org/jira/browse/HBASE-5656
 Project: HBase
  Issue Type: Bug
  Components: util
Affects Versions: 0.92.1
Reporter: Cosmin Lehene
Assignee: Cosmin Lehene
 Fix For: 0.92.2, 0.94.0, 0.96.0


LoadIncrementalHFiles doesn't set the compression algorithm when creating the table.

The algorithm can be detected from the HFiles within each family directory.
