Hi Prasad,

0.90.6 is a pretty old HBase version, and CDH3u5 is a pretty old CDH release as well.
Any chance to move to a more recent version?

JM

2013/8/8 Prasad GS <[email protected]>

> Hi,
>
> We are using Cloudera CDH3u5 distribution of HBase (0.90.6). The RS goes
> down suddenly & from the logs we see the following exception in the region
> server :
>
> 2013-08-07 20:36:58,008 INFO org.apache.hadoop.hbase.regionserver.Store: Completed compaction of 18 file(s), new file=hdfs://192.168.0.29:9000/hbase/UsageHistoryMA/1f50c6795c7753315f1fbc04946753d1/d/3311452476716076182, size=320.2m; total size for store is 320.2m
> 2013-08-07 20:36:58,008 INFO org.apache.hadoop.hbase.regionserver.HRegion: completed compaction on region UsageHistoryMA,'u\x13\x07\x01\x00\x00\x00\x00 \x12u'X\x83,1375898307352.1f50c6795c7753315f1fbc04946753d1. after 1mins, 51sec
> 2013-08-07 20:36:58,009 INFO org.apache.hadoop.hbase.regionserver.SplitTransaction: Starting split of region UsageHistoryMA,'u\x13\x07\x01\x00\x00\x00\x00 \x12u'X\x83,1375898307352.1f50c6795c7753315f1fbc04946753d1.
> 2013-08-07 20:36:58,010 DEBUG org.apache.hadoop.hbase.regionserver.HRegion: Closing UsageHistoryMA,'u\x13\x07\x01\x00\x00\x00\x00 \x12u'X\x83,1375898307352.1f50c6795c7753315f1fbc04946753d1.: disabling compactions & flushes
> 2013-08-07 20:36:58,010 DEBUG org.apache.hadoop.hbase.regionserver.HRegion: Updates disabled for region UsageHistoryMA,'u\x13\x07\x01\x00\x00\x00\x00 \x12u'X\x83,1375898307352.1f50c6795c7753315f1fbc04946753d1.
> 2013-08-07 20:36:58,010 DEBUG org.apache.hadoop.hbase.regionserver.Store: closed d
> 2013-08-07 20:36:58,010 INFO org.apache.hadoop.hbase.regionserver.HRegion: Closed UsageHistoryMA,'u\x13\x07\x01\x00\x00\x00\x00 \x12u'X\x83,1375898307352.1f50c6795c7753315f1fbc04946753d1.
> 2013-08-07 20:36:58,029 DEBUG org.apache.hadoop.hbase.regionserver.HRegion: Instantiated UsageHistoryMA,'u\x13\x07\x01\x00\x00\x00\x00 \x12u'X\x83,1375900618008.13150e07893adb4eded6d4dc98374e9e.
> 2013-08-07 20:36:58,031 DEBUG org.apache.hadoop.hbase.regionserver.HRegion: Instantiated UsageHistoryMA,'v\x13\x07\x01\x00\x00\x00\x00 \x12v`\x12\x15,1375900618008.6e9d9b93a9509909ed5c4d9e2bd321a8.
> 2013-08-07 20:36:58,038 INFO org.apache.hadoop.hbase.catalog.MetaEditor: Offlined parent region UsageHistoryMA,'u\x13\x07\x01\x00\x00\x00\x00 \x12u'X\x83,1375898307352.1f50c6795c7753315f1fbc04946753d1. in META
> 2013-08-07 20:36:58,085 DEBUG org.apache.hadoop.hbase.regionserver.Store: loaded hdfs://192.168.0.29:9000/hbase/UsageHistoryMA/6e9d9b93a9509909ed5c4d9e2bd321a8/d/3311452476716076182.1f50c6795c7753315f1fbc04946753d1, isReference=true, isBulkLoadResult=false, seqid=26966370, majorCompaction=false
> 2013-08-07 20:36:58,087 INFO org.apache.hadoop.hbase.regionserver.HRegion: Onlined UsageHistoryMA,'v\x13\x07\x01\x00\x00\x00\x00 \x12v`\x12\x15,1375900618008.6e9d9b93a9509909ed5c4d9e2bd321a8.; next sequenceid=26966371
> 2013-08-07 20:36:58,087 DEBUG org.apache.hadoop.hbase.regionserver.CompactSplitThread: Compaction requested for UsageHistoryMA,'v\x13\x07\x01\x00\x00\x00\x00 \x12v`\x12\x15,1375900618008.6e9d9b93a9509909ed5c4d9e2bd321a8. because Region has references on open; priority=99, compaction queue size=18
> 2013-08-07 20:36:58,092 INFO org.apache.hadoop.hbase.catalog.MetaEditor: Added daughter UsageHistoryMA,'v\x13\x07\x01\x00\x00\x00\x00 \x12v`\x12\x15,1375900618008.6e9d9b93a9509909ed5c4d9e2bd321a8. in region .META.,,1, serverInfo=dl360x2807,60020,1374636004119
> 2013-08-07 20:36:58,093 INFO org.apache.hadoop.hbase.regionserver.CompactSplitThread: Running rollback/cleanup of failed split of UsageHistoryMA,'u\x13\x07\x01\x00\x00\x00\x00 \x12u'X\x83,1375898307352.1f50c6795c7753315f1fbc04946753d1.; Failed dl360x2807,60020,1374636004119-daughterOpener=13150e07893adb4eded6d4dc98374e9e
> java.io.IOException: Failed dl360x2807,60020,1374636004119-daughterOpener=13150e07893adb4eded6d4dc98374e9e
>     at org.apache.hadoop.hbase.regionserver.SplitTransaction.execute(SplitTransaction.java:307)
>     at org.apache.hadoop.hbase.regionserver.CompactSplitThread.split(CompactSplitThread.java:205)
>     at org.apache.hadoop.hbase.regionserver.CompactSplitThread.run(CompactSplitThread.java:135)
> Caused by: java.util.ConcurrentModificationException
>     at java.util.SubList.checkForComodification(AbstractList.java:752)
>     at java.util.SubList.size(AbstractList.java:625)
>     at java.util.AbstractList.add(AbstractList.java:91)
>     at org.apache.hadoop.hbase.monitoring.TaskMonitor.createStatus(TaskMonitor.java:75)
>     at org.apache.hadoop.hbase.regionserver.HRegion.initialize(HRegion.java:346)
>     at org.apache.hadoop.hbase.regionserver.HRegion.openHRegion(HRegion.java:2860)
>     at org.apache.hadoop.hbase.regionserver.SplitTransaction.openDaughterRegion(SplitTransaction.java:383)
>     at org.apache.hadoop.hbase.regionserver.SplitTransaction$DaughterOpener.run(SplitTransaction.java:352)
> 2013-08-07 20:36:58,112 FATAL org.apache.hadoop.hbase.regionserver.HRegionServer: ABORTING region server serverName=dl360x2807,60020,1374636004119, load=(requests=91, regions=170, usedHeap=7213, maxHeap=32730): Abort; we got an error after point-of-no-return
> 2013-08-07 20:36:58,113 INFO org.apache.hadoop.hbase.regionserver.HRegionServer: Dump of metrics: requests=30, regions=170, stores=171, storefiles=167, storefileIndexSize=134, memstoreSize=187, mbInMemoryWithoutWAL=0, numberOfPutsWithoutWAL=0, compactionQueueSize=17, flushQueueSize=0, usedHeap=6992, maxHeap=32730, blockCacheSize=3028798008, blockCacheFree=7267346888, blockCacheCount=51548, blockCacheHitCount=55248138, blockCacheMissCount=3593839, blockCacheEvictedCount=0, blockCacheHitRatio=93, blockCacheHitCachingRatio=99
> 2013-08-07 20:36:58,119 INFO org.apache.hadoop.hbase.regionserver.HRegionServer: STOPPED: Abort; we got an error after point-of-no-return
> 2013-08-07 20:36:58,119 INFO org.apache.hadoop.hbase.regionserver.CompactSplitThread: regionserver60020.compactor exiting
> 2013-08-07 20:36:59,161 INFO org.apache.hadoop.ipc.HBaseServer: Stopping server on 60020
>
> Could someone pls let me know as to why the region split failed & why the
> RS went down. According to me, the ConcurrentModificationException looks
> really trivial.
>
>
> Regards,
> Prasad
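
For reference, the "Caused by" in the quoted trace is a ConcurrentModificationException raised from java.util.SubList.checkForComodification, reached through AbstractList.add inside TaskMonitor.createStatus. The sketch below is not HBase code, and I have not checked what TaskMonitor does in 0.90.6; it is only a minimal, single-threaded illustration of the JDK behaviour those frames point at: a subList view throws as soon as it is used after its backing list has been structurally modified some other way. The class name SubListDemo, the method staleViewThrows and the "task-N" strings are made up for this example.

```java
import java.util.ArrayList;
import java.util.ConcurrentModificationException;
import java.util.List;

public class SubListDemo {

    /**
     * Returns true if adding through a stale subList view throws
     * ConcurrentModificationException.
     */
    static boolean staleViewThrows() {
        // Hypothetical stand-in for a list of tracked tasks.
        List<String> backing = new ArrayList<>();
        for (int i = 0; i < 5; i++) {
            backing.add("task-" + i);
        }

        // subList returns a view over the backing list, not a copy.
        List<String> view = backing.subList(1, 5);

        // Structural change made directly on the backing list,
        // bypassing the view. The view is now stale.
        backing.add("task-5");

        try {
            // add() first checks the view against the backing list's
            // modification count and fails on the mismatch.
            view.add("task-6");
            return false;
        } catch (ConcurrentModificationException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println("stale view throws: " + staleViewThrows());
    }
}
```

Note that no second thread is needed to trigger the exception; with several threads touching the same unsynchronized list it simply becomes a matter of timing.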
