I have not seen this before. You are failing because of a java.lang.ArrayIndexOutOfBoundsException in org.apache.hadoop.fs.FSOutputSummer.write(FSOutputSummer.java:83). Tell us more about your context: are you using compression? What kind of hardware and operating system? (I'm trying to figure out what is different about your setup that would bring on this AIOOBE.)
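On the compression question, one quick way to check is to dump what each column family is configured with. Below is a minimal sketch against the 0.90 client API; the table name "LongTable" comes from your log, and it assumes an hbase-site.xml on the classpath that points at your cluster. (The shell's describe 'LongTable' shows the same information.)

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.HColumnDescriptor;
import org.apache.hadoop.hbase.HTableDescriptor;
import org.apache.hadoop.hbase.client.HBaseAdmin;
import org.apache.hadoop.hbase.util.Bytes;

// Prints the compression algorithm configured on each column family of
// LongTable. A sketch only: assumes the 0.90 client API and an
// hbase-site.xml on the classpath pointing at the cluster in question.
public class DumpCompression {
  public static void main(String[] args) throws Exception {
    Configuration conf = HBaseConfiguration.create();
    HBaseAdmin admin = new HBaseAdmin(conf);
    HTableDescriptor desc = admin.getTableDescriptor(Bytes.toBytes("LongTable"));
    for (HColumnDescriptor family : desc.getFamilies()) {
      System.out.println(family.getNameAsString()
          + " compression=" + family.getCompression());
    }
  }
}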
Thank you,
St.Ack

On Wed, May 11, 2011 at 6:30 AM, Chris Bohme <[email protected]> wrote:
> Dear community,
>
> We are doing a test on a 5 node cluster with a table of about 50 million
> rows (writes and reads). At some point we end up getting the following
> exception on 2 of the region servers:
>
> 2011-05-11 14:18:28,660 INFO org.apache.hadoop.hbase.regionserver.Store: Started compaction of 3 file(s) in cf=Family1 into hdfs://eagle1:9000/hbase/LongTable/167e7b292cc45b9face9a9cb7d86384c/.tmp, seqid=66246, totalSize=64.2m
> 2011-05-11 14:18:28,661 DEBUG org.apache.hadoop.hbase.regionserver.Store: Compacting hdfs://eagle1:9000/hbase/LongTable/167e7b292cc45b9face9a9cb7d86384c/Family1/7884224173883345569, keycount=790840, bloomtype=NONE, size=38.5m
> 2011-05-11 14:18:28,661 DEBUG org.apache.hadoop.hbase.regionserver.Store: Compacting hdfs://eagle1:9000/hbase/LongTable/167e7b292cc45b9face9a9cb7d86384c/Family1/5160949580594728531, keycount=263370, bloomtype=NONE, size=12.8m
> 2011-05-11 14:18:28,661 DEBUG org.apache.hadoop.hbase.regionserver.Store: Compacting hdfs://eagle1:9000/hbase/LongTable/167e7b292cc45b9face9a9cb7d86384c/Family1/7505588204602186903, keycount=263900, bloomtype=NONE, size=12.8m
> 2011-05-11 14:18:30,011 DEBUG org.apache.hadoop.hbase.regionserver.HRegion: Flush requested on LongTable,\x00\x00\x00\x00\x01\xC9\xD5\x13,1305115816217.20a05ebff2597ae6a63e31a5e57602dc.
> 2011-05-11 14:18:30,011 DEBUG org.apache.hadoop.hbase.regionserver.HRegion: Started memstore flush for LongTable,\x00\x00\x00\x00\x01\xC9\xD5\x13,1305115816217.20a05ebff2597ae6a63e31a5e57602dc., current region memstore size 64.2m
> 2011-05-11 14:18:30,011 DEBUG org.apache.hadoop.hbase.regionserver.HRegion: Finished snapshotting, commencing flushing stores
> 2011-05-11 14:18:31,067 FATAL org.apache.hadoop.hbase.regionserver.HRegionServer: ABORTING region server serverName=eagle5.pinkmatter.local,60020,1305111886513, load=(requests=20457, regions=11, usedHeap=934, maxHeap=4087): Replay of HLog required. Forcing server shutdown
> org.apache.hadoop.hbase.DroppedSnapshotException: region: LongTable,\x00\x00\x00\x00\x01\xC9\xD5\x13,1305115816217.20a05ebff2597ae6a63e31a5e57602dc.
>     at org.apache.hadoop.hbase.regionserver.HRegion.internalFlushcache(HRegion.java:995)
>     at org.apache.hadoop.hbase.regionserver.HRegion.internalFlushcache(HRegion.java:900)
>     at org.apache.hadoop.hbase.regionserver.HRegion.flushcache(HRegion.java:852)
>     at org.apache.hadoop.hbase.regionserver.MemStoreFlusher.flushRegion(MemStoreFlusher.java:392)
>     at org.apache.hadoop.hbase.regionserver.MemStoreFlusher.flushRegion(MemStoreFlusher.java:366)
>     at org.apache.hadoop.hbase.regionserver.MemStoreFlusher.run(MemStoreFlusher.java:240)
> Caused by: java.lang.ArrayIndexOutOfBoundsException
>     at org.apache.hadoop.fs.FSOutputSummer.write(FSOutputSummer.java:83)
>     at org.apache.hadoop.fs.FSDataOutputStream$PositionCache.write(FSDataOutputStream.java:49)
>     at java.io.DataOutputStream.write(DataOutputStream.java:90)
>     at java.io.DataOutputStream.write(DataOutputStream.java:90)
>     at org.apache.hadoop.hbase.io.hfile.HFile$Writer.append(HFile.java:544)
>     at org.apache.hadoop.hbase.io.hfile.HFile$Writer.append(HFile.java:501)
>     at org.apache.hadoop.hbase.regionserver.StoreFile$Writer.append(StoreFile.java:836)
>     at org.apache.hadoop.hbase.regionserver.Store.internalFlushCache(Store.java:479)
>     at org.apache.hadoop.hbase.regionserver.Store.flushCache(Store.java:448)
>     at org.apache.hadoop.hbase.regionserver.Store.access$100(Store.java:81)
>     at org.apache.hadoop.hbase.regionserver.Store$StoreFlusherImpl.flushCache(Store.java:1513)
>     at org.apache.hadoop.hbase.regionserver.HRegion.internalFlushcache(HRegion.java:973)
>     ... 5 more
> 2011-05-11 14:18:31,067 INFO org.apache.hadoop.hbase.regionserver.HRegionServer: Dump of metrics: request=4233.9, regions=11, stores=22, storefiles=48, storefileIndexSize=8, memstoreSize=483, compactionQueueSize=0, flushQueueSize=0, usedHeap=941, maxHeap=4087, blockCacheSize=412883432, blockCacheFree=444366808, blockCacheCount=6172, blockCacheHitCount=6181, blockCacheMissCount=556608, blockCacheEvictedCount=0, blockCacheHitRatio=1, blockCacheHitCachingRatio=8
> 2011-05-11 14:18:31,067 INFO org.apache.hadoop.hbase.regionserver.HRegionServer: STOPPED: Replay of HLog required. Forcing server shutdown
> 2011-05-11 14:18:31,067 INFO org.apache.hadoop.hbase.regionserver.MemStoreFlusher: regionserver60020.cacheFlusher exiting
>
> The HBase version is 0.90.2, and Hadoop is compiled from branch-0.20-append.
>
> Has anyone experienced something similar, or has an idea where we can start looking?
>
> Thanks!
>
> Chris
