Edit file hdfs://eagle1:9000/hbase/LongTable/58e7c587ac3992ed20fc1a457a07ccd9/recovered.edits/0000000000000063598 :
http://www.pinkmatter.com/download/hbase/mailinglist/0000000000000063598.tgz

Master log: http://pastebin.com/64sCKQQD

Thanks!

Chris

-----Original Message-----
From: [email protected] [mailto:[email protected]] On Behalf Of Stack
Sent: 12 May 2011 05:20 PM
To: [email protected]
Subject: Re: ArrayIndexOutOfBoundsException in FSOutputSummer.write()

On Thu, May 12, 2011 at 7:37 AM, Chris Bohme <[email protected]> wrote:
> When manually browsing to the recovered.edits folder in HDFS and
> opening them with HFile an error is shown: "Trailer header is wrong...."

They are not hfiles so yes, you'll see that (they are straight
SequenceFiles IIRC).

> If the edit files mean anything to you, we can post them as well.

Yes please.  Can I see
hdfs://eagle1:9000/hbase/LongTable/58e7c587ac3992ed20fc1a457a07ccd9/recovered.edits/0000000000000063598

Any errors in the master log around the creation of the above?  You can
grep for it in your master log.

Thanks for the info,
St.Ack

> Thanks so far!
>
> Chris
>
> -----Original Message-----
> From: [email protected] [mailto:[email protected]] On Behalf Of Stack
> Sent: 11 May 2011 06:56 PM
> To: [email protected]
> Subject: Re: ArrayIndexOutOfBoundsException in FSOutputSummer.write()
>
> I have not seen this before.  You are failing because of a
> java.lang.ArrayIndexOutOfBoundsException in
> org.apache.hadoop.fs.FSOutputSummer.write(FSOutputSummer.java:83).
> Tell us more about your context.  Are you using compression?  What
> kind of hardware and operating system (I'm trying to figure out what is
> different about your setup that would bring on this AIOOBE)?
>
> Thank you,
> St.Ack
>
> On Wed, May 11, 2011 at 6:30 AM, Chris Bohme <[email protected]> wrote:
>> Dear community,
>>
>> We are doing a test on a 5-node cluster with a table of about 50
>> million rows (writes and reads).
>> At some point we end up getting the following exception on 2 of the
>> region servers:
>>
>> 2011-05-11 14:18:28,660 INFO org.apache.hadoop.hbase.regionserver.Store:
>> Started compaction of 3 file(s) in cf=Family1 into
>> hdfs://eagle1:9000/hbase/LongTable/167e7b292cc45b9face9a9cb7d86384c/.tmp,
>> seqid=66246, totalSize=64.2m
>>
>> 2011-05-11 14:18:28,661 DEBUG org.apache.hadoop.hbase.regionserver.Store:
>> Compacting
>> hdfs://eagle1:9000/hbase/LongTable/167e7b292cc45b9face9a9cb7d86384c/Family1/7884224173883345569,
>> keycount=790840, bloomtype=NONE, size=38.5m
>>
>> 2011-05-11 14:18:28,661 DEBUG org.apache.hadoop.hbase.regionserver.Store:
>> Compacting
>> hdfs://eagle1:9000/hbase/LongTable/167e7b292cc45b9face9a9cb7d86384c/Family1/5160949580594728531,
>> keycount=263370, bloomtype=NONE, size=12.8m
>>
>> 2011-05-11 14:18:28,661 DEBUG org.apache.hadoop.hbase.regionserver.Store:
>> Compacting
>> hdfs://eagle1:9000/hbase/LongTable/167e7b292cc45b9face9a9cb7d86384c/Family1/7505588204602186903,
>> keycount=263900, bloomtype=NONE, size=12.8m
>>
>> 2011-05-11 14:18:30,011 DEBUG org.apache.hadoop.hbase.regionserver.HRegion:
>> Flush requested on
>> LongTable,\x00\x00\x00\x00\x01\xC9\xD5\x13,1305115816217.20a05ebff2597ae6a63e31a5e57602dc.
>>
>> 2011-05-11 14:18:30,011 DEBUG org.apache.hadoop.hbase.regionserver.HRegion:
>> Started memstore flush for
>> LongTable,\x00\x00\x00\x00\x01\xC9\xD5\x13,1305115816217.20a05ebff2597ae6a63e31a5e57602dc.,
>> current region memstore size 64.2m
>>
>> 2011-05-11 14:18:30,011 DEBUG org.apache.hadoop.hbase.regionserver.HRegion:
>> Finished snapshotting, commencing flushing stores
>>
>> 2011-05-11 14:18:31,067 FATAL
>> org.apache.hadoop.hbase.regionserver.HRegionServer: ABORTING region server
>> serverName=eagle5.pinkmatter.local,60020,1305111886513,
>> load=(requests=20457, regions=11, usedHeap=934, maxHeap=4087): Replay
>> of HLog required. Forcing server shutdown
>>
>> org.apache.hadoop.hbase.DroppedSnapshotException: region:
>> LongTable,\x00\x00\x00\x00\x01\xC9\xD5\x13,1305115816217.20a05ebff2597ae6a63e31a5e57602dc.
>>   at org.apache.hadoop.hbase.regionserver.HRegion.internalFlushcache(HRegion.java:995)
>>   at org.apache.hadoop.hbase.regionserver.HRegion.internalFlushcache(HRegion.java:900)
>>   at org.apache.hadoop.hbase.regionserver.HRegion.flushcache(HRegion.java:852)
>>   at org.apache.hadoop.hbase.regionserver.MemStoreFlusher.flushRegion(MemStoreFlusher.java:392)
>>   at org.apache.hadoop.hbase.regionserver.MemStoreFlusher.flushRegion(MemStoreFlusher.java:366)
>>   at org.apache.hadoop.hbase.regionserver.MemStoreFlusher.run(MemStoreFlusher.java:240)
>> Caused by: java.lang.ArrayIndexOutOfBoundsException
>>   at org.apache.hadoop.fs.FSOutputSummer.write(FSOutputSummer.java:83)
>>   at org.apache.hadoop.fs.FSDataOutputStream$PositionCache.write(FSDataOutputStream.java:49)
>>   at java.io.DataOutputStream.write(DataOutputStream.java:90)
>>   at java.io.DataOutputStream.write(DataOutputStream.java:90)
>>   at org.apache.hadoop.hbase.io.hfile.HFile$Writer.append(HFile.java:544)
>>   at org.apache.hadoop.hbase.io.hfile.HFile$Writer.append(HFile.java:501)
>>   at org.apache.hadoop.hbase.regionserver.StoreFile$Writer.append(StoreFile.java:836)
>>   at org.apache.hadoop.hbase.regionserver.Store.internalFlushCache(Store.java:479)
>>   at org.apache.hadoop.hbase.regionserver.Store.flushCache(Store.java:448)
>>   at org.apache.hadoop.hbase.regionserver.Store.access$100(Store.java:81)
>>   at org.apache.hadoop.hbase.regionserver.Store$StoreFlusherImpl.flushCache(Store.java:1513)
>>   at org.apache.hadoop.hbase.regionserver.HRegion.internalFlushcache(HRegion.java:973)
>>   ... 5 more
>>
>> 2011-05-11 14:18:31,067 INFO
>> org.apache.hadoop.hbase.regionserver.HRegionServer: Dump of metrics:
>> request=4233.9, regions=11, stores=22, storefiles=48, storefileIndexSize=8,
>> memstoreSize=483, compactionQueueSize=0, flushQueueSize=0,
>> usedHeap=941, maxHeap=4087, blockCacheSize=412883432,
>> blockCacheFree=444366808, blockCacheCount=6172,
>> blockCacheHitCount=6181, blockCacheMissCount=556608,
>> blockCacheEvictedCount=0, blockCacheHitRatio=1, blockCacheHitCachingRatio=8
>>
>> 2011-05-11 14:18:31,067 INFO
>> org.apache.hadoop.hbase.regionserver.HRegionServer: STOPPED: Replay of HLog
>> required. Forcing server shutdown
>>
>> 2011-05-11 14:18:31,067 INFO
>> org.apache.hadoop.hbase.regionserver.MemStoreFlusher:
>> regionserver60020.cacheFlusher exiting
>>
>> HBase version is 0.90.2 and Hadoop is compiled from branch-0.20-append.
>>
>> Has anyone experienced something similar, or has an idea where we can
>> start looking?
>>
>> Thanks!
>>
>> Chris
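[Editor's note for readers landing on this thread via the stack trace: the sketch below is NOT Hadoop's FSOutputSummer source; the class and method names are invented for illustration. It only mimics the general idea behind org.apache.hadoop.fs.FSOutputSummer -- bytes are staged in a fixed-size chunk buffer and each full chunk is flushed together with its checksum -- to make concrete where an internal byte count that drifts out of agreement with the buffer bounds would surface as an ArrayIndexOutOfBoundsException from the array copy, which is the symptom reported above.]

```java
import java.util.zip.CRC32;

/**
 * Illustrative sketch only -- not Hadoop's actual implementation.
 * Mimics checksum-chunked buffering: user bytes are staged in a
 * fixed-size chunk buffer; every full chunk is "flushed" along with
 * its CRC32 checksum. If {@code count} ever disagreed with the chunk
 * bounds, the System.arraycopy below would throw
 * ArrayIndexOutOfBoundsException, as seen in the stack trace.
 */
public class ChecksumChunkWriter {
    private static final int CHUNK_SIZE = 512;      // bytes covered by one checksum
    private final byte[] chunk = new byte[CHUNK_SIZE];
    private int count = 0;                          // valid bytes staged in chunk
    private int chunksFlushed = 0;
    private long lastChecksum = -1;

    /** Stage bytes, flushing a checksummed chunk whenever the buffer fills. */
    public void write(byte[] b, int off, int len) {
        while (len > 0) {
            int n = Math.min(CHUNK_SIZE - count, len);
            // Throws ArrayIndexOutOfBoundsException if count/off/n are inconsistent.
            System.arraycopy(b, off, chunk, count, n);
            count += n;
            off += n;
            len -= n;
            if (count == CHUNK_SIZE) {
                flushChunk();
            }
        }
    }

    private void flushChunk() {
        CRC32 crc = new CRC32();
        crc.update(chunk, 0, count);                // checksum covers the full chunk
        lastChecksum = crc.getValue();
        chunksFlushed++;
        count = 0;                                  // reset for the next chunk
    }

    public int chunksFlushed() { return chunksFlushed; }
    public int bytesBuffered() { return count; }
    public long lastChecksum() { return lastChecksum; }

    public static void main(String[] args) {
        ChecksumChunkWriter w = new ChecksumChunkWriter();
        w.write(new byte[1300], 0, 1300);           // 2 full chunks + 276 bytes staged
        System.out.println(w.chunksFlushed() + " chunks flushed, "
                + w.bytesBuffered() + " bytes buffered");
        // prints "2 chunks flushed, 276 bytes buffered"
    }
}
```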
