On Thu, May 12, 2011 at 7:37 AM, Chris Bohme <[email protected]> wrote:
> When manually browsing to the recovered.edits folder in HDFS and opening
> them with the HFile tool, an error is shown: "Trailer header is wrong...."
>

They are not HFiles, so yes, you'll see that (they are straight
SequenceFiles IIRC).
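
For anyone else poking at one of these files, here is a minimal sketch of
reading it back as a SequenceFile of HLogKey/WALEdit pairs.  This assumes
the 0.90 classes are on the classpath; the class name and argument handling
below are just illustrative, not a tool that ships with HBase:

  import org.apache.hadoop.conf.Configuration;
  import org.apache.hadoop.fs.FileSystem;
  import org.apache.hadoop.fs.Path;
  import org.apache.hadoop.hbase.HBaseConfiguration;
  import org.apache.hadoop.hbase.regionserver.wal.HLogKey;
  import org.apache.hadoop.hbase.regionserver.wal.WALEdit;
  import org.apache.hadoop.io.SequenceFile;

  public class DumpRecoveredEdits {
    public static void main(String[] args) throws Exception {
      Configuration conf = HBaseConfiguration.create();
      // Path to a recovered.edits file, passed on the command line.
      Path path = new Path(args[0]);
      FileSystem fs = path.getFileSystem(conf);
      SequenceFile.Reader reader = new SequenceFile.Reader(fs, path, conf);
      try {
        HLogKey key = new HLogKey();
        WALEdit edits = new WALEdit();
        // Each record is one WAL entry: region/table/sequence id in the
        // key, the batched KeyValues in the value.
        while (reader.next(key, edits)) {
          System.out.println(key + ": " + edits);
        }
      } finally {
        reader.close();
      }
    }
  }

Run it with the full hdfs:// path of an edits file as the only argument.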

> If the edit files mean anything to you, we can post them as well.
>

Yes please.  Can I see
hdfs://eagle1:9000/hbase/LongTable/58e7c587ac3992ed20fc1a457a07ccd9/recovered.edits/0000000000000063598

Any errors in the master log around the creation of the above file?  You
can grep your master log for its name.

Thanks for the info,
St.Ack

> Thanks so far!
>
> Chris
>
>
> -----Original Message-----
> From: [email protected] [mailto:[email protected]] On Behalf Of Stack
> Sent: 11 May 2011 06:56 PM
> To: [email protected]
> Subject: Re: ArrayIndexOutOfBoundsException in FSOutputSummer.write()
>
> I have not seen this before.  You are failing because of
> java.lang.ArrayIndexOutOfBoundsException in
> org.apache.hadoop.fs.FSOutputSummer.write(FSOutputSummer.java:83).
> Tell us more about your context.  Are you using compression?  What
> kind of hardware, operating system (I'm trying to figure out what is
> different about your setup that would bring on this AIOOBE)?
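>
> (For reference, FSOutputSummer.write(byte[], int, int) in Hadoop 0.20 is
> roughly the below; I am paraphrasing from memory, so treat it as a sketch
> and expect the exact line number to differ a bit in branch-0.20-append:
>
>     public synchronized void write(byte b[], int off, int len)
>         throws IOException {
>       // Reject an offset/length pair that falls outside the passed buffer.
>       if (off < 0 || len < 0 || off > b.length - len) {
>         throw new ArrayIndexOutOfBoundsException();
>       }
>       // Otherwise push the buffer through the checksummed stream in chunks.
>       for (int n = 0; n < len; n += write1(b, off + n, len - n)) {
>       }
>     }
>
> If the AIOOBE is coming from that bounds check, it means the caller above
> it in the stack handed down an offset/length that does not fit the buffer
> it passed.)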
>
> Thank you,
> St.Ack
>
> On Wed, May 11, 2011 at 6:30 AM, Chris Bohme <[email protected]> wrote:
>> Dear community,
>>
>>
>>
>> We are doing a test on a 5-node cluster with a table of about 50 million
>> rows (writes and reads). At some point we end up getting the following
>> exception on two of the region servers:
>>
>>
>>
>> 2011-05-11 14:18:28,660 INFO org.apache.hadoop.hbase.regionserver.Store: Started compaction of 3 file(s) in cf=Family1  into hdfs://eagle1:9000/hbase/LongTable/167e7b292cc45b9face9a9cb7d86384c/.tmp, seqid=66246, totalSize=64.2m
>> 2011-05-11 14:18:28,661 DEBUG org.apache.hadoop.hbase.regionserver.Store: Compacting hdfs://eagle1:9000/hbase/LongTable/167e7b292cc45b9face9a9cb7d86384c/Family1/7884224173883345569, keycount=790840, bloomtype=NONE, size=38.5m
>> 2011-05-11 14:18:28,661 DEBUG org.apache.hadoop.hbase.regionserver.Store: Compacting hdfs://eagle1:9000/hbase/LongTable/167e7b292cc45b9face9a9cb7d86384c/Family1/5160949580594728531, keycount=263370, bloomtype=NONE, size=12.8m
>> 2011-05-11 14:18:28,661 DEBUG org.apache.hadoop.hbase.regionserver.Store: Compacting hdfs://eagle1:9000/hbase/LongTable/167e7b292cc45b9face9a9cb7d86384c/Family1/7505588204602186903, keycount=263900, bloomtype=NONE, size=12.8m
>> 2011-05-11 14:18:30,011 DEBUG org.apache.hadoop.hbase.regionserver.HRegion: Flush requested on LongTable,\x00\x00\x00\x00\x01\xC9\xD5\x13,1305115816217.20a05ebff2597ae6a63e31a5e57602dc.
>> 2011-05-11 14:18:30,011 DEBUG org.apache.hadoop.hbase.regionserver.HRegion: Started memstore flush for LongTable,\x00\x00\x00\x00\x01\xC9\xD5\x13,1305115816217.20a05ebff2597ae6a63e31a5e57602dc., current region memstore size 64.2m
>> 2011-05-11 14:18:30,011 DEBUG org.apache.hadoop.hbase.regionserver.HRegion: Finished snapshotting, commencing flushing stores
>> 2011-05-11 14:18:31,067 FATAL org.apache.hadoop.hbase.regionserver.HRegionServer: ABORTING region server serverName=eagle5.pinkmatter.local,60020,1305111886513, load=(requests=20457, regions=11, usedHeap=934, maxHeap=4087): Replay of HLog required. Forcing server shutdown
>> org.apache.hadoop.hbase.DroppedSnapshotException: region: LongTable,\x00\x00\x00\x00\x01\xC9\xD5\x13,1305115816217.20a05ebff2597ae6a63e31a5e57602dc.
>>       at org.apache.hadoop.hbase.regionserver.HRegion.internalFlushcache(HRegion.java:995)
>>       at org.apache.hadoop.hbase.regionserver.HRegion.internalFlushcache(HRegion.java:900)
>>       at org.apache.hadoop.hbase.regionserver.HRegion.flushcache(HRegion.java:852)
>>       at org.apache.hadoop.hbase.regionserver.MemStoreFlusher.flushRegion(MemStoreFlusher.java:392)
>>       at org.apache.hadoop.hbase.regionserver.MemStoreFlusher.flushRegion(MemStoreFlusher.java:366)
>>       at org.apache.hadoop.hbase.regionserver.MemStoreFlusher.run(MemStoreFlusher.java:240)
>> Caused by: java.lang.ArrayIndexOutOfBoundsException
>>       at org.apache.hadoop.fs.FSOutputSummer.write(FSOutputSummer.java:83)
>>       at org.apache.hadoop.fs.FSDataOutputStream$PositionCache.write(FSDataOutputStream.java:49)
>>       at java.io.DataOutputStream.write(DataOutputStream.java:90)
>>       at java.io.DataOutputStream.write(DataOutputStream.java:90)
>>       at org.apache.hadoop.hbase.io.hfile.HFile$Writer.append(HFile.java:544)
>>       at org.apache.hadoop.hbase.io.hfile.HFile$Writer.append(HFile.java:501)
>>       at org.apache.hadoop.hbase.regionserver.StoreFile$Writer.append(StoreFile.java:836)
>>       at org.apache.hadoop.hbase.regionserver.Store.internalFlushCache(Store.java:479)
>>       at org.apache.hadoop.hbase.regionserver.Store.flushCache(Store.java:448)
>>       at org.apache.hadoop.hbase.regionserver.Store.access$100(Store.java:81)
>>       at org.apache.hadoop.hbase.regionserver.Store$StoreFlusherImpl.flushCache(Store.java:1513)
>>       at org.apache.hadoop.hbase.regionserver.HRegion.internalFlushcache(HRegion.java:973)
>>       ... 5 more
>> 2011-05-11 14:18:31,067 INFO org.apache.hadoop.hbase.regionserver.HRegionServer: Dump of metrics: request=4233.9, regions=11, stores=22, storefiles=48, storefileIndexSize=8, memstoreSize=483, compactionQueueSize=0, flushQueueSize=0, usedHeap=941, maxHeap=4087, blockCacheSize=412883432, blockCacheFree=444366808, blockCacheCount=6172, blockCacheHitCount=6181, blockCacheMissCount=556608, blockCacheEvictedCount=0, blockCacheHitRatio=1, blockCacheHitCachingRatio=8
>> 2011-05-11 14:18:31,067 INFO org.apache.hadoop.hbase.regionserver.HRegionServer: STOPPED: Replay of HLog required. Forcing server shutdown
>> 2011-05-11 14:18:31,067 INFO org.apache.hadoop.hbase.regionserver.MemStoreFlusher: regionserver60020.cacheFlusher exiting
>>
>>
>> HBase version is 0.90.2 and Hadoop is built from branch-0.20-append.
>>
>>
>>
>> Has anyone experienced something similar or has an idea where we can start
>> looking?
>>
>>
>>
>> Thanks!
>>
>>
>>
>> Chris
>>
>>
>>
>>
>
>
