2010/2/4 Michał Podsiadłowski <[email protected]>:
> Hi all,
> I wrote yesterday evening (of my time :)) about missing file and today i did
> a restart of whole hbase and it looks like problem disappeared. According to
> my taste it looks like either client or region server "forgets" about table
> split and still tries to retrieve data from old region while there are
> already 2 new daughters. My misconfiguration or some bug - maybe some
> threading issue. This is not the first time I've seen this. 2 days ago I've
> ended up testing on the same error but then I thought that it was due to
> datanodes having problems with persisting files due to disks out of space.
> This time there was plenty of space on all nodes.

Next time, when you see something like this:

>> java.io.IOException: java.io.IOException: Cannot open filename
>> /hbase/filmContributors/1670715971/content/3783592739034234831

...try getting it with a new client as in:

$ ./bin/hadoop fs -get /hbase/filmContributors/1670715971/content/3783592739034234831 .

Does it work?  If so, then the dfsclient hbase is using has 'rotted'.
This is usually caused by one, or a combination, of: ulimit not raised
above its default, xceivers not raised above its default, or (unlikely)
not having a patched hadoop in your hbase CLASSPATH (hbase needs
HDFS-127; the hadoop jar that ships in the hbase/lib dir has this patch
applied).
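
For reference, here is a minimal sketch of the usual checks and fixes.
The file locations and values below are illustrative assumptions; pick
numbers that suit your install and load:

# Check the open-file limit for the user running hbase/hdfs
$ ulimit -n          # 1024 is a common default; hbase wants far more

# Raise it, e.g. via /etc/security/limits.conf (assumed location):
#   hadoop  -  nofile  32768

# Raise the datanode transceiver ceiling in conf/hdfs-site.xml
# (the stock default of 256 is easy to exhaust under hbase):
<property>
  <name>dfs.datanode.max.xcievers</name>
  <value>2047</value>
</property>

Restart the datanodes (and re-login so the new ulimit applies) for
both settings to take effect.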


The below is really bad, usually indicative of a stressed hdfs (or one
not configured for the load it's taking on):

> IOException: Could not complete write to file
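
A quick way to see whether hdfs itself is struggling is to poke it
with the stock hadoop CLI (a sketch; the log paths below are
assumptions for a default install):

# Capacity, and the status of each datanode
$ ./bin/hadoop dfsadmin -report

# Check the files and blocks under the hbase root
$ ./bin/hadoop fsck /hbase -files -blocks -locations

# On the datanodes, grep the logs for the usual suspects
$ grep -i 'exceeds the limit of concurrent xcievers' logs/*datanode*
$ grep -i 'Too many open files' logs/*datanode*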

I tried to follow your pastebin link but it is empty for me.  Does it work for you?
St.Ack

>
> Thanks,
> Michal
>
> On 3 February 2010, at 17:14, Michał Podsiadłowski <
> [email protected]> wrote:
>
>> Hi,
>> it's me again having problem - hope this is not another misconfiguration
>> problem ( or maybe it would be better it it was one).
>> After loading some moderate amount of data - around 3GB some rows are not
>> available due to strange exceptions
>>
>> java.io.IOException: java.io.IOException: Cannot open filename
>> /hbase/filmContributors/1670715971/content/3783592739034234831
>>
>> When trying to scan the table, the region server pukes like this:
>>
>> 2010-02-03 16:03:39,060 INFO org.apache.hadoop.ipc.HBaseServer: IPC Server handler 7 on 60020, call next(2813423168169765496, 1) from 10.0.100.50:41364: error: java.io.IOException: java.lang.RuntimeException: java.io.IOException: Cannot open filename /hbase/filmContributors/1670715971/content/3783592739034234831
>> java.io.IOException: java.lang.RuntimeException: java.io.IOException: Cannot open filename /hbase/filmContributors/1670715971/content/3783592739034234831
>>     at org.apache.hadoop.hbase.regionserver.HRegionServer.convertThrowableToIOE(HRegionServer.java:872)
>>     at org.apache.hadoop.hbase.regionserver.HRegionServer.convertThrowableToIOE(HRegionServer.java:862)
>>     at org.apache.hadoop.hbase.regionserver.HRegionServer.next(HRegionServer.java:1918)
>>     at sun.reflect.GeneratedMethodAccessor9.invoke(Unknown Source)
>>     at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
>>     at java.lang.reflect.Method.invoke(Method.java:597)
>>     at org.apache.hadoop.hbase.ipc.HBaseRPC$Server.call(HBaseRPC.java:657)
>>     at org.apache.hadoop.hbase.ipc.HBaseServer$Handler.run(HBaseServer.java:915)
>> Caused by: java.lang.RuntimeException: java.io.IOException: Cannot open filename /hbase/filmContributors/1670715971/content/3783592739034234831
>>     at org.apache.hadoop.hbase.regionserver.StoreFileScanner.next(StoreFileScanner.java:61)
>>     at org.apache.hadoop.hbase.regionserver.KeyValueHeap.next(KeyValueHeap.java:79)
>>     at org.apache.hadoop.hbase.regionserver.StoreScanner.next(StoreScanner.java:164)
>>     at org.apache.hadoop.hbase.regionserver.KeyValueHeap.next(KeyValueHeap.java:106)
>>     at org.apache.hadoop.hbase.regionserver.HRegion$RegionScanner.nextInternal(HRegion.java:1807)
>>     at org.apache.hadoop.hbase.regionserver.HRegion$RegionScanner.next(HRegion.java:1771)
>>     at org.apache.hadoop.hbase.regionserver.HRegionServer.next(HRegionServer.java:1894)
>>     ... 5 more
>> Caused by: java.io.IOException: Cannot open filename /hbase/filmContributors/1670715971/content/3783592739034234831
>>     at org.apache.hadoop.hdfs.DFSClient$DFSInputStream.openInfo(DFSClient.java:1474)
>>     at org.apache.hadoop.hdfs.DFSClient$DFSInputStream.chooseDataNode(DFSClient.java:1800)
>>     at org.apache.hadoop.hdfs.DFSClient$DFSInputStream.blockSeekTo(DFSClient.java:1616)
>>     at org.apache.hadoop.hdfs.DFSClient$DFSInputStream.read(DFSClient.java:1743)
>>     at java.io.DataInputStream.read(DataInputStream.java:132)
>>     at org.apache.hadoop.hbase.io.hfile.BoundedRangeFileInputStream.read(BoundedRangeFileInputStream.java:99)
>>     at org.apache.hadoop.io.IOUtils.readFully(IOUtils.java:100)
>>     at org.apache.hadoop.hbase.io.hfile.HFile$Reader.decompress(HFile.java:1020)
>>     at org.apache.hadoop.hbase.io.hfile.HFile$Reader.readBlock(HFile.java:971)
>>     at org.apache.hadoop.hbase.io.hfile.HFile$Reader$Scanner.next(HFile.java:1163)
>>     at org.apache.hadoop.hbase.io.HalfHFileReader$1.next(HalfHFileReader.java:125)
>>     at org.apache.hadoop.hbase.regionserver.StoreFileScanner.next(StoreFileScanner.java:58)
>>     ... 11 more
>>
>>
>> grepping the regionserver log for dir name *1670715971* shows this:
>> 2010-02-03 15:32:37,082 DEBUG org.apache.hadoop.hbase.regionserver.Store: loaded /hbase/filmContributors/314440477/content/3783592739034234831.1670715971, isReference=true, sequence id=7541774, length=33390929, majorCompaction=false
>> 2010-02-03 15:32:37,088 DEBUG org.apache.hadoop.hbase.regionserver.Store: loaded /hbase/filmContributors/314440477/content/6518523095287027530.1670715971, isReference=true, sequence id=7542003, length=7890, majorCompaction=false
>> 2010-02-03 15:32:37,095 DEBUG org.apache.hadoop.hbase.regionserver.Store: loaded /hbase/filmContributors/314440477/description/2305635563712489918.1670715971, isReference=true, sequence id=7542003, length=2256, majorCompaction=false
>> 2010-02-03 15:32:37,101 DEBUG org.apache.hadoop.hbase.regionserver.Store: loaded /hbase/filmContributors/314440477/description/6970032752270852156.1670715971, isReference=true, sequence id=7541774, length=6664268, majorCompaction=false
>> 2010-02-03 15:32:37,129 DEBUG org.apache.hadoop.hbase.regionserver.Store: loaded /hbase/filmContributors/1836766931/content/3783592739034234831.1670715971, isReference=true, sequence id=7541773, length=33390929, majorCompaction=false
>> 2010-02-03 15:32:37,152 DEBUG org.apache.hadoop.hbase.regionserver.Store: loaded /hbase/filmContributors/1836766931/content/6518523095287027530.1670715971, isReference=true, sequence id=7542002, length=7890, majorCompaction=false
>> 2010-02-03 15:32:37,165 DEBUG org.apache.hadoop.hbase.regionserver.Store: loaded /hbase/filmContributors/1836766931/description/2305635563712489918.1670715971, isReference=true, sequence id=7542002, length=2256, majorCompaction=false
>> 2010-02-03 15:32:37,170 DEBUG org.apache.hadoop.hbase.regionserver.Store: loaded /hbase/filmContributors/1836766931/description/6970032752270852156.1670715971, isReference=true, sequence id=7541773, length=6664268, majorCompaction=false
>> 2010-02-03 15:33:49,943 WARN org.apache.hadoop.hdfs.DFSClient: DFS Read: java.io.IOException: Cannot open filename /hbase/filmContributors/1670715971/content/3783592739034234831
>> and, many many times: java.io.IOException: Cannot open filename /hbase/filmContributors/*1670715971*/content/3783592739034234831
>>
>> *on a different one I found this:*
>>
>> 2010-02-03 15:32:35,512 DEBUG org.apache.hadoop.hbase.regionserver.HRegion: Cleaned up /hbase/filmContributors/*1670715971*/splits true
>> 2010-02-03 15:32:35,515 INFO org.apache.hadoop.hbase.regionserver.CompactSplitThread: region split, META updated, and report to master all successful. Old region=REGION => {NAME => 'filmContributors,,1265203126633', STARTKEY => '', ENDKEY => '31587', ENCODED => *1670715971*, OFFLINE => true, SPLIT => true, TABLE => {{NAME => 'filmContributors', MAX_FILESIZE => '268435456', FAMILIES => [{NAME => 'content', COMPRESSION => 'NONE', VERSIONS => '1', TTL => '2147483647', BLOCKSIZE => '65536', IN_MEMORY => 'false', BLOCKCACHE => 'true'}, {NAME => 'description', COMPRESSION => 'NONE', VERSIONS => '1', TTL => '2147483647', BLOCKSIZE => '65536', IN_MEMORY => 'false', BLOCKCACHE => 'true'}]}}, new regions: filmContributors,,1265207555247, filmContributors,117416,1265207555247. Split took 0sec
>> more details here - http://pastebin.com/d7c52f27a
>>
>> Also, sometimes in the namenode logs I can see messages like this:
>>
>> 2010-02-03 15:32:38,416 INFO org.apache.hadoop.ipc.Server: IPC Server handler 3 on 54310, call complete(/hbase/filmContributors/compaction.dir/1836766931/2633146516707160051, DFSClient_-902184734) from 10.0.100.51:49692: error: java.io.IOException: Could not complete write to file /hbase/filmContributors/compaction.dir/1836766931/2633146516707160051 by DFSClient_-902184734
>> java.io.IOException: Could not complete write to file /hbase/filmContributors/compaction.dir/1836766931/2633146516707160051 by DFSClient_-902184734
>>
>>
>> Please help.
>>
>> Cheers,
>> Michal
>>
>
