"java.io.IOException: Could not obtain block": although the file is there and 
accessible through the DFS client
------------------------------------------------------------------------------------------------------------

                 Key: HBASE-1078
                 URL: https://issues.apache.org/jira/browse/HBASE-1078
             Project: Hadoop HBase
          Issue Type: Bug
         Environment: hadoop 0.19.0
hbase 0.19.0-dev, r728134
            Reporter: Thibaut


Hi,
after doing some more stress testing, my cluster just stopped working. The 
regionserver responsible for the ROOT region can't read a block belonging to 
the ROOT region, even though the block is definitely there: I can read the 
file through the DFS client.

All new clients fail to start:

java.io.IOException: java.io.IOException: Could not obtain block: blk_-3504243288385983835_18732 file=/hbase/-ROOT-/70236052/info/mapfiles/780254459775584115/data
        at org.apache.hadoop.hdfs.DFSClient$DFSInputStream.chooseDataNode(DFSClient.java:1708)
        at org.apache.hadoop.hdfs.DFSClient$DFSInputStream.blockSeekTo(DFSClient.java:1536)
        at org.apache.hadoop.hdfs.DFSClient$DFSInputStream.read(DFSClient.java:1663)
        at org.apache.hadoop.hdfs.DFSClient$DFSInputStream.read(DFSClient.java:1593)
        at java.io.DataInputStream.readInt(DataInputStream.java:370)
        at org.apache.hadoop.io.SequenceFile$Reader.readRecordLength(SequenceFile.java:1895)
        at org.apache.hadoop.io.SequenceFile$Reader.next(SequenceFile.java:1925)
        at org.apache.hadoop.io.SequenceFile$Reader.next(SequenceFile.java:1830)
        at org.apache.hadoop.io.SequenceFile$Reader.next(SequenceFile.java:1876)
        at org.apache.hadoop.io.MapFile$Reader.next(MapFile.java:517)
        at org.apache.hadoop.hbase.regionserver.HStore.rowAtOrBeforeFromMapFile(HStore.java:1709)
        at org.apache.hadoop.hbase.regionserver.HStore.getRowKeyAtOrBefore(HStore.java:1681)
        at org.apache.hadoop.hbase.regionserver.HRegion.getClosestRowBefore(HRegion.java:1072)
        at org.apache.hadoop.hbase.regionserver.HRegionServer.getClosestRowBefore(HRegionServer.java:1466)
        at sun.reflect.GeneratedMethodAccessor8.invoke(Unknown Source)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
        at java.lang.reflect.Method.invoke(Method.java:597)
        at org.apache.hadoop.hbase.ipc.HBaseRPC$Server.call(HBaseRPC.java:632)
        at org.apache.hadoop.hbase.ipc.HBaseServer$Handler.run(HBaseServer.java:894)

        at sun.reflect.GeneratedConstructorAccessor13.newInstance(Unknown Source)
        at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:27)
        at java.lang.reflect.Constructor.newInstance(Constructor.java:513)
        at org.apache.hadoop.hbase.RemoteExceptionHandler.decodeRemoteException(RemoteExceptionHandler.java:95)
        at org.apache.hadoop.hbase.client.HConnectionManager$TableServers.locateRegionInMeta(HConnectionManager.java:550)
        at org.apache.hadoop.hbase.client.HConnectionManager$TableServers.locateRegion(HConnectionManager.java:450)
        at org.apache.hadoop.hbase.client.HConnectionManager$TableServers.relocateRegion(HConnectionManager.java:422)
        at org.apache.hadoop.hbase.client.HConnectionManager$TableServers.locateRegionInMeta(HConnectionManager.java:559)
        at org.apache.hadoop.hbase.client.HConnectionManager$TableServers.locateRegion(HConnectionManager.java:454)
        at org.apache.hadoop.hbase.client.HConnectionManager$TableServers.locateRegion(HConnectionManager.java:415)
        at org.apache.hadoop.hbase.client.HTable.<init>(HTable.java:113)
        at org.apache.hadoop.hbase.client.HTable.<init>(HTable.java:96)


Clients that were already connected keep working. (Presumably because they 
have cached the location of the ROOT region?)

This seems to have happened after one of the region servers (number 3) shut 
itself down due to exceptions (EOFException, "Unable to create new block", 
etc.; see its log file). The ROOT region then presumably got moved to region 
server 2.

I attached the logs (DEBUG enabled) of the HDFS namenode, the HBase master 
node, and the log files of region servers 2 and 3. I can provide any 
additional information on request.



The filesystem is in a healthy state. I can also download the file with the 
hadoop fs command without any problem and without getting any error about 
missing blocks. fsck reports:

Status: HEALTHY
 Total size:    142881532319 B (Total open files size: 12415139840 B)
 Total dirs:    4153
 Total files:   3541 (Files currently being written: 106)
 Total blocks (validated):      5208 (avg. block size 27435010 B) (Total open 
file blocks (not validated): 205)
 Minimally replicated blocks:   5208 (100.0 %)
 Over-replicated blocks:        0 (0.0 %)
 Under-replicated blocks:       0 (0.0 %)
 Mis-replicated blocks:         0 (0.0 %)
 Default replication factor:    4
 Average block replication:     4.0
 Corrupt blocks:                0
 Missing replicas:              0 (0.0 %)
 Number of data-nodes:          7
 Number of racks:               1
The filesystem under path '/' is HEALTHY
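
For reference, the health check and the manual read above can be reproduced 
with the stock Hadoop CLI (the exact invocations below are my reconstruction, 
not copied from a session; the file path is the one from the stack trace):

```shell
# Filesystem health summary (this is what produced the report above).
hadoop fsck /

# Fetch the MapFile data file that the regionserver fails to read.
# This succeeds from a plain DFS client even while the regionserver
# keeps hitting "Could not obtain block".
hadoop fs -get /hbase/-ROOT-/70236052/info/mapfiles/780254459775584115/data /tmp/root-data
```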


-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.