I found this in the regionserver log on one machine; the region server shut
down shortly after:

2010-03-05 23:44:37,859 WARN  [DataStreamer for file /hbase/.logs/
snv-it-lin-010.projectrialto.com,60020,1267695848448/hlog.dat.1267860383622]
hdfs.DFSClient$DFSOutputStream(2589): Error Recovery for block
blk_6820048136829787576_478281 failed  because recovery from primary
datanode 10.10.31.135:50010 failed 5 times.  Pipeline was 10.10.31.135:50010.
Will retry...
2010-03-05 23:44:38,865 WARN  [DataStreamer for file /hbase/.logs/
snv-it-lin-010.projectrialto.com,60020,1267695848448/hlog.dat.1267860383622]
hdfs.DFSClient$DFSOutputStream(2583): Error Recovery for block
blk_6820048136829787576_478281 failed  because recovery from primary
datanode 10.10.31.135:50010 failed 6 times.  Pipeline was 10.10.31.135:50010.
Aborting...
2010-03-05 23:44:38,866 ERROR [regionserver/10.10.31.135:60020]
regionserver.HRegionServer(631): Unable to close log in abort
java.io.IOException: Error Recovery for block blk_6820048136829787576_478281
failed  because recovery from primary datanode 10.10.31.135:50010 failed 6
times.  Pipeline was 10.10.31.135:50010. Aborting...
        at
org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.processDatanodeError(DFSClient.java:2584)
        at
org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.access$1600(DFSClient.java:2078)
        at
org.apache.hadoop.hdfs.DFSClient$DFSOutputStream$DataStreamer.run(DFSClient.java:2241)
2010-03-05 23:44:38,866 DEBUG [regionserver/10.10.31.135:60020]
regionserver.HRegionServer(1669): closing region
ignoreTable,com.india-forums.www/forum_posts.asp?TID=1274472,1266968531214
2010-03-05 23:44:38,866 DEBUG [regionserver/10.10.31.135:60020]
regionserver.HRegion(453): Closing
ignoreTable,com.india-forums.www\x2Fforum_posts.asp\x3FTID\x3D1274472,1266968531214:
compactions & flushes disabled
2010-03-05 23:44:38,867 DEBUG [regionserver/10.10.31.135:60020]
regionserver.HRegion(470): Updates disabled for region, no outstanding
scanners on
ignoreTable,com.india-forums.www\x2Fforum_posts.asp\x3FTID\x3D1274472,1266968531214

Here is the result of 'fsck /hbase':
...
/hbase/domaincrawltable/116384076/txt/4886186747089330505:  Under replicated
blk_7285175333095642722_478442. Target Replicas is 3 but found 2 replica(s).
.......................................................
.....................................................................................
Status: HEALTHY
 Total size:    13749366171 B
 Total dirs:    275
 Total files:   285 (Files currently being written: 2)
 Total blocks (validated):      417 (avg. block size 32972101 B)
 Minimally replicated blocks:   417 (100.0 %)
 Over-replicated blocks:        0 (0.0 %)
 Under-replicated blocks:       11 (2.6378896 %)
 Mis-replicated blocks:         0 (0.0 %)
 Default replication factor:    3
 Average block replication:     2.9736211
 Corrupt blocks:                0
 Missing replicas:              11 (0.88709676 %)
 Number of data-nodes:          3
 Number of racks:               1
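
As a sanity check on the fsck summary above, the percentages can be reproduced from the raw counts. This is only a sketch: it assumes the under-replicated percentage is taken over total blocks and the missing-replica percentage over the replicas actually present, which is inferred from the numbers rather than from the fsck source. The variable names are illustrative.

```python
# Counts copied from the fsck summary above.
total_blocks = 417
under_replicated = 11
missing_replicas = 11
target_replication = 3

# Assumption: each under-replicated block is short by exactly one replica,
# matching "Target Replicas is 3 but found 2 replica(s)".
present_replicas = total_blocks * target_replication - missing_replicas

avg_replication = present_replicas / total_blocks
under_pct = 100.0 * under_replicated / total_blocks
# Assumption: this percentage is relative to replicas present, not expected.
missing_pct = 100.0 * missing_replicas / present_replicas

print(f"present replicas:   {present_replicas}")
print(f"avg. replication:   {avg_replication:.4f}")
print(f"under-replicated %: {under_pct:.4f}")
print(f"missing replicas %: {missing_pct:.4f}")
```

Under these assumptions the computed values agree with the report, so the 11 missing replicas correspond to 11 blocks holding two replicas each.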

If you can shed some light on how this might happen, or on how to resolve it,
that would be great.

Here is an excerpt of the log from datanode 10.10.31.135:

2010-03-05 23:41:52,333 INFO
org.apache.hadoop.hdfs.server.datanode.DataNode.clienttrace: src: /
10.10.31.135:50010, dest: /10.10.31.135:43906, bytes: 5011, op: HDFS_READ,
cliID: DFSClient_-854338598, srvID:
DS-1802582900-10.10.30.104-50010-1249540398456, blockid:
blk_-8220999320966573627_478276
2010-03-05 23:44:26,112 INFO
org.apache.hadoop.hdfs.server.datanode.DataNode: DatanodeCommand action:
DNA_REGISTER
2010-03-05 23:44:26,795 INFO
org.apache.hadoop.hdfs.server.datanode.DataNode: Exception in receiveBlock
for block blk_6820048136829787576_478281
java.nio.channels.ClosedByInterruptException
2010-03-05 23:44:26,795 INFO
org.apache.hadoop.hdfs.server.datanode.DataNode: writeBlock
blk_6820048136829787576_478281 received exception java.io.IOException:
Interrupted receiveBlock
2010-03-05 23:44:26,795 ERROR
org.apache.hadoop.hdfs.server.datanode.DataNode: DatanodeRegistration(
10.10.31.135:50010,
storageID=DS-1802582900-10.10.30.104-50010-1249540398456, infoPort=50075,
ipcPort=50020):DataXceiver
java.io.IOException: Interrupted receiveBlock
        at
org.apache.hadoop.hdfs.server.datanode.BlockReceiver.receiveBlock(BlockReceiver.java:569)
        at
org.apache.hadoop.hdfs.server.datanode.DataXceiver.writeBlock(DataXceiver.java:357)
        at
org.apache.hadoop.hdfs.server.datanode.DataXceiver.run(DataXceiver.java:103)
        at java.lang.Thread.run(Thread.java:619)
2010-03-05 23:44:26,796 INFO
org.apache.hadoop.hdfs.server.datanode.DataNode: PacketResponder
blk_6820048136829787576_478281 1 Exception java.net.SocketException: Socket
closed
        at java.net.SocketInputStream.read(SocketInputStream.java:162)
        at java.io.DataInputStream.readFully(DataInputStream.java:178)
        at java.io.DataInputStream.readLong(DataInputStream.java:399)
        at
org.apache.hadoop.hdfs.server.datanode.BlockReceiver$PacketResponder.run(BlockReceiver.java:853)
        at java.lang.Thread.run(Thread.java:619)

2010-03-05 23:44:26,796 INFO
org.apache.hadoop.hdfs.server.datanode.DataNode: PacketResponder
blk_6820048136829787576_478281 1 : Thread is interrupted.
2010-03-05 23:44:26,796 INFO
org.apache.hadoop.hdfs.server.datanode.DataNode: PacketResponder 1 for block
blk_6820048136829787576_478281 terminating
2010-03-05 23:44:26,797 INFO
org.apache.hadoop.hdfs.server.datanode.DataNode: Received block
blk_6820048136829787576_478416 of size 19767296 as part of lease recovery.
2010-03-05 23:44:28,054 INFO
org.apache.hadoop.hdfs.server.datanode.DataBlockScanner: Verification
succeeded for blk_-197353992341078477_457049
2010-03-05 23:44:30,798 INFO
org.apache.hadoop.hdfs.server.datanode.DataNode: BlockReport of 80261 blocks
got processed in 4000 msecs
2010-03-05 23:44:32,814 INFO
org.apache.hadoop.hdfs.server.datanode.DataNode: Receiving block
blk_-8837772549203757719_478317 src: /10.10.31.137:40420 dest: /
10.10.31.135:50010
2010-03-05 23:44:32,815 INFO
org.apache.hadoop.hdfs.server.datanode.DataNode: Receiving block
blk_-9195341056633376217_478409 src: /10.10.31.136:38576 dest: /
10.10.31.135:50010
2010-03-05 23:44:32,816 INFO
org.apache.hadoop.hdfs.server.datanode.DataNode: Received block
blk_-9195341056633376217_478409 src: /10.10.31.136:38576 dest: /
10.10.31.135:50010 of size 387
2010-03-05 23:44:32,817 INFO
org.apache.hadoop.hdfs.server.datanode.DataNode: Received block
blk_-8837772549203757719_478317 src: /10.10.31.137:40420 dest: /
10.10.31.135:50010 of size 25429
2010-03-05 23:44:33,807 INFO
org.apache.hadoop.hdfs.server.datanode.DataNode: Deleting block
blk_-8961344666669238364_478288 file
/disk3/opt/kindsight/hadoop/data/dfs/data/current/subdir22/subdir59/blk_-8961344666669238364
2010-03-05 23:44:33,807 INFO
org.apache.hadoop.hdfs.server.datanode.DataNode: Deleting block
blk_-6720040806211326547_478305 file
/disk4/opt/kindsight/hadoop/data/dfs/data/current/subdir56/subdir5/blk_-6720040806211326547
2010-03-05 23:44:33,825 INFO
org.apache.hadoop.hdfs.server.datanode.DataNode: Deleting block
blk_-977571326835424573_478298 file
/opt/kindsight/hadoop/data/dfs/data/current/subdir61/subdir26/blk_-977571326835424573
2010-03-05 23:44:33,825 INFO
org.apache.hadoop.hdfs.server.datanode.DataNode: Deleting block
blk_-878425890325495654_478304 file
/disk3/opt/kindsight/hadoop/data/dfs/data/current/subdir22/subdir47/blk_-878425890325495654
2010-03-05 23:44:33,826 INFO
org.apache.hadoop.hdfs.server.datanode.DataNode: Deleting block
blk_656787642109994178_478290 file
/opt/kindsight/hadoop/data/dfs/data/current/subdir61/subdir26/blk_656787642109994178
2010-03-05 23:44:33,826 INFO
org.apache.hadoop.hdfs.server.datanode.DataNode: Deleting block
blk_1073497695673238763_478300 file
/disk3/opt/kindsight/hadoop/data/dfs/data/current/subdir22/subdir47/blk_1073497695673238763
2010-03-05 23:44:33,827 INFO
org.apache.hadoop.hdfs.server.datanode.DataNode: Client calls
recoverBlock(block=blk_6820048136829787576_478281, targets=[
10.10.31.135:50010])
2010-03-05 23:44:33,829 INFO org.apache.hadoop.ipc.Server: IPC Server
handler 13 on 50020, call recoverBlock(blk_6820048136829787576_478281,
false, [Lorg.apache.hadoop.hdfs.protocol.DatanodeInfo;@44355f02) from
10.10.31.135:52441: error: org.apache.hadoop.ipc.RemoteException:
java.io.IOException: blk_6820048136829787576_478281 is already commited,
storedBlock == null.
        at
org.apache.hadoop.hdfs.server.namenode.FSNamesystem.nextGenerationStampForBlock(FSNamesystem.java:4676)
        at
org.apache.hadoop.hdfs.server.namenode.NameNode.nextGenerationStamp(NameNode.java:473)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
        at
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
        at java.lang.reflect.Method.invoke(Method.java:597)
        at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:508)
        at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:959)
        at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:955)
        at java.security.AccessController.doPrivileged(Native Method)
        at javax.security.auth.Subject.doAs(Subject.java:396)
        at org.apache.hadoop.ipc.Server$Handler.run(Server.java:953)

org.apache.hadoop.ipc.RemoteException: java.io.IOException:
blk_6820048136829787576_478281 is already commited, storedBlock == null.
        at
org.apache.hadoop.hdfs.server.namenode.FSNamesystem.nextGenerationStampForBlock(FSNamesystem.java:4676)
        at
org.apache.hadoop.hdfs.server.namenode.NameNode.nextGenerationStamp(NameNode.java:473)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
        at
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
        at java.lang.reflect.Method.invoke(Method.java:597)
        at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:508)
        at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:959)
        at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:955)
        at java.security.AccessController.doPrivileged(Native Method)
        at javax.security.auth.Subject.doAs(Subject.java:396)
        at org.apache.hadoop.ipc.Server$Handler.run(Server.java:953)

        at org.apache.hadoop.ipc.Client.call(Client.java:739)
        at org.apache.hadoop.ipc.RPC$Invoker.invoke(RPC.java:220)
        at $Proxy0.nextGenerationStamp(Unknown Source)
        at
org.apache.hadoop.hdfs.server.datanode.DataNode.syncBlock(DataNode.java:1550)
        at
org.apache.hadoop.hdfs.server.datanode.DataNode.recoverBlock(DataNode.java:1524)
        at
org.apache.hadoop.hdfs.server.datanode.DataNode.recoverBlock(DataNode.java:1590)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
        at
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
        at java.lang.reflect.Method.invoke(Method.java:597)
        at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:508)
        at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:959)
        at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:955)
        at java.security.AccessController.doPrivileged(Native Method)
        at javax.security.auth.Subject.doAs(Subject.java:396)
        at org.apache.hadoop.ipc.Server$Handler.run(Server.java:953)
2010-03-05 23:44:33,831 INFO
org.apache.hadoop.hdfs.server.datanode.DataNode: Deleting block
blk_1345490507338325022_478299 file
/disk2/opt/kindsight/hadoop/data/dfs/data/current/subdir5/subdir57/blk_1345490507338325022
2010-03-05 23:44:33,843 INFO
org.apache.hadoop.hdfs.server.datanode.DataNode: Deleting block
blk_7963383895657483589_478296 file
/disk3/opt/kindsight/hadoop/data/dfs/data/current/subdir22/subdir17/blk_7963383895657483589
2010-03-05 23:44:34,835 INFO
org.apache.hadoop.hdfs.server.datanode.DataNode: Client calls
recoverBlock(block=blk_6820048136829787576_478281, targets=[
10.10.31.135:50010])
2010-03-05 23:44:34,837 INFO org.apache.hadoop.ipc.Server: IPC Server
handler 15 on 50020, call recoverBlock(blk_6820048136829787576_478281,
false, [Lorg.apache.hadoop.hdfs.protocol.DatanodeInfo;@25bd85b5) from
10.10.31.135:52441: error: org.apache.hadoop.ipc.RemoteException:
java.io.IOException: Block (=blk_6820048136829787576_478281) not found
        at
org.apache.hadoop.hdfs.server.namenode.FSNamesystem.commitBlockSynchronization(FSNamesystem.java:1897)
        at
org.apache.hadoop.hdfs.server.namenode.NameNode.commitBlockSynchronization(NameNode.java:481)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
        at
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
        at java.lang.reflect.Method.invoke(Method.java:597)
        at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:508)
        at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:959)
        at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:955)
        at java.security.AccessController.doPrivileged(Native Method)
        at javax.security.auth.Subject.doAs(Subject.java:396)
        at org.apache.hadoop.ipc.Server$Handler.run(Server.java:953)

org.apache.hadoop.ipc.RemoteException: java.io.IOException: Block
(=blk_6820048136829787576_478281) not found
        at
org.apache.hadoop.hdfs.server.namenode.FSNamesystem.commitBlockSynchronization(FSNamesystem.java:1897)
        at
org.apache.hadoop.hdfs.server.namenode.NameNode.commitBlockSynchronization(NameNode.java:481)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
        at
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
        at java.lang.reflect.Method.invoke(Method.java:597)
        at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:508)
        at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:959)
        at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:955)
        at java.security.AccessController.doPrivileged(Native Method)
        at javax.security.auth.Subject.doAs(Subject.java:396)
        at org.apache.hadoop.ipc.Server$Handler.run(Server.java:953)

        at org.apache.hadoop.ipc.Client.call(Client.java:739)
        at org.apache.hadoop.ipc.RPC$Invoker.invoke(RPC.java:220)
        at $Proxy0.commitBlockSynchronization(Unknown Source)
        at
org.apache.hadoop.hdfs.server.datanode.DataNode.syncBlock(DataNode.java:1543)
        at
org.apache.hadoop.hdfs.server.datanode.DataNode.recoverBlock(DataNode.java:1524)
        at
org.apache.hadoop.hdfs.server.datanode.DataNode.recoverBlock(DataNode.java:1590)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
        at
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
        at java.lang.reflect.Method.invoke(Method.java:597)
        at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:508)
        at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:959)
        at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:955)
        at java.security.AccessController.doPrivileged(Native Method)
        at javax.security.auth.Subject.doAs(Subject.java:396)
        at org.apache.hadoop.ipc.Server$Handler.run(Server.java:953)
2010-03-05 23:44:35,781 INFO
org.apache.hadoop.hdfs.server.datanode.DataNode: Receiving block
blk_-5453101949124244660_478413 src: /10.10.31.136:38577 dest: /
10.10.31.135:50010
2010-03-05 23:44:35,782 INFO
org.apache.hadoop.hdfs.server.datanode.DataNode: Received block
blk_-5453101949124244660_478413 src: /10.10.31.136:38577 dest: /
10.10.31.135:50010 of size 4133
2010-03-05 23:44:35,783 INFO
org.apache.hadoop.hdfs.server.datanode.DataNode: Receiving block
blk_-5148454955563266670_478361 src: /10.10.31.136:38578 dest: /
10.10.31.135:50010
2010-03-05 23:44:35,784 INFO
org.apache.hadoop.hdfs.server.datanode.DataNode: Received block
blk_-5148454955563266670_478361 src: /10.10.31.136:38578 dest: /
10.10.31.135:50010 of size 121
2010-03-05 23:44:35,846 INFO
org.apache.hadoop.hdfs.server.datanode.DataNode: Client calls
recoverBlock(block=blk_6820048136829787576_478281, targets=[
10.10.31.135:50010])
2010-03-05 23:44:35,847 INFO org.apache.hadoop.ipc.Server: IPC Server
handler 17 on 50020, call recoverBlock(blk_6820048136829787576_478281,
false, [Lorg.apache.hadoop.hdfs.protocol.DatanodeInfo;@423c489e) from
10.10.31.135:52441: error: org.apache.hadoop.ipc.RemoteException:
java.io.IOException: Block (=blk_6820048136829787576_478281) not found
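
To follow a single block through a long datanode log, it can help to match on the block ID while ignoring the generation stamp suffix. Below is a minimal sketch; the helper name `lines_for_block` is hypothetical, and the sample lines are condensed from the excerpt above (wrapped lines joined, dates and class names dropped).

```python
import re

def lines_for_block(log_text, block_id):
    """Return (generation_stamp, line) pairs for lines mentioning block_id.

    Hypothetical helper: it matches names of the form blk_<id>_<genstamp>.
    """
    pattern = re.compile(r"blk_" + re.escape(block_id) + r"_(\d+)")
    hits = []
    for line in log_text.splitlines():
        m = pattern.search(line)
        if m:
            hits.append((int(m.group(1)), line.strip()))
    return hits

# Sample lines condensed from the datanode log excerpt above.
sample = """\
23:44:26,795 writeBlock blk_6820048136829787576_478281 received exception java.io.IOException: Interrupted receiveBlock
23:44:26,797 Received block blk_6820048136829787576_478416 of size 19767296 as part of lease recovery.
23:44:28,054 Verification succeeded for blk_-197353992341078477_457049
23:44:33,827 Client calls recoverBlock(block=blk_6820048136829787576_478281, targets=[10.10.31.135:50010])
"""

hits = lines_for_block(sample, "6820048136829787576")
for stamp, line in hits:
    print(stamp, line)

# Distinct generation stamps under which this block ID appears.
stamps = sorted({stamp for stamp, _ in hits})
print("generation stamps seen:", stamps)
```

In the excerpt, the datanode records this block with generation stamp 478416 after lease recovery, while the client keeps calling recoverBlock with the older stamp 478281. That may be related to the "not found" errors, though the logs alone do not confirm it.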
