What's happening on "bad datanode 192.168.0.175:50010" at the time?  It's
struggling badly.  Appends are taking > 10 seconds.
St.Ack

On Tue, Jun 2, 2009 at 1:00 PM, llpind <[email protected]> wrote:

>
> Here are the changes I've made.
> - Moved to Gigabit ethernet
> - upgraded to HBase 0.19.3
> - Run upload program from master box (instead of local)
> - Added 5 more nodes (now with 8 region servers and 1 master; the same
> boxes also host Hadoop datanodes).
>
>
> The upload got much further (45M+ of 58M), but I still lost one region
> server between 45M and 46M.
> Here is the HBase log from that region server:
> =====================================================
> 2009-06-02 11:21:24,334 INFO
> org.apache.hadoop.hbase.regionserver.HRegion: starting  compaction on
> region
> tableA,row_keyA,1243964035709
> 2009-06-02 11:21:24,336 DEBUG org.apache.hadoop.hbase.regionserver.HStore:
> 1008565632/entity: no store files to compact
> 2009-06-02 11:21:24,346 DEBUG org.apache.hadoop.hbase.regionserver.HStore:
> Compaction size of 1008565632/link: 361.0m; Skipped 2 file(s), size:
> 337039946
> 2009-06-02 11:21:24,392 INFO org.apache.hadoop.hbase.regionserver.HLog:
> Closed
>
> hdfs://ats181:54310/hbase/log_192.168.0.175_1243890243326_60020/hlog.dat.1243966872067,
> entries=115001. New log writer:
> /hbase/log_192.168.0.175_1243890243326_60020/hlog.dat.1243966884387
> 2009-06-02 11:21:24,392 DEBUG org.apache.hadoop.hbase.regionserver.HLog:
> Last sequence written is empty. Deleting all old hlogs
> 2009-06-02 11:21:24,392 INFO org.apache.hadoop.hbase.regionserver.HLog:
> removing old log file
> /hbase/log_192.168.0.175_1243890243326_60020/hlog.dat.1243966854582 whose
> highest sequence/edit id is 26714173
> 2009-06-02 11:21:24,395 INFO org.apache.hadoop.hbase.regionserver.HLog:
> removing old log file
> /hbase/log_192.168.0.175_1243890243326_60020/hlog.dat.1243966859798 whose
> highest sequence/edit id is 26819173
> 2009-06-02 11:21:24,401 INFO org.apache.hadoop.hbase.regionserver.HLog:
> removing old log file
> /hbase/log_192.168.0.175_1243890243326_60020/hlog.dat.1243966867599 whose
> highest sequence/edit id is 26924173
> 2009-06-02 11:21:24,404 INFO org.apache.hadoop.hbase.regionserver.HLog:
> removing old log file
> /hbase/log_192.168.0.175_1243890243326_60020/hlog.dat.1243966872067 whose
> highest sequence/edit id is 27039174
> 2009-06-02 11:21:24,412 DEBUG org.apache.hadoop.hbase.regionserver.HStore:
> Started compaction of 2 file(s)  into
> /hbase/tableA/compaction.dir/1008565632/link/mapfiles/1956913417104407709
> 2009-06-02 11:21:37,145 WARN org.apache.hadoop.hdfs.DFSClient: DataStreamer
> Exception: java.net.SocketTimeoutException: 10000 millis timeout while
> waiting for channel to be ready for write. ch :
> java.nio.channels.SocketChannel[connected local=/192.168.0.175:36139
> remote=/192.168.0.175:50010]
>        at
>
> org.apache.hadoop.net.SocketIOWithTimeout.doIO(SocketIOWithTimeout.java:162)
>        at
> org.apache.hadoop.net.SocketOutputStream.write(SocketOutputStream.java:146)
>        at
> org.apache.hadoop.net.SocketOutputStream.write(SocketOutputStream.java:107)
>        at java.io.BufferedOutputStream.write(BufferedOutputStream.java:105)
>        at java.io.DataOutputStream.write(DataOutputStream.java:90)
>        at
>
> org.apache.hadoop.hdfs.DFSClient$DFSOutputStream$DataStreamer.run(DFSClient.java:2209)
>
> 2009-06-02 11:21:37,146 WARN org.apache.hadoop.hbase.regionserver.HLog: IPC
> Server handler 1 on 60020 took 10003ms appending an edit to HLog;
> editcount=41936
> 2009-06-02 11:21:37,146 WARN org.apache.hadoop.hdfs.DFSClient: Error
> Recovery for block blk_7404574899445503675_2363 bad datanode[0]
> 192.168.0.175:50010
> 2009-06-02 11:21:37,147 WARN org.apache.hadoop.hdfs.DFSClient: Error
> Recovery for block blk_7404574899445503675_2363 in pipeline
> 192.168.0.175:50010, 192.168.0.194:50010: bad datanode 192.168.0.175:50010
> 2009-06-02 11:21:37,405 WARN org.apache.hadoop.hdfs.DFSClient: DataStreamer
> Exception: java.net.SocketTimeoutException: 10000 millis timeout while
> waiting for channel to be ready for write. ch :
> java.nio.channels.SocketChannel[connected local=/192.168.0.175:36141
> remote=/192.168.0.175:50010]
>        at
>
> org.apache.hadoop.net.SocketIOWithTimeout.doIO(SocketIOWithTimeout.java:162)
>        at
> org.apache.hadoop.net.SocketOutputStream.write(SocketOutputStream.java:146)
>        at
> org.apache.hadoop.net.SocketOutputStream.write(SocketOutputStream.java:107)
>        at java.io.BufferedOutputStream.write(BufferedOutputStream.java:105)
>        at java.io.DataOutputStream.write(DataOutputStream.java:90)
>        at
>
> org.apache.hadoop.hdfs.DFSClient$DFSOutputStream$DataStreamer.run(DFSClient.java:2209)
>
> 2009-06-02 11:21:37,406 WARN org.apache.hadoop.hdfs.DFSClient: Error
> Recovery for block blk_-1937654560341865886_2364 bad datanode[0]
> 192.168.0.175:50010
> 2009-06-02 11:21:37,406 WARN org.apache.hadoop.hdfs.DFSClient: Error
> Recovery for block blk_-1937654560341865886_2364 in pipeline
> 192.168.0.175:50010, 192.168.0.195:50010: bad datanode 192.168.0.175:50010
> 2009-06-02 11:21:40,754 WARN org.apache.hadoop.hbase.regionserver.HLog: IPC
> Server handler 1 on 60020 took 3608ms appending an edit to HLog;
> editcount=41940
> 2009-06-02 11:21:40,806 WARN org.apache.hadoop.hbase.regionserver.HLog:
> regionserver/0.0.0.0:60020.logFlusher took 11285ms optional sync to HLog;
> editcount=45000
> 2009-06-02 11:21:44,692 INFO org.apache.hadoop.hbase.regionserver.HLog:
> Closed
>
> hdfs://ats181:54310/hbase/log_192.168.0.175_1243890243326_60020/hlog.dat.1243966878324,
> entries=105000. New log writer:
> /hbase/log_192.168.0.175_1243890243326_60020/hlog.dat.1243966904686
> 2009-06-02 11:21:44,692 DEBUG org.apache.hadoop.hbase.regionserver.HLog:
> Found 0 logs to remove  out of total 1; oldest outstanding seqnum is
> 27039175 from region tableA,row_keyA,1243964035709
> 2009-06-02 11:21:47,133 DEBUG org.apache.hadoop.hbase.regionserver.HStore:
> moving
> /hbase/tableA/compaction.dir/1008565632/link/mapfiles/1956913417104407709
> to
> /hbase/tableA/1008565632/link/mapfiles/8666639134401806411
> 2009-06-02 11:21:47,228 DEBUG org.apache.hadoop.hbase.regionserver.HStore:
> Completed  compaction of 1008565632/link store size is 360.5m
> 2009-06-02 11:21:47,244 INFO org.apache.hadoop.hbase.regionserver.HRegion:
> compaction completed on region tableA,row_keyA,1243964035709 in 22sec
> 2009-06-02 11:21:47,276 DEBUG org.apache.hadoop.hbase.regionserver.HRegion:
> Startkey and midkey are same, not splitting
> 2009-06-02 11:21:50,460 INFO org.apache.hadoop.hbase.regionserver.HLog:
> Closed
>
> hdfs://ats181:54310/hbase/log_192.168.0.175_1243890243326_60020/hlog.dat.1243966884387,
> entries=105000. New log writer:
> /hbase/log_192.168.0.175_1243890243326_60020/hlog.dat.1243966910456
> 2009-06-02 11:21:50,460 DEBUG org.apache.hadoop.hbase.regionserver.HLog:
> Found 0 logs to remove  out of total 2; oldest outstanding seqnum is
> 27039175 from region tableA,row_keyA,1243964035709
> 2009-06-02 11:21:54,857 DEBUG org.apache.hadoop.hbase.regionserver.HRegion:
> Flush requested on tableA,row_keyA,1243964035709
> 2009-06-02 11:21:54,857 DEBUG org.apache.hadoop.hbase.regionserver.HRegion:
> Started memcache flush for region tableA,row_keyA,1243964035709. Current
> region memcache size 64.5m
> 2009-06-02 11:22:05,786 WARN org.apache.hadoop.hdfs.DFSClient: DataStreamer
> Exception: java.net.SocketTimeoutException: 10000 millis timeout while
> waiting for channel to be ready for write. ch :
> java.nio.channels.SocketChannel[connected local=/192.168.0.175:36162
> remote=/192.168.0.175:50010]
>        at
>
> org.apache.hadoop.net.SocketIOWithTimeout.doIO(SocketIOWithTimeout.java:162)
>        at
> org.apache.hadoop.net.SocketOutputStream.write(SocketOutputStream.java:146)
>        at
> org.apache.hadoop.net.SocketOutputStream.write(SocketOutputStream.java:107)
>        at java.io.BufferedOutputStream.write(BufferedOutputStream.java:105)
>        at java.io.DataOutputStream.write(DataOutputStream.java:90)
>        at
>
> org.apache.hadoop.hdfs.DFSClient$DFSOutputStream$DataStreamer.run(DFSClient.java:2209)
>
> 2009-06-02 11:22:05,787 WARN org.apache.hadoop.hdfs.DFSClient: Error
> Recovery for block blk_7847038903223219269_2371 bad datanode[0]
> 192.168.0.175:50010
> 2009-06-02 11:22:05,787 WARN org.apache.hadoop.hdfs.DFSClient: Error
> Recovery for block blk_7847038903223219269_2371 in pipeline
> 192.168.0.175:50010, 192.168.0.179:50010: bad datanode 192.168.0.175:50010
> 2009-06-02 11:22:12,025 INFO org.apache.hadoop.hbase.regionserver.HRegion:
> Blocking updates for 'IPC Server handler 3 on 60020' on region
> tableA,row_keyA,1243964035709: Memcache size 128.2m is >= than blocking
> 128.0m size
> 2009-06-02 11:22:52,772 WARN org.apache.hadoop.hdfs.DFSClient: Error
> Recovery for block blk_7847038903223219269_2371 failed  because recovery
> from primary datanode 192.168.0.179:50010 failed 1 times. Will retry...
> 2009-06-02 11:22:52,776 WARN org.apache.hadoop.hdfs.DFSClient: Error
> Recovery for block blk_7847038903223219269_2371 bad datanode[0]
> 192.168.0.175:50010
> 2009-06-02 11:22:52,776 WARN org.apache.hadoop.hdfs.DFSClient: Error
> Recovery for block blk_7847038903223219269_2371 in pipeline
> 192.168.0.175:50010, 192.168.0.179:50010: bad datanode 192.168.0.175:50010
> 2009-06-02 11:22:58,311 WARN org.apache.hadoop.hdfs.DFSClient: DataStreamer
> Exception: java.net.SocketTimeoutException: 5000 millis timeout while
> waiting for channel to be ready for write. ch :
> java.nio.channels.SocketChannel[connected local=/192.168.0.175:58191
> remote=/192.168.0.179:50010]
>        at
>
> org.apache.hadoop.net.SocketIOWithTimeout.doIO(SocketIOWithTimeout.java:162)
>        at
> org.apache.hadoop.net.SocketOutputStream.write(SocketOutputStream.java:146)
>        at
> org.apache.hadoop.net.SocketOutputStream.write(SocketOutputStream.java:107)
>        at java.io.BufferedOutputStream.write(BufferedOutputStream.java:105)
>        at java.io.DataOutputStream.write(DataOutputStream.java:90)
>        at
>
> org.apache.hadoop.hdfs.DFSClient$DFSOutputStream$DataStreamer.run(DFSClient.java:2209)
>
> 2009-06-02 11:22:58,311 WARN org.apache.hadoop.hdfs.DFSClient: Error
> Recovery for block blk_7847038903223219269_2373 bad datanode[0]
> 192.168.0.179:50010
> 2009-06-02 11:22:58,351 FATAL
> org.apache.hadoop.hbase.regionserver.MemcacheFlusher: Replay of hlog
> required. Forcing server shutdown
> org.apache.hadoop.hbase.DroppedSnapshotException: region:
> tableA,row_keyA,1243964035709
>        at
>
> org.apache.hadoop.hbase.regionserver.HRegion.internalFlushcache(HRegion.java:903)
>        at
> org.apache.hadoop.hbase.regionserver.HRegion.flushcache(HRegion.java:796)
>        at
>
> org.apache.hadoop.hbase.regionserver.MemcacheFlusher.flushRegion(MemcacheFlusher.java:265)
>        at
>
> org.apache.hadoop.hbase.regionserver.MemcacheFlusher.run(MemcacheFlusher.java:148)
> Caused by: java.io.IOException: All datanodes 192.168.0.179:50010 are bad.
> Aborting...
>        at
>
> org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.processDatanodeError(DFSClient.java:2444)
>        at
>
> org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.access$1600(DFSClient.java:1996)
>        at
>
> org.apache.hadoop.hdfs.DFSClient$DFSOutputStream$DataStreamer.run(DFSClient.java:2160)
>
> 2009-06-02 11:22:58,354 INFO
> org.apache.hadoop.hbase.regionserver.HRegionServer: Dump of metrics:
> request=0.0, regions=1, stores=2, storefiles=3, storefileIndexSize=0,
> memcacheSize=128, usedHeap=172, maxHeap=2999
> 2009-06-02 11:22:58,355 INFO
> org.apache.hadoop.hbase.regionserver.MemcacheFlusher:
> regionserver/0.0.0.0:60020.cacheFlusher exiting
> 2009-06-02 11:22:58,370 INFO org.apache.hadoop.hbase.regionserver.HLog:
> Closed
>
> hdfs://ats181:54310/hbase/log_192.168.0.175_1243890243326_60020/hlog.dat.1243966904686,
> entries=440000. New log writer:
> /hbase/log_192.168.0.175_1243890243326_60020/hlog.dat.1243966978364
> 2009-06-02 11:22:58,370 DEBUG org.apache.hadoop.hbase.regionserver.HLog:
> Found 0 logs to remove  out of total 3; oldest outstanding seqnum is
> 27039175 from region tableA,row_keyA,1243964035709
> 2009-06-02 11:22:58,370 INFO
> org.apache.hadoop.hbase.regionserver.LogRoller:
> LogRoller exiting.
> 2009-06-02 11:23:00,839 INFO
> org.apache.hadoop.hbase.regionserver.LogFlusher:
> regionserver/0.0.0.0:60020.logFlusher exiting
> 2009-06-02 11:23:01,347 DEBUG org.apache.hadoop.hbase.RegionHistorian:
> Offlined
> 2009-06-02 11:23:01,347 INFO org.apache.hadoop.ipc.HBaseServer: Stopping
> server on 60020
> 2009-06-02 11:23:01,348 DEBUG org.apache.hadoop.hbase.regionserver.HRegion:
> Flush requested on tableA,row_keyA,1243964035709
> 2009-06-02 11:23:01,348 INFO
> org.apache.hadoop.hbase.regionserver.HRegionServer: Stopping infoServer
> 2009-06-02 11:23:01,348 INFO org.apache.hadoop.ipc.HBaseServer: Stopping
> IPC
> Server listener on 60020
> 2009-06-02 11:23:01,348 INFO org.apache.hadoop.ipc.HBaseServer: IPC Server
> handler 0 on 60020: exiting
> 2009-06-02 11:23:01,348 INFO org.apache.hadoop.ipc.HBaseServer: IPC Server
> handler 5 on 60020: exiting
> 2009-06-02 11:23:01,348 INFO org.apache.hadoop.ipc.HBaseServer: IPC Server
> handler 2 on 60020: exiting
> 2009-06-02 11:23:01,348 INFO org.mortbay.util.ThreadedServer: Stopping
> Acceptor ServerSocket[addr=0.0.0.0/0.0.0.0,port=0,localport=60030]
> 2009-06-02 11:23:01,350 INFO org.apache.hadoop.ipc.HBaseServer: Stopping
> IPC
> Server Responder
> 2009-06-02 11:23:01,349 INFO org.apache.hadoop.ipc.HBaseServer: IPC Server
> handler 7 on 60020: exiting
> 2009-06-02 11:23:01,349 INFO org.apache.hadoop.ipc.HBaseServer: IPC Server
> handler 6 on 60020: exiting
> 2009-06-02 11:23:01,349 INFO org.apache.hadoop.ipc.HBaseServer: IPC Server
> handler 8 on 60020: exiting
> 2009-06-02 11:23:01,349 INFO org.apache.hadoop.ipc.HBaseServer: IPC Server
> handler 4 on 60020: exiting
> 2009-06-02 11:23:01,349 INFO org.apache.hadoop.ipc.HBaseServer: IPC Server
> handler 9 on 60020: exiting
> 2009-06-02 11:23:01,349 INFO org.apache.hadoop.ipc.HBaseServer: IPC Server
> handler 1 on 60020: exiting
> 2009-06-02 11:23:01,355 INFO org.mortbay.http.SocketListener: Stopped
> SocketListener on 0.0.0.0:60030
> 2009-06-02 11:23:02,531 INFO org.mortbay.util.Container: Stopped
> HttpContext[/logs,/logs]
> 2009-06-02 11:23:02,532 INFO org.mortbay.util.Container: Stopped
> org.mortbay.jetty.servlet.webapplicationhand...@79884a40
> 2009-06-02 11:23:03,294 INFO org.mortbay.util.Container: Stopped
> WebApplicationContext[/static,/static]
> 2009-06-02 11:23:03,295 INFO org.mortbay.util.Container: Stopped
> org.mortbay.jetty.servlet.webapplicationhand...@794e113b
> 2009-06-02 11:23:04,438 INFO org.mortbay.util.Container: Stopped
> WebApplicationContext[/,/]
> 2009-06-02 11:23:04,438 INFO org.mortbay.util.Container: Stopped
> org.mortbay.jetty.ser...@4e4ee70b
> 2009-06-02 11:23:04,438 DEBUG org.apache.hadoop.hbase.regionserver.HLog:
> closing log writer in
> hdfs://ats181:54310/hbase/log_192.168.0.175_1243890243326_60020
> 2009-06-02 11:23:04,439 INFO
> org.apache.hadoop.hbase.regionserver.CompactSplitThread:
> regionserver/0.0.0.0:60020.compactor exiting
> 2009-06-02 11:23:04,439 INFO
> org.apache.hadoop.hbase.regionserver.HRegionServer$MajorCompactionChecker:
> regionserver/0.0.0.0:60020.majorCompactionChecker exiting
> 2009-06-02 11:23:04,814 INFO
> org.apache.hadoop.hbase.regionserver.HRegionServer: On abort, closed hlog
> 2009-06-02 11:23:04,815 DEBUG
> org.apache.hadoop.hbase.regionserver.HRegionServer: closing region
> tableA,row_keyA,1243964035709
> 2009-06-02 11:23:04,815 DEBUG org.apache.hadoop.hbase.regionserver.HRegion:
> Closing tableA,row_keyA,1243964035709: compactions & flushes disabled
> 2009-06-02 11:23:04,815 DEBUG org.apache.hadoop.hbase.regionserver.HRegion:
> Updates disabled for region, no outstanding scanners on
> tableA,row_keyA,1243964035709
> 2009-06-02 11:23:04,815 DEBUG org.apache.hadoop.hbase.regionserver.HRegion:
> No more row locks outstanding on region tableA,row_keyA,1243964035709
> 2009-06-02 11:23:04,815 DEBUG org.apache.hadoop.hbase.regionserver.HStore:
> closed 1008565632/entity
> 2009-06-02 11:23:04,815 DEBUG org.apache.hadoop.hbase.regionserver.HStore:
> closed 1008565632/link
> 2009-06-02 11:23:04,815 INFO org.apache.hadoop.hbase.regionserver.HRegion:
> Closed tableA,row_keyA,1243964035709
> 2009-06-02 11:23:05,362 INFO
> org.apache.hadoop.hbase.regionserver.HRegionServer: aborting server at:
> 192.168.0.175:60020
> 2009-06-02 11:23:07,463 INFO
> org.apache.hadoop.hbase.regionserver.HRegionServer: worker thread exiting
> 2009-06-02 11:23:07,567 INFO
> org.apache.hadoop.hbase.regionserver.HRegionServer:
> regionserver/0.0.0.0:60020 exiting
> 2009-06-02 11:23:08,511 INFO org.apache.hadoop.hbase.Leases:
> regionserver/0.0.0.0:60020.leaseChecker closing leases
> 2009-06-02 11:23:08,511 INFO org.apache.hadoop.hbase.Leases:
> regionserver/0.0.0.0:60020.leaseChecker closed leases
> 2009-06-02 11:23:08,512 INFO
> org.apache.hadoop.hbase.regionserver.HRegionServer: Starting shutdown
> thread.
> 2009-06-02 11:23:08,513 INFO
> org.apache.hadoop.hbase.regionserver.HRegionServer: Shutdown thread
> complete
>
> =====================================================
>
>
>
> Here is the log from the datanode:
>
>
> =====================================================
> 2009-06-02 12:14:27,084 INFO
> org.apache.hadoop.hdfs.server.datanode.DataNode.clienttrace: src:
> /192.168.0.193:59337, dest: /192.168.0.175:50010, bytes: 9, op:
> HDFS_WRITE,
> cliID: DFSClient_632050003, srvID:
> DS-661194258-192.168.0.175-50010-1243890204046, blockid:
> blk_-1486458067537042098_2495
> 2009-06-02 12:14:27,085 INFO
> org.apache.hadoop.hdfs.server.datanode.DataNode: PacketResponder 0 for
> block
> blk_-1486458067537042098_2495 terminating
> 2009-06-02 12:14:34,702 INFO
> org.apache.hadoop.hdfs.server.datanode.DataNode: Deleting block
> blk_-428154950044527208_2466 file
> /data/hadoop-datastore/current/blk_-428154950044527208
> 2009-06-02 12:14:40,271 INFO
> org.apache.hadoop.hdfs.server.datanode.DataNode: Receiving block
> blk_-3632713526855093302_2498 src: /192.168.0.193:59345 dest:
> /192.168.0.175:50010
> 2009-06-02 12:14:41,230 INFO
> org.apache.hadoop.hdfs.server.datanode.DataNode.clienttrace: src:
> /192.168.0.193:59345, dest: /192.168.0.175:50010, bytes: 20739360, op:
> HDFS_WRITE, cliID: DFSClient_632050003, srvID:
> DS-661194258-192.168.0.175-50010-1243890204046, blockid:
> blk_-3632713526855093302_2498
> 2009-06-02 12:14:41,231 INFO
> org.apache.hadoop.hdfs.server.datanode.DataNode: PacketResponder 0 for
> block
> blk_-3632713526855093302_2498 terminating
> 2009-06-02 12:14:41,484 INFO
> org.apache.hadoop.hdfs.server.datanode.DataNode: Receiving block
> blk_2475895093255279384_2500 src: /192.168.0.193:59357 dest:
> /192.168.0.175:50010
> 2009-06-02 12:14:43,727 INFO
> org.apache.hadoop.hdfs.server.datanode.DataNode: Deleting block
> blk_-5066455597866254073_2486 file
> /data/hadoop-datastore/current/blk_-5066455597866254073
> 2009-06-02 12:14:54,199 INFO
> org.apache.hadoop.hdfs.server.datanode.DataNode.clienttrace: src:
> /192.168.0.193:59357, dest: /192.168.0.175:50010, bytes: 10893600, op:
> HDFS_WRITE, cliID: DFSClient_632050003, srvID:
> DS-661194258-192.168.0.175-50010-1243890204046, blockid:
> blk_2475895093255279384_2500
> 2009-06-02 12:14:54,200 INFO
> org.apache.hadoop.hdfs.server.datanode.DataNode: PacketResponder 0 for
> block
> blk_2475895093255279384_2500 terminating
> 2009-06-02 12:15:05,863 INFO
> org.apache.hadoop.hdfs.server.datanode.DataNode: Receiving block
> blk_3906931575379674856_2506 src: /192.168.0.193:59385 dest:
> /192.168.0.175:50010
> 2009-06-02 12:15:06,589 INFO
> org.apache.hadoop.hdfs.server.datanode.DataNode.clienttrace: src:
> /192.168.0.193:59385, dest: /192.168.0.175:50010, bytes: 11819362, op:
> HDFS_WRITE, cliID: DFSClient_632050003, srvID:
> DS-661194258-192.168.0.175-50010-1243890204046, blockid:
> blk_3906931575379674856_2506
> 2009-06-02 12:15:06,590 INFO
> org.apache.hadoop.hdfs.server.datanode.DataNode: PacketResponder 0 for
> block
> blk_3906931575379674856_2506 terminating
> 2009-06-02 12:15:06,628 INFO
> org.apache.hadoop.hdfs.server.datanode.DataNode: Receiving block
> blk_1104353063740638283_2506 src: /192.168.0.193:59387 dest:
> /192.168.0.175:50010
> 2009-06-02 12:15:06,633 INFO
> org.apache.hadoop.hdfs.server.datanode.DataNode.clienttrace: src:
> /192.168.0.193:59387, dest: /192.168.0.175:50010, bytes: 27402, op:
> HDFS_WRITE, cliID: DFSClient_632050003, srvID:
> DS-661194258-192.168.0.175-50010-1243890204046, blockid:
> blk_1104353063740638283_2506
> 2009-06-02 12:15:06,633 INFO
> org.apache.hadoop.hdfs.server.datanode.DataNode: PacketResponder 0 for
> block
> blk_1104353063740638283_2506 terminating
> 2009-06-02 12:15:07,191 INFO
> org.apache.hadoop.hdfs.server.datanode.DataNode: Receiving block
> blk_-5517132013528737139_2519 src: /192.168.0.193:59403 dest:
> /192.168.0.175:50010
> 2009-06-02 12:15:07,195 INFO
> org.apache.hadoop.hdfs.server.datanode.DataNode.clienttrace: src:
> /192.168.0.193:59403, dest: /192.168.0.175:50010, bytes: 41, op:
> HDFS_WRITE,
> cliID: DFSClient_632050003, srvID:
> DS-661194258-192.168.0.175-50010-1243890204046, blockid:
> blk_-5517132013528737139_2519
> 2009-06-02 12:15:07,195 INFO
> org.apache.hadoop.hdfs.server.datanode.DataNode: PacketResponder 0 for
> block
> blk_-5517132013528737139_2519 terminating
> 2009-06-02 12:15:13,744 INFO
> org.apache.hadoop.hdfs.server.datanode.DataNode: Deleting block
> blk_-6636143128531703017_2450 file
> /data/hadoop-datastore/current/blk_-6636143128531703017
> 2009-06-02 12:15:13,758 INFO
> org.apache.hadoop.hdfs.server.datanode.DataNode: Deleting block
> blk_-3632713526855093302_2498 file
> /data/hadoop-datastore/current/blk_-3632713526855093302
> 2009-06-02 12:15:13,759 INFO
> org.apache.hadoop.hdfs.server.datanode.DataNode: Deleting block
> blk_-1486458067537042098_2495 file
> /data/hadoop-datastore/current/blk_-1486458067537042098
> 2009-06-02 12:16:55,410 INFO
> org.apache.hadoop.hdfs.server.datanode.DataNode: Receiving block
> blk_-2882612487393794888_2540 src: /192.168.0.193:59521 dest:
> /192.168.0.175:50010
> 2009-06-02 12:16:55,415 INFO
> org.apache.hadoop.hdfs.server.datanode.DataNode.clienttrace: src:
> /192.168.0.193:59521, dest: /192.168.0.175:50010, bytes: 110, op:
> HDFS_WRITE, cliID: DFSClient_632050003, srvID:
> DS-661194258-192.168.0.175-50010-1243890204046, blockid:
> blk_-2882612487393794888_2540
> 2009-06-02 12:16:55,415 INFO
> org.apache.hadoop.hdfs.server.datanode.DataNode: PacketResponder 0 for
> block
> blk_-2882612487393794888_2540 terminating
> 2009-06-02 12:17:01,895 INFO
> org.apache.hadoop.hdfs.server.datanode.DataNode: Deleting block
> blk_-5517132013528737139_2519 file
> /data/hadoop-datastore/current/blk_-5517132013528737139
> 2009-06-02 12:17:56,840 INFO
> org.apache.hadoop.hdfs.server.datanode.DataNode: Receiving block
> blk_-5045036725303448421_2546 src: /192.168.0.193:39258 dest:
> /192.168.0.175:50010
> 2009-06-02 12:17:56,845 INFO
> org.apache.hadoop.hdfs.server.datanode.DataNode.clienttrace: src:
> /192.168.0.193:39258, dest: /192.168.0.175:50010, bytes: 117, op:
> HDFS_WRITE, cliID: DFSClient_632050003, srvID:
> DS-661194258-192.168.0.175-50010-1243890204046, blockid:
> blk_-5045036725303448421_2546
> 2009-06-02 12:17:56,845 INFO
> org.apache.hadoop.hdfs.server.datanode.DataNode: PacketResponder 0 for
> block
> blk_-5045036725303448421_2546 terminating
> 2009-06-02 12:18:32,037 INFO
> org.apache.hadoop.hdfs.server.datanode.DataNode: BlockReport of 32 blocks
> got processed in 21 msecs
> 2009-06-02 12:19:52,477 INFO
> org.apache.hadoop.hdfs.server.datanode.DataNode.clienttrace: src:
> /192.168.0.194:41274, dest: /192.168.0.175:50010, bytes: 12792946, op:
> HDFS_WRITE, cliID: DFSClient_1389518311, srvID:
> DS-661194258-192.168.0.175-50010-1243890204046, blockid:
> blk_8099366148547328126_2340
> 2009-06-02 12:19:52,478 INFO
> org.apache.hadoop.hdfs.server.datanode.DataNode: PacketResponder 0 for
> block
> blk_8099366148547328126_2340 terminating
> 2009-06-02 12:19:52,730 INFO
> org.apache.hadoop.hdfs.server.datanode.DataNode.clienttrace: src:
> /192.168.0.175:50010, dest: /192.168.0.181:38194, bytes: 10978708, op:
> HDFS_READ, cliID: DFSClient_400382585, srvID:
> DS-661194258-192.168.0.175-50010-1243890204046, blockid:
> blk_2475895093255279384_2500
> 2009-06-02 12:19:58,051 INFO
> org.apache.hadoop.hdfs.server.datanode.DataNode: Receiving block
> blk_-2411001974305504436_2549 src: /192.168.0.193:39260 dest:
> /192.168.0.175:50010
> 2009-06-02 12:19:58,053 INFO
> org.apache.hadoop.hdfs.server.datanode.DataNode: Receiving block
> blk_-8510344733259320018_2549 src: /192.168.0.191:37451 dest:
> /192.168.0.175:50010
> 2009-06-02 12:20:02,142 INFO
> org.apache.hadoop.hdfs.server.datanode.DataNode: Deleting block
> blk_-5045036725303448421_2546 file
> /data/hadoop-datastore/current/blk_-5045036725303448421
> 2009-06-02 12:20:02,150 INFO
> org.apache.hadoop.hdfs.server.datanode.DataNode: Deleting block
> blk_2475895093255279384_2500 file
> /data/hadoop-datastore/current/blk_2475895093255279384
> 2009-06-02 12:20:33,895 INFO
> org.apache.hadoop.hdfs.server.datanode.DataNode: Client calls
> recoverBlock(block=blk_-2411001974305504436_2549,
> targets=[192.168.0.191:50010, 192.168.0.180:50010, 192.168.0.194:50010,
> 192.168.0.179:50010, 192.168.0.193:50010, 192.168.0.175:50010])
> 2009-06-02 12:20:34,574 INFO
> org.apache.hadoop.hdfs.server.datanode.DataNode: Client calls
> recoverBlock(block=blk_-8510344733259320018_2549,
> targets=[192.168.0.194:50010, 192.168.0.196:50010, 192.168.0.180:50010,
> 192.168.0.179:50010, 192.168.0.191:50010, 192.168.0.175:50010])
> 2009-06-02 12:20:47,130 INFO
> org.apache.hadoop.hdfs.server.datanode.DataNode: Exception in receiveBlock
> for block blk_-8510344733259320018_2549 java.net.SocketException:
> Connection
> reset
> 2009-06-02 12:20:47,131 INFO
> org.apache.hadoop.hdfs.server.datanode.DataNode: PacketResponder 0 for
> block
> blk_-8510344733259320018_2549 Interrupted.
> 2009-06-02 12:20:47,131 INFO
> org.apache.hadoop.hdfs.server.datanode.DataNode: PacketResponder 0 for
> block
> blk_-8510344733259320018_2549 terminating
> 2009-06-02 12:20:47,131 INFO
> org.apache.hadoop.hdfs.server.datanode.DataNode: writeBlock
> blk_-8510344733259320018_2549 received exception java.net.SocketException:
> Connection reset
> 2009-06-02 12:20:47,131 ERROR
> org.apache.hadoop.hdfs.server.datanode.DataNode:
> DatanodeRegistration(192.168.0.175:50010,
> storageID=DS-661194258-192.168.0.175-50010-1243890204046, infoPort=50075,
> ipcPort=50020):DataXceiver
> java.net.SocketException: Connection reset
>        at java.net.SocketInputStream.read(SocketInputStream.java:168)
>        at java.io.BufferedInputStream.read1(BufferedInputStream.java:256)
>        at java.io.BufferedInputStream.read(BufferedInputStream.java:317)
>        at java.io.DataInputStream.read(DataInputStream.java:132)
>        at
>
> org.apache.hadoop.hdfs.server.datanode.BlockReceiver.readToBuf(BlockReceiver.java:261)
>        at
>
> org.apache.hadoop.hdfs.server.datanode.BlockReceiver.readNextPacket(BlockReceiver.java:308)
>        at
>
> org.apache.hadoop.hdfs.server.datanode.BlockReceiver.receivePacket(BlockReceiver.java:372)
>        at
>
> org.apache.hadoop.hdfs.server.datanode.BlockReceiver.receiveBlock(BlockReceiver.java:524)
>        at
>
> org.apache.hadoop.hdfs.server.datanode.DataXceiver.writeBlock(DataXceiver.java:357)
>        at
>
> org.apache.hadoop.hdfs.server.datanode.DataXceiver.run(DataXceiver.java:103)
>        at java.lang.Thread.run(Thread.java:619)
> 2009-06-02 12:20:47,138 WARN
> org.apache.hadoop.hdfs.server.protocol.InterDatanodeProtocol: Failed to
> updateBlock (newblock=blk_-8510344733259320018_2551,
> datanode=192.168.0.179:50010)
> org.apache.hadoop.ipc.RemoteException: java.io.IOException: Block
> blk_-8510344733259320018_2551 length is 11177984 does not match block file
> length 11899392
>        at
>
> org.apache.hadoop.hdfs.server.datanode.FSDataset.validateBlockMetadata(FSDataset.java:1259)
>        at
>
> org.apache.hadoop.hdfs.server.datanode.FSDataset.tryUpdateBlock(FSDataset.java:898)
>        at
>
> org.apache.hadoop.hdfs.server.datanode.FSDataset.updateBlock(FSDataset.java:810)
>        at
>
> org.apache.hadoop.hdfs.server.datanode.DataNode.updateBlock(DataNode.java:1381)
>        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>        at
>
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
>        at
>
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
>        at java.lang.reflect.Method.invoke(Method.java:597)
>        at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:481)
>        at org.apache.hadoop.ipc.Server$Handler.run(Server.java:894)
>
>        at org.apache.hadoop.ipc.Client.call(Client.java:697)
>        at org.apache.hadoop.ipc.RPC$Invoker.invoke(RPC.java:216)
>        at $Proxy6.updateBlock(Unknown Source)
>        at
>
> org.apache.hadoop.hdfs.server.datanode.DataNode.syncBlock(DataNode.java:1510)
>        at
>
> org.apache.hadoop.hdfs.server.datanode.DataNode.recoverBlock(DataNode.java:1479)
>        at
>
> org.apache.hadoop.hdfs.server.datanode.DataNode.recoverBlock(DataNode.java:1545)
>        at sun.reflect.GeneratedMethodAccessor3.invoke(Unknown Source)
>        at
>
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
>        at java.lang.reflect.Method.invoke(Method.java:597)
>        at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:481)
>        at org.apache.hadoop.ipc.Server$Handler.run(Server.java:894)
> 2009-06-02 12:20:47,142 INFO
> org.apache.hadoop.hdfs.server.datanode.DataNode:
> oldblock=blk_-8510344733259320018_2549(length=11184128),
> newblock=blk_-8510344733259320018_2551(length=11177984),
> datanode=192.168.0.175:50010
> 2009-06-02 12:20:47,193 INFO
> org.apache.hadoop.hdfs.server.datanode.DataNode: Receiving block
> blk_-8510344733259320018_2551 src: /192.168.0.191:37454 dest:
> /192.168.0.175:50010
> 2009-06-02 12:20:47,193 INFO
> org.apache.hadoop.hdfs.server.datanode.DataNode: Reopen already-open Block
> for append blk_-8510344733259320018_2551
> 2009-06-02 12:20:47,237 INFO
> org.apache.hadoop.hdfs.server.datanode.DataNode: Changing block file offset
> of block blk_-8510344733259320018_2551 from 0 to 11119104 meta file offset
> to 86875
> 2009-06-02 12:20:50,326 INFO
> org.apache.hadoop.hdfs.server.datanode.DataNode.clienttrace: src:
> /192.168.0.191:37454, dest: /192.168.0.175:50010, bytes: 67108864, op:
> HDFS_WRITE, cliID: DFSClient_400382585, srvID:
> DS-661194258-192.168.0.175-50010-1243890204046, blockid:
> blk_-8510344733259320018_2551
> 2009-06-02 12:20:50,534 INFO
> org.apache.hadoop.hdfs.server.datanode.DataNode: PacketResponder 0 for
> block
> blk_-8510344733259320018_2551 terminating
> 2009-06-02 12:20:50,724 INFO
> org.apache.hadoop.hdfs.server.datanode.DataNode: Receiving block
> blk_-5221330074279244492_2551 src: /192.168.0.179:37758 dest:
> /192.168.0.175:50010
> 2009-06-02 12:22:15,009 INFO
> org.apache.hadoop.hdfs.server.datanode.DataBlockScanner: Verification
> succeeded for blk_-3886449064876654030_2199
> 2009-06-02 12:24:01,494 INFO
> org.apache.hadoop.hdfs.server.datanode.DataNode: Exception in receiveBlock
> for block blk_-2411001974305504436_2549 java.net.SocketException:
> Connection
> reset
> 2009-06-02 12:24:01,495 INFO
> org.apache.hadoop.hdfs.server.datanode.DataNode: PacketResponder 0 for
> block
> blk_-2411001974305504436_2549 Interrupted.
> 2009-06-02 12:24:01,495 INFO
> org.apache.hadoop.hdfs.server.datanode.DataNode: PacketResponder 0 for
> block
> blk_-2411001974305504436_2549 terminating
> 2009-06-02 12:24:01,495 INFO
> org.apache.hadoop.hdfs.server.datanode.DataNode: writeBlock
> blk_-2411001974305504436_2549 received exception java.net.SocketException:
> Connection reset
> 2009-06-02 12:24:01,495 ERROR
> org.apache.hadoop.hdfs.server.datanode.DataNode:
> DatanodeRegistration(192.168.0.175:50010,
> storageID=DS-661194258-192.168.0.175-50010-1243890204046, infoPort=50075,
> ipcPort=50020):DataXceiver
> java.net.SocketException: Connection reset
>        at java.net.SocketInputStream.read(SocketInputStream.java:168)
>        at java.io.BufferedInputStream.fill(BufferedInputStream.java:218)
>        at java.io.BufferedInputStream.read1(BufferedInputStream.java:258)
>        at java.io.BufferedInputStream.read(BufferedInputStream.java:317)
>        at java.io.DataInputStream.read(DataInputStream.java:132)
>        at
>
> org.apache.hadoop.hdfs.server.datanode.BlockReceiver.readToBuf(BlockReceiver.java:261)
>        at
>
> org.apache.hadoop.hdfs.server.datanode.BlockReceiver.readNextPacket(BlockReceiver.java:351)
>        at
>
> org.apache.hadoop.hdfs.server.datanode.BlockReceiver.receivePacket(BlockReceiver.java:372)
>        at
>
> org.apache.hadoop.hdfs.server.datanode.BlockReceiver.receiveBlock(BlockReceiver.java:524)
>        at
>
> org.apache.hadoop.hdfs.server.datanode.DataXceiver.writeBlock(DataXceiver.java:357)
>        at
>
> org.apache.hadoop.hdfs.server.datanode.DataXceiver.run(DataXceiver.java:103)
>        at java.lang.Thread.run(Thread.java:619)
> 2009-06-02 12:24:01,501 WARN
> org.apache.hadoop.hdfs.server.protocol.InterDatanodeProtocol: Failed to
> updateBlock (newblock=blk_-2411001974305504436_2550,
> datanode=192.168.0.179:50010)
> org.apache.hadoop.ipc.RemoteException: java.io.IOException: Block
> blk_-2411001974305504436_2550 length is 3174400 does not match block file
> length 3511296
>        at
>
> org.apache.hadoop.hdfs.server.datanode.FSDataset.validateBlockMetadata(FSDataset.java:1259)
>        at
>
> org.apache.hadoop.hdfs.server.datanode.FSDataset.tryUpdateBlock(FSDataset.java:898)
>        at
>
> org.apache.hadoop.hdfs.server.datanode.FSDataset.updateBlock(FSDataset.java:810)
>        at
>
> org.apache.hadoop.hdfs.server.datanode.DataNode.updateBlock(DataNode.java:1381)
>        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>        at
>
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
>        at
>
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
>        at java.lang.reflect.Method.invoke(Method.java:597)
>        at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:481)
>        at org.apache.hadoop.ipc.Server$Handler.run(Server.java:894)
>
>        at org.apache.hadoop.ipc.Client.call(Client.java:697)
>        at org.apache.hadoop.ipc.RPC$Invoker.invoke(RPC.java:216)
>        at $Proxy6.updateBlock(Unknown Source)
>        at
>
> org.apache.hadoop.hdfs.server.datanode.DataNode.syncBlock(DataNode.java:1510)
>        at
>
> org.apache.hadoop.hdfs.server.datanode.DataNode.recoverBlock(DataNode.java:1479)
>        at
>
> org.apache.hadoop.hdfs.server.datanode.DataNode.recoverBlock(DataNode.java:1545)
>        at sun.reflect.GeneratedMethodAccessor3.invoke(Unknown Source)
>        at
>
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
>        at java.lang.reflect.Method.invoke(Method.java:597)
>        at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:481)
>        at org.apache.hadoop.ipc.Server$Handler.run(Server.java:894)
> 2009-06-02 12:24:01,505 INFO
> org.apache.hadoop.hdfs.server.datanode.DataNode:
> oldblock=blk_-2411001974305504436_2549(length=3186176),
> newblock=blk_-2411001974305504436_2550(length=3174400),
> datanode=192.168.0.175:50010
> 2009-06-02 12:24:01,596 INFO
> org.apache.hadoop.hdfs.server.datanode.DataNode: Receiving block
> blk_-2411001974305504436_2550 src: /192.168.0.193:53232 dest:
> /192.168.0.175:50010
> 2009-06-02 12:24:01,596 INFO
> org.apache.hadoop.hdfs.server.datanode.DataNode: Reopen already-open Block
> for append blk_-2411001974305504436_2550
> 2009-06-02 12:24:01,615 INFO
> org.apache.hadoop.hdfs.server.datanode.DataNode: Changing block file offset
> of block blk_-2411001974305504436_2550 from 0 to 3121152 meta file offset
> to
> 24391
> 2009-06-02 12:24:11,234 INFO
> org.apache.hadoop.hdfs.server.datanode.DataNode.clienttrace: src:
> /192.168.0.193:53232, dest: /192.168.0.175:50010, bytes: 18848146, op:
> HDFS_WRITE, cliID: DFSClient_400382585, srvID:
> DS-661194258-192.168.0.175-50010-1243890204046, blockid:
> blk_-2411001974305504436_2550
> 2009-06-02 12:24:11,234 INFO
> org.apache.hadoop.hdfs.server.datanode.DataNode: PacketResponder 0 for
> block
> blk_-2411001974305504436_2550 terminating
> 2009-06-02 12:24:11,502 INFO
> org.apache.hadoop.hdfs.server.datanode.DataNode.clienttrace: src:
> /192.168.0.179:37758, dest: /192.168.0.175:50010, bytes: 398267, op:
> HDFS_WRITE, cliID: DFSClient_400382585, srvID:
> DS-661194258-192.168.0.175-50010-1243890204046, blockid:
> blk_-5221330074279244492_2551
> 2009-06-02 12:24:11,503 INFO
> org.apache.hadoop.hdfs.server.datanode.DataNode: PacketResponder 0 for
> block
> blk_-5221330074279244492_2551 terminating
> 2009-06-02 12:24:14,656 INFO
> org.apache.hadoop.hdfs.server.datanode.DataNode.clienttrace: src:
> /192.168.0.175:50010, dest: /192.168.0.191:57293, bytes: 27618, op:
> HDFS_READ, cliID: DFSClient_679571661, srvID:
> DS-661194258-192.168.0.175-50010-1243890204046, blockid:
> blk_1104353063740638283_2506
> 2009-06-02 12:24:15,307 INFO
> org.apache.hadoop.hdfs.server.datanode.DataNode: Receiving block
> blk_6459724563472879206_2553 src: /192.168.0.191:57304 dest:
> /192.168.0.175:50010
> 2009-06-02 12:24:22,967 INFO
> org.apache.hadoop.hdfs.server.datanode.DataNode: Receiving block
> blk_-4146349943716507582_2556 src: /192.168.0.191:57319 dest:
> /192.168.0.175:50010
> 2009-06-02 12:24:22,971 INFO
> org.apache.hadoop.hdfs.server.datanode.DataNode.clienttrace: src:
> /192.168.0.191:57319, dest: /192.168.0.175:50010, bytes: 9, op:
> HDFS_WRITE,
> cliID: DFSClient_679571661, srvID:
> DS-661194258-192.168.0.175-50010-1243890204046, blockid:
> blk_-4146349943716507582_2556
> 2009-06-02 12:24:22,971 INFO
> org.apache.hadoop.hdfs.server.datanode.DataNode: PacketResponder 0 for
> block
> blk_-4146349943716507582_2556 terminating
> 2009-06-02 12:24:24,141 INFO
> org.apache.hadoop.hdfs.server.datanode.DataNode.clienttrace: src:
> /192.168.0.191:57304, dest: /192.168.0.175:50010, bytes: 67108864, op:
> HDFS_WRITE, cliID: DFSClient_679571661, srvID:
> DS-661194258-192.168.0.175-50010-1243890204046, blockid:
> blk_6459724563472879206_2553
> 2009-06-02 12:24:24,141 INFO
> org.apache.hadoop.hdfs.server.datanode.DataNode: PacketResponder 0 for
> block
> blk_6459724563472879206_2553 terminating
> 2009-06-02 12:24:30,999 INFO
> org.apache.hadoop.hdfs.server.datanode.DataNode.clienttrace: src:
> /192.168.0.175:50010, dest: /192.168.0.191:57292, bytes: 148608, op:
> HDFS_READ, cliID: DFSClient_679571661, srvID:
> DS-661194258-192.168.0.175-50010-1243890204046, blockid:
> blk_3906931575379674856_2506
> 2009-06-02 12:24:31,415 INFO
> org.apache.hadoop.hdfs.server.datanode.DataNode: Receiving block
> blk_-4614683533809303164_2559 src: /192.168.0.191:57345 dest:
> /192.168.0.175:50010
> 2009-06-02 12:24:38,554 INFO
> org.apache.hadoop.hdfs.server.datanode.DataNode:
> DatanodeRegistration(192.168.0.175:50010,
> storageID=DS-661194258-192.168.0.175-50010-1243890204046, infoPort=50075,
> ipcPort=50020) Starting thread to transfer block
> blk_-2411001974305504436_2550 to 192.168.0.179:50010, 192.168.0.196:50010
> 2009-06-02 12:24:38,560 WARN
> org.apache.hadoop.hdfs.server.datanode.DataNode:
> DatanodeRegistration(192.168.0.175:50010,
> storageID=DS-661194258-192.168.0.175-50010-1243890204046, infoPort=50075,
> ipcPort=50020):Failed to transfer blk_-2411001974305504436_2550 to
> 192.168.0.179:50010 got java.net.SocketException: Connection reset
>        at
> java.net.SocketOutputStream.socketWrite(SocketOutputStream.java:96)
>        at java.net.SocketOutputStream.write(SocketOutputStream.java:136)
>        at java.io.BufferedOutputStream.write(BufferedOutputStream.java:105)
>        at java.io.DataOutputStream.write(DataOutputStream.java:90)
>        at
>
> org.apache.hadoop.hdfs.server.datanode.BlockSender.sendChunks(BlockSender.java:319)
>        at
>
> org.apache.hadoop.hdfs.server.datanode.BlockSender.sendBlock(BlockSender.java:400)
>        at
>
> org.apache.hadoop.hdfs.server.datanode.DataNode$DataTransfer.run(DataNode.java:1108)
>        at java.lang.Thread.run(Thread.java:619)
>
> 2009-06-02 12:24:47,562 INFO
> org.apache.hadoop.hdfs.server.datanode.DataNode: Deleting block
> blk_-8510344733259320018_2551 file
> /data/hadoop-datastore/current/blk_-8510344733259320018
> 2009-06-02 12:24:47,562 INFO
> org.apache.hadoop.hdfs.server.datanode.DataNode: Deleting block
> blk_-5221330074279244492_2551 file
> /data/hadoop-datastore/current/blk_-5221330074279244492
> 2009-06-02 12:24:51,350 INFO
> org.apache.hadoop.hdfs.server.datanode.DataNode: Client calls
> recoverBlock(block=blk_-4614683533809303164_2559,
> targets=[192.168.0.175:50010])
> 2009-06-02 12:24:51,353 INFO
> org.apache.hadoop.hdfs.server.datanode.DataNode:
> oldblock=blk_-4614683533809303164_2559(length=65804288),
> newblock=blk_-4614683533809303164_2560(length=65804288),
> datanode=192.168.0.175:50010
> 2009-06-02 12:25:04,002 INFO
> org.apache.hadoop.hdfs.server.datanode.DataNode: Exception in receiveBlock
> for block blk_-4614683533809303164_2559
> java.nio.channels.ClosedByInterruptException
> 2009-06-02 12:25:04,003 INFO
> org.apache.hadoop.hdfs.server.datanode.DataNode: PacketResponder 0 for
> block
> blk_-4614683533809303164_2559 Interrupted.
> 2009-06-02 12:25:04,003 INFO
> org.apache.hadoop.hdfs.server.datanode.DataNode: PacketResponder 0 for
> block
> blk_-4614683533809303164_2559 terminating
> 2009-06-02 12:25:04,003 INFO
> org.apache.hadoop.hdfs.server.datanode.DataNode: writeBlock
> blk_-4614683533809303164_2559 received exception
> java.nio.channels.ClosedByInterruptException
> 2009-06-02 12:25:04,003 ERROR
> org.apache.hadoop.hdfs.server.datanode.DataNode:
> DatanodeRegistration(192.168.0.175:50010,
> storageID=DS-661194258-192.168.0.175-50010-1243890204046, infoPort=50075,
> ipcPort=50020):DataXceiver
> java.nio.channels.ClosedByInterruptException
>        at
>
> java.nio.channels.spi.AbstractInterruptibleChannel.end(AbstractInterruptibleChannel.java:184)
>        at sun.nio.ch.FileChannelImpl.position(FileChannelImpl.java:271)
>        at
>
> org.apache.hadoop.hdfs.server.datanode.FSDataset.getChannelPosition(FSDataset.java:1090)
>        at
>
> org.apache.hadoop.hdfs.server.datanode.BlockReceiver.setBlockPosition(BlockReceiver.java:604)
>        at
>
> org.apache.hadoop.hdfs.server.datanode.BlockReceiver.receivePacket(BlockReceiver.java:396)
>        at
>
> org.apache.hadoop.hdfs.server.datanode.BlockReceiver.receiveBlock(BlockReceiver.java:524)
>        at
>
> org.apache.hadoop.hdfs.server.datanode.DataXceiver.writeBlock(DataXceiver.java:357)
>        at
>
> org.apache.hadoop.hdfs.server.datanode.DataXceiver.run(DataXceiver.java:103)
>        at java.lang.Thread.run(Thread.java:619)
> 2009-06-02 12:25:04,037 INFO
> org.apache.hadoop.hdfs.server.datanode.DataNode: Receiving block
> blk_-4614683533809303164_2560 src: /192.168.0.191:57348 dest:
> /192.168.0.175:50010
> 2009-06-02 12:25:04,037 INFO
> org.apache.hadoop.hdfs.server.datanode.DataNode: Reopen already-open Block
> for append blk_-4614683533809303164_2560
> 2009-06-02 12:25:22,924 INFO
> org.apache.hadoop.hdfs.server.datanode.DataNode: Changing block file offset
> of block blk_-4614683533809303164_2560 from 0 to 65739264 meta file offset
> to 513595
> 2009-06-02 12:25:22,928 INFO
> org.apache.hadoop.hdfs.server.datanode.DataNode: Exception in receiveBlock
> for block blk_-4614683533809303164_2560 java.net.SocketException:
> Connection
> reset
> 2009-06-02 12:25:22,928 INFO
> org.apache.hadoop.hdfs.server.datanode.DataNode: PacketResponder 0 for
> block
> blk_-4614683533809303164_2560 Interrupted.
> 2009-06-02 12:25:22,928 INFO
> org.apache.hadoop.hdfs.server.datanode.DataNode: PacketResponder 0 for
> block
> blk_-4614683533809303164_2560 terminating
> 2009-06-02 12:25:22,928 INFO
> org.apache.hadoop.hdfs.server.datanode.DataNode: writeBlock
> blk_-4614683533809303164_2560 received exception java.net.SocketException:
> Connection reset
> 2009-06-02 12:25:22,928 ERROR
> org.apache.hadoop.hdfs.server.datanode.DataNode:
> DatanodeRegistration(192.168.0.175:50010,
> storageID=DS-661194258-192.168.0.175-50010-1243890204046, infoPort=50075,
> ipcPort=50020):DataXceiver
> java.net.SocketException: Connection reset
>        at java.net.SocketInputStream.read(SocketInputStream.java:168)
>        at java.io.BufferedInputStream.read1(BufferedInputStream.java:256)
>        at java.io.BufferedInputStream.read(BufferedInputStream.java:317)
>        at java.io.DataInputStream.read(DataInputStream.java:132)
>        at
>
> org.apache.hadoop.hdfs.server.datanode.BlockReceiver.readToBuf(BlockReceiver.java:261)
>        at
>
> org.apache.hadoop.hdfs.server.datanode.BlockReceiver.readNextPacket(BlockReceiver.java:351)
>        at
>
> org.apache.hadoop.hdfs.server.datanode.BlockReceiver.receivePacket(BlockReceiver.java:372)
>        at
>
> org.apache.hadoop.hdfs.server.datanode.BlockReceiver.receiveBlock(BlockReceiver.java:524)
>        at
>
> org.apache.hadoop.hdfs.server.datanode.DataXceiver.writeBlock(DataXceiver.java:357)
>        at
>
> org.apache.hadoop.hdfs.server.datanode.DataXceiver.run(DataXceiver.java:103)
>        at java.lang.Thread.run(Thread.java:619)
> 2009-06-02 12:32:20,289 INFO
> org.apache.hadoop.hdfs.server.datanode.DataBlockScanner: Verification
> succeeded for blk_-4146349943716507582_2556
> 2009-06-02 12:35:42,249 INFO
> org.apache.hadoop.hdfs.server.datanode.DataBlockScanner: Verification
> succeeded for blk_6405527929673442526_2329
> 2009-06-02 12:41:01,502 INFO
> org.apache.hadoop.hdfs.server.datanode.DataBlockScanner: Verification
> succeeded for blk_8099366148547328126_2340
>
> =====================================================
>
>
>
> Andrew Purtell-2 wrote:
> >
> > Thanks for the system info. CPU and RAM resources look good.
> >
> > You are not able to insert enough data into HBase to trigger splits (or
> > they are delayed due to intense write activity), because the HDFS layer
> > underneath is not able to keep up. If you have already tried the HDFS
> > related suggestions on the HBase troubleshooting page
> > (http://wiki.apache.org/hadoop/Hbase/Troubleshooting) and are still having
> > problems, adding additional HDFS data nodes to spread the load may help,
> > but it depends on where the bottleneck is. What network do you have
> > connecting these boxes? Gigabit Ethernet? Fast (100 megabit) Ethernet?
> >
>
>
> Andrew Purtell-2 wrote:
> >
> > Something else to try here is forcing splits early in the upload process.
> > You can use the hbase shell or the HBaseAdmin client API to signal the
> > master to ask the regionserver(s) hosting a table to split it regardless
> > of whether the stores have reached the split threshold or not. This type
> > of split request is advisory, and will not happen if there is no data in a
> > region or if it has just split already and a compaction is pending. The
> > strategy here is to actually use all of the nodes you have deployed, or,
> > if you add more, to use those as well. Force enough splits so there is at
> > least one region of the table hosted by a region server on every node. It
> > could be something like:
> >
> >     HTable table = ...;
> >     HBaseAdmin admin = ...;
> >     int nodes = getAvailableNodes();
> >     while (table.getStartKeys().length < nodes) {
> >         // upload 100 rows
> >         // ...
> >         admin.split(table.getTableName());
> >         // give some time for things to settle down
> >         Thread.sleep(30 * 1000); // or 60 * 1000
> >     }
> >     // upload the rest
> >     // ...
> >
> > There is no HBase API to stand in for getAvailableNodes() -- yet... I will
> > make one now; that seems useful -- but if you have co-deployed MapReduce
> > with your region servers, you could use JobClient.getClusterStatus() to
> > programmatically determine the size of the cluster. See
> >
> > http://hadoop.apache.org/core/docs/current/api/org/apache/hadoop/mapred/JobClient.html#getClusterStatus()
> >
> >
> I would like to try this, but I'm not running map/reduce yet.
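> A rough sketch of what that loop might look like against the 0.19 client
> API, with the node count hard-coded for now since MapReduce isn't running
> here (the table name, column, row keys, batch size, and node count below
> are just placeholders):
>
>     import org.apache.hadoop.hbase.HBaseConfiguration;
>     import org.apache.hadoop.hbase.client.HBaseAdmin;
>     import org.apache.hadoop.hbase.client.HTable;
>     import org.apache.hadoop.hbase.io.BatchUpdate;
>
>     public class ForcedSplitUpload {
>         public static void main(String[] args) throws Exception {
>             HBaseConfiguration conf = new HBaseConfiguration();
>             HTable table = new HTable(conf, "tableA");  // placeholder table
>             HBaseAdmin admin = new HBaseAdmin(conf);
>             int nodes = 8;  // hard-coded region server count;
>                             // JobClient.getClusterStatus() could supply
>                             // this once MapReduce is co-deployed
>             int batch = 0;
>             while (table.getStartKeys().length < nodes) {
>                 // upload a small batch of rows (placeholder column/values)
>                 for (int i = 0; i < 100; i++) {
>                     BatchUpdate bu =
>                         new BatchUpdate("row_" + batch + "_" + i);
>                     bu.put("link:url", "value".getBytes());
>                     table.commit(bu);
>                 }
>                 batch++;
>                 // ask the master to split the table, then let it settle
>                 admin.split(table.getTableName());
>                 Thread.sleep(60 * 1000);
>             }
>             // then upload the rest as usual
>         }
>     }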
> --
> View this message in context:
> http://www.nabble.com/HBase-looses-regions.-tp23657983p23839676.html
> Sent from the HBase User mailing list archive at Nabble.com.
>
>
