What's happening on "bad datanode 192.168.0.175:50010" at the time? It's struggling badly. Appends are taking > 10 seconds.
St.Ack
On Tue, Jun 2, 2009 at 1:00 PM, llpind <[email protected]> wrote:
>
> Here are the changes I've made.
> - Moved to Gigabit ethernet
> - Upgraded to HBase 0.19.3
> - Run upload program from master box (instead of local)
> - Added 5 more nodes (now with 8 region servers and 1 master; the same boxes hold hadoop datanodes).
>
> The upload got much further (45M+ of 58M), but I still lost one region server between 45M-46M.
> Here is the HBase log from that region server:
> =====================================================
> 2009-06-02 11:21:24,334 INFO org.apache.hadoop.hbase.regionserver.HRegion: starting compaction on region tableA,row_keyA,1243964035709
> 2009-06-02 11:21:24,336 DEBUG org.apache.hadoop.hbase.regionserver.HStore: 1008565632/entity: no store files to compact
> 2009-06-02 11:21:24,346 DEBUG org.apache.hadoop.hbase.regionserver.HStore: Compaction size of 1008565632/link: 361.0m; Skipped 2 file(s), size: 337039946
> 2009-06-02 11:21:24,392 INFO org.apache.hadoop.hbase.regionserver.HLog: Closed hdfs://ats181:54310/hbase/log_192.168.0.175_1243890243326_60020/hlog.dat.1243966872067, entries=115001. New log writer: /hbase/log_192.168.0.175_1243890243326_60020/hlog.dat.1243966884387
> 2009-06-02 11:21:24,392 DEBUG org.apache.hadoop.hbase.regionserver.HLog: Last sequence written is empty. Deleting all old hlogs
> 2009-06-02 11:21:24,392 INFO org.apache.hadoop.hbase.regionserver.HLog: removing old log file /hbase/log_192.168.0.175_1243890243326_60020/hlog.dat.1243966854582 whose highest sequence/edit id is 26714173
> 2009-06-02 11:21:24,395 INFO org.apache.hadoop.hbase.regionserver.HLog: removing old log file /hbase/log_192.168.0.175_1243890243326_60020/hlog.dat.1243966859798 whose highest sequence/edit id is 26819173
> 2009-06-02 11:21:24,401 INFO org.apache.hadoop.hbase.regionserver.HLog: removing old log file /hbase/log_192.168.0.175_1243890243326_60020/hlog.dat.1243966867599 whose highest sequence/edit id is 26924173
> 2009-06-02 11:21:24,404 INFO org.apache.hadoop.hbase.regionserver.HLog: removing old log file /hbase/log_192.168.0.175_1243890243326_60020/hlog.dat.1243966872067 whose highest sequence/edit id is 27039174
> 2009-06-02 11:21:24,412 DEBUG org.apache.hadoop.hbase.regionserver.HStore: Started compaction of 2 file(s) into /hbase/tableA/compaction.dir/1008565632/link/mapfiles/1956913417104407709
> 2009-06-02 11:21:37,145 WARN org.apache.hadoop.hdfs.DFSClient: DataStreamer Exception: java.net.SocketTimeoutException: 10000 millis timeout while waiting for channel to be ready for write.
ch : > java.nio.channels.SocketChannel[connected local=/192.168.0.175:36139 > remote=/192.168.0.175:50010] > at > > org.apache.hadoop.net.SocketIOWithTimeout.doIO(SocketIOWithTimeout.java:162) > at > org.apache.hadoop.net.SocketOutputStream.write(SocketOutputStream.java:146) > at > org.apache.hadoop.net.SocketOutputStream.write(SocketOutputStream.java:107) > at java.io.BufferedOutputStream.write(BufferedOutputStream.java:105) > at java.io.DataOutputStream.write(DataOutputStream.java:90) > at > > org.apache.hadoop.hdfs.DFSClient$DFSOutputStream$DataStreamer.run(DFSClient.java:2209) > > 2009-06-02 11:21:37,146 WARN org.apache.hadoop.hbase.regionserver.HLog: IPC > Server handler 1 on 60020 took 10003ms appending an edit to HLog; > editcount=41936 > 2009-06-02 11:21:37,146 WARN org.apache.hadoop.hdfs.DFSClient: Error > Recovery for block blk_7404574899445503675_2363 bad datanode[0] > 192.168.0.175:50010 > 2009-06-02 11:21:37,147 WARN org.apache.hadoop.hdfs.DFSClient: Error > Recovery for block blk_7404574899445503675_2363 in pipeline > 192.168.0.175:50010, 192.168.0.194:50010: bad datanode 192.168.0.175:50010 > 2009-06-02 11:21:37,405 WARN org.apache.hadoop.hdfs.DFSClient: DataStreamer > Exception: java.net.SocketTimeoutException: 10000 millis timeout while > waiting for channel to be ready for write. ch : > java.nio.channels.SocketChannel[connected local=/192.168.0.175:36141 > remote=/192.168.0.175:50010] > at > > org.apache.hadoop.net.SocketIOWithTimeout.doIO(SocketIOWithTimeout.java:162) > at > org.apache.hadoop.net.SocketOutputStream.write(SocketOutputStream.java:146) > at > org.apache.hadoop.net.SocketOutputStream.write(SocketOutputStream.java:107) > at java.io.BufferedOutputStream.write(BufferedOutputStream.java:105) > at java.io.DataOutputStream.write(DataOutputStream.java:90) > at > > org.apache.hadoop.hdfs.DFSClient$DFSOutputStream$DataStreamer.run(DFSClient.java:2209) > > 2009-06-02 11:21:37,406 WARN org.apache.hadoop.hdfs.DFSClient: Error > Recovery for block blk_-1937654560341865886_2364 bad datanode[0] > 192.168.0.175:50010 > 2009-06-02 11:21:37,406 WARN org.apache.hadoop.hdfs.DFSClient: Error > Recovery for block blk_-1937654560341865886_2364 in pipeline > 192.168.0.175:50010, 192.168.0.195:50010: bad datanode 192.168.0.175:50010 > 2009-06-02 11:21:40,754 WARN org.apache.hadoop.hbase.regionserver.HLog: IPC > Server handler 1 on 60020 took 3608ms appending an edit to HLog; > editcount=41940 > 2009-06-02 11:21:40,806 WARN org.apache.hadoop.hbase.regionserver.HLog: > regionserver/0.0.0.0:60020.logFlusher took 11285ms optional sync to HLog; > editcount=45000 > 2009-06-02 11:21:44,692 INFO org.apache.hadoop.hbase.regionserver.HLog: > Closed > > hdfs://ats181:54310/hbase/log_192.168.0.175_1243890243326_60020/hlog.dat.1243966878324, > entries=105000. 
> New log writer: /hbase/log_192.168.0.175_1243890243326_60020/hlog.dat.1243966904686
> 2009-06-02 11:21:44,692 DEBUG org.apache.hadoop.hbase.regionserver.HLog: Found 0 logs to remove out of total 1; oldest outstanding seqnum is 27039175 from region tableA,row_keyA,1243964035709
> 2009-06-02 11:21:47,133 DEBUG org.apache.hadoop.hbase.regionserver.HStore: moving /hbase/tableA/compaction.dir/1008565632/link/mapfiles/1956913417104407709 to /hbase/tableA/1008565632/link/mapfiles/8666639134401806411
> 2009-06-02 11:21:47,228 DEBUG org.apache.hadoop.hbase.regionserver.HStore: Completed compaction of 1008565632/link store size is 360.5m
> 2009-06-02 11:21:47,244 INFO org.apache.hadoop.hbase.regionserver.HRegion: compaction completed on region tableA,row_keyA,1243964035709 in 22sec
> 2009-06-02 11:21:47,276 DEBUG org.apache.hadoop.hbase.regionserver.HRegion: Startkey and midkey are same, not splitting
> 2009-06-02 11:21:50,460 INFO org.apache.hadoop.hbase.regionserver.HLog: Closed hdfs://ats181:54310/hbase/log_192.168.0.175_1243890243326_60020/hlog.dat.1243966884387, entries=105000. New log writer: /hbase/log_192.168.0.175_1243890243326_60020/hlog.dat.1243966910456
> 2009-06-02 11:21:50,460 DEBUG org.apache.hadoop.hbase.regionserver.HLog: Found 0 logs to remove out of total 2; oldest outstanding seqnum is 27039175 from region tableA,row_keyA,1243964035709
> 2009-06-02 11:21:54,857 DEBUG org.apache.hadoop.hbase.regionserver.HRegion: Flush requested on tableA,row_keyA,1243964035709
> 2009-06-02 11:21:54,857 DEBUG org.apache.hadoop.hbase.regionserver.HRegion: Started memcache flush for region tableA,row_keyA,1243964035709. Current region memcache size 64.5m
> 2009-06-02 11:22:05,786 WARN org.apache.hadoop.hdfs.DFSClient: DataStreamer Exception: java.net.SocketTimeoutException: 10000 millis timeout while waiting for channel to be ready for write.
ch : > java.nio.channels.SocketChannel[connected local=/192.168.0.175:36162 > remote=/192.168.0.175:50010] > at > > org.apache.hadoop.net.SocketIOWithTimeout.doIO(SocketIOWithTimeout.java:162) > at > org.apache.hadoop.net.SocketOutputStream.write(SocketOutputStream.java:146) > at > org.apache.hadoop.net.SocketOutputStream.write(SocketOutputStream.java:107) > at java.io.BufferedOutputStream.write(BufferedOutputStream.java:105) > at java.io.DataOutputStream.write(DataOutputStream.java:90) > at > > org.apache.hadoop.hdfs.DFSClient$DFSOutputStream$DataStreamer.run(DFSClient.java:2209) > > 2009-06-02 11:22:05,787 WARN org.apache.hadoop.hdfs.DFSClient: Error > Recovery for block blk_7847038903223219269_2371 bad datanode[0] > 192.168.0.175:50010 > 2009-06-02 11:22:05,787 WARN org.apache.hadoop.hdfs.DFSClient: Error > Recovery for block blk_7847038903223219269_2371 in pipeline > 192.168.0.175:50010, 192.168.0.179:50010: bad datanode 192.168.0.175:50010 > 2009-06-02 11:22:12,025 INFO org.apache.hadoop.hbase.regionserver.HRegion: > Blocking updates for 'IPC Server handler 3 on 60020' on region > tableA,row_keyA,1243964035709: Memcache size 128.2m is >= than blocking > 128.0m size > 2009-06-02 11:22:52,772 WARN org.apache.hadoop.hdfs.DFSClient: Error > Recovery for block blk_7847038903223219269_2371 failed because recovery > from primary datanode 192.168.0.179:50010 failed 1 times. Will retry... > 2009-06-02 11:22:52,776 WARN org.apache.hadoop.hdfs.DFSClient: Error > Recovery for block blk_7847038903223219269_2371 bad datanode[0] > 192.168.0.175:50010 > 2009-06-02 11:22:52,776 WARN org.apache.hadoop.hdfs.DFSClient: Error > Recovery for block blk_7847038903223219269_2371 in pipeline > 192.168.0.175:50010, 192.168.0.179:50010: bad datanode 192.168.0.175:50010 > 2009-06-02 11:22:58,311 WARN org.apache.hadoop.hdfs.DFSClient: DataStreamer > Exception: java.net.SocketTimeoutException: 5000 millis timeout while > waiting for channel to be ready for write. ch : > java.nio.channels.SocketChannel[connected local=/192.168.0.175:58191 > remote=/192.168.0.179:50010] > at > > org.apache.hadoop.net.SocketIOWithTimeout.doIO(SocketIOWithTimeout.java:162) > at > org.apache.hadoop.net.SocketOutputStream.write(SocketOutputStream.java:146) > at > org.apache.hadoop.net.SocketOutputStream.write(SocketOutputStream.java:107) > at java.io.BufferedOutputStream.write(BufferedOutputStream.java:105) > at java.io.DataOutputStream.write(DataOutputStream.java:90) > at > > org.apache.hadoop.hdfs.DFSClient$DFSOutputStream$DataStreamer.run(DFSClient.java:2209) > > 2009-06-02 11:22:58,311 WARN org.apache.hadoop.hdfs.DFSClient: Error > Recovery for block blk_7847038903223219269_2373 bad datanode[0] > 192.168.0.179:50010 > 2009-06-02 11:22:58,351 FATAL > org.apache.hadoop.hbase.regionserver.MemcacheFlusher: Replay of hlog > required. Forcing server shutdown > org.apache.hadoop.hbase.DroppedSnapshotException: region: > tableA,row_keyA,1243964035709 > at > > org.apache.hadoop.hbase.regionserver.HRegion.internalFlushcache(HRegion.java:903) > at > org.apache.hadoop.hbase.regionserver.HRegion.flushcache(HRegion.java:796) > at > > org.apache.hadoop.hbase.regionserver.MemcacheFlusher.flushRegion(MemcacheFlusher.java:265) > at > > org.apache.hadoop.hbase.regionserver.MemcacheFlusher.run(MemcacheFlusher.java:148) > Caused by: java.io.IOException: All datanodes 192.168.0.179:50010 are bad. > Aborting... 
> at > > org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.processDatanodeError(DFSClient.java:2444) > at > > org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.access$1600(DFSClient.java:1996) > at > > org.apache.hadoop.hdfs.DFSClient$DFSOutputStream$DataStreamer.run(DFSClient.java:2160) > > at > > org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.access$1600(DFSClient.java:1996) > at > > org.apache.hadoop.hdfs.DFSClient$DFSOutputStream$DataStreamer.run(DFSClient.java:2160) > 2009-06-02 11:22:58,354 INFO > org.apache.hadoop.hbase.regionserver.HRegionServer: Dump of metrics: > request=0.0, regions=1, stores=2, storefiles=3, storefileIndexSize=0, > memcacheSize=128, usedHeap=172, maxHeap=2999 > 2009-06-02 11:22:58,355 INFO > org.apache.hadoop.hbase.regionserver.MemcacheFlusher: > regionserver/0.0.0.0:60020.cacheFlusher exiting > 2009-06-02 11:22:58,370 INFO org.apache.hadoop.hbase.regionserver.HLog: > Closed > > hdfs://ats181:54310/hbase/log_192.168.0.175_1243890243326_60020/hlog.dat.1243966904686, > entries=440000. New log writer: > /hbase/log_192.168.0.175_1243890243326_60020/hlog.dat.1243966978364 > 2009-06-02 11:22:58,370 DEBUG org.apache.hadoop.hbase.regionserver.HLog: > Found 0 logs to remove out of total 3; oldest outstanding seqnum is > 27039175 from region tableA,row_keyA,1243964035709 > 2009-06-02 11:22:58,370 INFO > org.apache.hadoop.hbase.regionserver.LogRoller: > LogRoller exiting. > 2009-06-02 11:23:00,839 INFO > org.apache.hadoop.hbase.regionserver.LogFlusher: > regionserver/0.0.0.0:60020.logFlusher exiting > 2009-06-02 11:23:01,347 DEBUG org.apache.hadoop.hbase.RegionHistorian: > Offlined > 2009-06-02 11:23:01,347 INFO org.apache.hadoop.ipc.HBaseServer: Stopping > server on 60020 > 2009-06-02 11:23:01,348 DEBUG org.apache.hadoop.hbase.regionserver.HRegion: > Flush requested on tableA,row_keyA,1243964035709 > 2009-06-02 11:23:01,348 INFO > org.apache.hadoop.hbase.regionserver.HRegionServer: Stopping infoServer > 2009-06-02 11:23:01,348 INFO org.apache.hadoop.ipc.HBaseServer: Stopping > IPC > Server listener on 60020 > 2009-06-02 11:23:01,348 INFO org.apache.hadoop.ipc.HBaseServer: IPC Server > handler 0 on 60020: exiting > 2009-06-02 11:23:01,348 INFO org.apache.hadoop.ipc.HBaseServer: IPC Server > handler 5 on 60020: exiting > 2009-06-02 11:23:01,348 INFO org.apache.hadoop.ipc.HBaseServer: IPC Server > handler 2 on 60020: exiting > 2009-06-02 11:23:01,348 INFO org.mortbay.util.ThreadedServer: Stopping > Acceptor ServerSocket[addr=0.0.0.0/0.0.0.0,port=0,localport=60030] > 2009-06-02 11:23:01,350 INFO org.apache.hadoop.ipc.HBaseServer: Stopping > IPC > Server Responder > 2009-06-02 11:23:01,349 INFO org.apache.hadoop.ipc.HBaseServer: IPC Server > handler 7 on 60020: exiting > 2009-06-02 11:23:01,349 INFO org.apache.hadoop.ipc.HBaseServer: IPC Server > handler 6 on 60020: exiting > 2009-06-02 11:23:01,349 INFO org.apache.hadoop.ipc.HBaseServer: IPC Server > handler 8 on 60020: exiting > 2009-06-02 11:23:01,349 INFO org.apache.hadoop.ipc.HBaseServer: IPC Server > handler 4 on 60020: exiting > 2009-06-02 11:23:01,349 INFO org.apache.hadoop.ipc.HBaseServer: IPC Server > handler 9 on 60020: exiting > 2009-06-02 11:23:01,349 INFO org.apache.hadoop.ipc.HBaseServer: IPC Server > handler 1 on 60020: exiting > 2009-06-02 11:23:01,355 INFO org.mortbay.http.SocketListener: Stopped > SocketListener on 0.0.0.0:60030 > 2009-06-02 11:23:02,531 INFO org.mortbay.util.Container: Stopped > HttpContext[/logs,/logs] > 2009-06-02 11:23:02,532 INFO org.mortbay.util.Container: Stopped > 
org.mortbay.jetty.servlet.webapplicationhand...@79884a40 > 2009-06-02 11:23:03,294 INFO org.mortbay.util.Container: Stopped > WebApplicationContext[/static,/static] > 2009-06-02 11:23:03,295 INFO org.mortbay.util.Container: Stopped > org.mortbay.jetty.servlet.webapplicationhand...@794e113b > 2009-06-02 11:23:04,438 INFO org.mortbay.util.Container: Stopped > WebApplicationContext[/,/] > 2009-06-02 11:23:04,438 INFO org.mortbay.util.Container: Stopped > org.mortbay.jetty.ser...@4e4ee70b > 2009-06-02 11:23:04,438 DEBUG org.apache.hadoop.hbase.regionserver.HLog: > closing log writer in > hdfs://ats181:54310/hbase/log_192.168.0.175_1243890243326_60020 > 2009-06-02 11:23:04,439 INFO > org.apache.hadoop.hbase.regionserver.CompactSplitThread: > regionserver/0.0.0.0:60020.compactor exiting > 2009-06-02 11:23:04,439 INFO > org.apache.hadoop.hbase.regionserver.HRegionServer$MajorCompactionChecker: > regionserver/0.0.0.0:60020.majorCompactionChecker exiting > 2009-06-02 11:23:04,814 INFO > org.apache.hadoop.hbase.regionserver.HRegionServer: On abort, closed hlog > 2009-06-02 11:23:04,815 DEBUG > org.apache.hadoop.hbase.regionserver.HRegionServer: closing region > tableA,row_keyA,1243964035709 > 2009-06-02 11:23:04,815 DEBUG org.apache.hadoop.hbase.regionserver.HRegion: > Closing tableA,row_keyA,1243964035709: compactions & flushes disabled > 2009-06-02 11:23:04,815 DEBUG org.apache.hadoop.hbase.regionserver.HRegion: > Updates disabled for region, no outstanding scanners on > tableA,row_keyA,1243964035709 > 2009-06-02 11:23:04,815 DEBUG org.apache.hadoop.hbase.regionserver.HRegion: > No more row locks outstanding on region tableA,row_keyA,1243964035709 > 2009-06-02 11:23:04,815 DEBUG org.apache.hadoop.hbase.regionserver.HStore: > closed 1008565632/entity > 2009-06-02 11:23:04,815 DEBUG org.apache.hadoop.hbase.regionserver.HStore: > closed 1008565632/link > 2009-06-02 11:23:04,815 INFO org.apache.hadoop.hbase.regionserver.HRegion: > Closed tableA,row_keyA,1243964035709 > 2009-06-02 11:23:05,362 INFO > org.apache.hadoop.hbase.regionserver.HRegionServer: aborting server at: > 192.168.0.175:60020 > 2009-06-02 11:23:07,463 INFO > org.apache.hadoop.hbase.regionserver.HRegionServer: worker thread exiting > 2009-06-02 11:23:07,567 INFO > org.apache.hadoop.hbase.regionserver.HRegionServer: > regionserver/0.0.0.0:60020 exiting > 2009-06-02 11:23:08,511 INFO org.apache.hadoop.hbase.Leases: > regionserver/0.0.0.0:60020.leaseChecker closing leases > 2009-06-02 11:23:08,511 INFO org.apache.hadoop.hbase.Leases: > regionserver/0.0.0.0:60020.leaseChecker closed leases > 2009-06-02 11:23:08,512 INFO > org.apache.hadoop.hbase.regionserver.HRegionServer: Starting shutdown > thread. 
> 2009-06-02 11:23:08,513 INFO > org.apache.hadoop.hbase.regionserver.HRegionServer: Shutdown thread > complete > > ===================================================== > > > > Here is the log from the datanode: > > > ===================================================== > 2009-06-02 12:14:27,084 INFO > org.apache.hadoop.hdfs.server.datanode.DataNode.clienttrace: src: > /192.168.0.193:59337, dest: /192.168.0.175:50010, bytes: 9, op: > HDFS_WRITE, > cliID: DFSClient_632050003, srvID: > DS-661194258-192.168.0.175-50010-1243890204046, blockid: > blk_-1486458067537042098_2495 > 2009-06-02 12:14:27,085 INFO > org.apache.hadoop.hdfs.server.datanode.DataNode: PacketResponder 0 for > block > blk_-1486458067537042098_2495 terminating > 2009-06-02 12:14:34,702 INFO > org.apache.hadoop.hdfs.server.datanode.DataNode: Deleting block > blk_-428154950044527208_2466 file > /data/hadoop-datastore/current/blk_-428154950044527208 > 2009-06-02 12:14:40,271 INFO > org.apache.hadoop.hdfs.server.datanode.DataNode: Receiving block > blk_-3632713526855093302_2498 src: /192.168.0.193:59345 dest: > /192.168.0.175:50010 > 2009-06-02 12:14:41,230 INFO > org.apache.hadoop.hdfs.server.datanode.DataNode.clienttrace: src: > /192.168.0.193:59345, dest: /192.168.0.175:50010, bytes: 20739360, op: > HDFS_WRITE, cliID: DFSClient_632050003, srvID: > DS-661194258-192.168.0.175-50010-1243890204046, blockid: > blk_-3632713526855093302_2498 > 2009-06-02 12:14:41,231 INFO > org.apache.hadoop.hdfs.server.datanode.DataNode: PacketResponder 0 for > block > blk_-3632713526855093302_2498 terminating > 2009-06-02 12:14:41,484 INFO > org.apache.hadoop.hdfs.server.datanode.DataNode: Receiving block > blk_2475895093255279384_2500 src: /192.168.0.193:59357 dest: > /192.168.0.175:50010 > 2009-06-02 12:14:43,727 INFO > org.apache.hadoop.hdfs.server.datanode.DataNode: Deleting block > blk_-5066455597866254073_2486 file > /data/hadoop-datastore/current/blk_-5066455597866254073 > 2009-06-02 12:14:54,199 INFO > org.apache.hadoop.hdfs.server.datanode.DataNode.clienttrace: src: > /192.168.0.193:59357, dest: /192.168.0.175:50010, bytes: 10893600, op: > HDFS_WRITE, cliID: DFSClient_632050003, srvID: > DS-661194258-192.168.0.175-50010-1243890204046, blockid: > blk_2475895093255279384_2500 > 2009-06-02 12:14:54,200 INFO > org.apache.hadoop.hdfs.server.datanode.DataNode: PacketResponder 0 for > block > blk_2475895093255279384_2500 terminating > 2009-06-02 12:15:05,863 INFO > org.apache.hadoop.hdfs.server.datanode.DataNode: Receiving block > blk_3906931575379674856_2506 src: /192.168.0.193:59385 dest: > /192.168.0.175:50010 > 2009-06-02 12:15:06,589 INFO > org.apache.hadoop.hdfs.server.datanode.DataNode.clienttrace: src: > /192.168.0.193:59385, dest: /192.168.0.175:50010, bytes: 11819362, op: > HDFS_WRITE, cliID: DFSClient_632050003, srvID: > DS-661194258-192.168.0.175-50010-1243890204046, blockid: > blk_3906931575379674856_2506 > 2009-06-02 12:15:06,590 INFO > org.apache.hadoop.hdfs.server.datanode.DataNode: PacketResponder 0 for > block > blk_3906931575379674856_2506 terminating > 2009-06-02 12:15:06,628 INFO > org.apache.hadoop.hdfs.server.datanode.DataNode: Receiving block > blk_1104353063740638283_2506 src: /192.168.0.193:59387 dest: > /192.168.0.175:50010 > 2009-06-02 12:15:06,633 INFO > org.apache.hadoop.hdfs.server.datanode.DataNode.clienttrace: src: > /192.168.0.193:59387, dest: /192.168.0.175:50010, bytes: 27402, op: > HDFS_WRITE, cliID: DFSClient_632050003, srvID: > DS-661194258-192.168.0.175-50010-1243890204046, blockid: > 
blk_1104353063740638283_2506 > 2009-06-02 12:15:06,633 INFO > org.apache.hadoop.hdfs.server.datanode.DataNode: PacketResponder 0 for > block > blk_1104353063740638283_2506 terminating > 2009-06-02 12:15:07,191 INFO > org.apache.hadoop.hdfs.server.datanode.DataNode: Receiving block > blk_-5517132013528737139_2519 src: /192.168.0.193:59403 dest: > /192.168.0.175:50010 > 2009-06-02 12:15:07,195 INFO > org.apache.hadoop.hdfs.server.datanode.DataNode.clienttrace: src: > /192.168.0.193:59403, dest: /192.168.0.175:50010, bytes: 41, op: > HDFS_WRITE, > cliID: DFSClient_632050003, srvID: > DS-661194258-192.168.0.175-50010-1243890204046, blockid: > blk_-5517132013528737139_2519 > 2009-06-02 12:15:07,195 INFO > org.apache.hadoop.hdfs.server.datanode.DataNode: PacketResponder 0 for > block > blk_-5517132013528737139_2519 terminating > 2009-06-02 12:15:13,744 INFO > org.apache.hadoop.hdfs.server.datanode.DataNode: Deleting block > blk_-6636143128531703017_2450 file > /data/hadoop-datastore/current/blk_-6636143128531703017 > 2009-06-02 12:15:13,758 INFO > org.apache.hadoop.hdfs.server.datanode.DataNode: Deleting block > blk_-3632713526855093302_2498 file > /data/hadoop-datastore/current/blk_-3632713526855093302 > 2009-06-02 12:15:13,759 INFO > org.apache.hadoop.hdfs.server.datanode.DataNode: Deleting block > blk_-1486458067537042098_2495 file > /data/hadoop-datastore/current/blk_-1486458067537042098 > 2009-06-02 12:16:55,410 INFO > org.apache.hadoop.hdfs.server.datanode.DataNode: Receiving block > blk_-2882612487393794888_2540 src: /192.168.0.193:59521 dest: > /192.168.0.175:50010 > 2009-06-02 12:16:55,415 INFO > org.apache.hadoop.hdfs.server.datanode.DataNode.clienttrace: src: > /192.168.0.193:59521, dest: /192.168.0.175:50010, bytes: 110, op: > HDFS_WRITE, cliID: DFSClient_632050003, srvID: > DS-661194258-192.168.0.175-50010-1243890204046, blockid: > blk_-2882612487393794888_2540 > 2009-06-02 12:16:55,415 INFO > org.apache.hadoop.hdfs.server.datanode.DataNode: PacketResponder 0 for > block > blk_-2882612487393794888_2540 terminating > 2009-06-02 12:17:01,895 INFO > org.apache.hadoop.hdfs.server.datanode.DataNode: Deleting block > blk_-5517132013528737139_2519 file > /data/hadoop-datastore/current/blk_-5517132013528737139 > 2009-06-02 12:17:56,840 INFO > org.apache.hadoop.hdfs.server.datanode.DataNode: Receiving block > blk_-5045036725303448421_2546 src: /192.168.0.193:39258 dest: > /192.168.0.175:50010 > 2009-06-02 12:17:56,845 INFO > org.apache.hadoop.hdfs.server.datanode.DataNode.clienttrace: src: > /192.168.0.193:39258, dest: /192.168.0.175:50010, bytes: 117, op: > HDFS_WRITE, cliID: DFSClient_632050003, srvID: > DS-661194258-192.168.0.175-50010-1243890204046, blockid: > blk_-5045036725303448421_2546 > 2009-06-02 12:17:56,845 INFO > org.apache.hadoop.hdfs.server.datanode.DataNode: PacketResponder 0 for > block > blk_-5045036725303448421_2546 terminating > 2009-06-02 12:18:32,037 INFO > org.apache.hadoop.hdfs.server.datanode.DataNode: BlockReport of 32 blocks > got processed in 21 msecs > 2009-06-02 12:19:52,477 INFO > org.apache.hadoop.hdfs.server.datanode.DataNode.clienttrace: src: > /192.168.0.194:41274, dest: /192.168.0.175:50010, bytes: 12792946, op: > HDFS_WRITE, cliID: DFSClient_1389518311, srvID: > DS-661194258-192.168.0.175-50010-1243890204046, blockid: > blk_8099366148547328126_2340 > 2009-06-02 12:19:52,478 INFO > org.apache.hadoop.hdfs.server.datanode.DataNode: PacketResponder 0 for > block > blk_8099366148547328126_2340 terminating > 2009-06-02 12:19:52,730 INFO > 
> org.apache.hadoop.hdfs.server.datanode.DataNode.clienttrace: src: /192.168.0.175:50010, dest: /192.168.0.181:38194, bytes: 10978708, op: HDFS_READ, cliID: DFSClient_400382585, srvID: DS-661194258-192.168.0.175-50010-1243890204046, blockid: blk_2475895093255279384_2500
> 2009-06-02 12:19:58,051 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: Receiving block blk_-2411001974305504436_2549 src: /192.168.0.193:39260 dest: /192.168.0.175:50010
> 2009-06-02 12:19:58,053 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: Receiving block blk_-8510344733259320018_2549 src: /192.168.0.191:37451 dest: /192.168.0.175:50010
> 2009-06-02 12:20:02,142 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: Deleting block blk_-5045036725303448421_2546 file /data/hadoop-datastore/current/blk_-5045036725303448421
> 2009-06-02 12:20:02,150 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: Deleting block blk_2475895093255279384_2500 file /data/hadoop-datastore/current/blk_2475895093255279384
> 2009-06-02 12:20:33,895 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: Client calls recoverBlock(block=blk_-2411001974305504436_2549, targets=[192.168.0.191:50010, 192.168.0.180:50010, 192.168.0.194:50010, 192.168.0.179:50010, 192.168.0.193:50010, 192.168.0.175:50010])
> 2009-06-02 12:20:34,574 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: Client calls recoverBlock(block=blk_-8510344733259320018_2549, targets=[192.168.0.194:50010, 192.168.0.196:50010, 192.168.0.180:50010, 192.168.0.179:50010, 192.168.0.191:50010, 192.168.0.175:50010])
> 2009-06-02 12:20:47,130 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: Exception in receiveBlock for block blk_-8510344733259320018_2549 java.net.SocketException: Connection reset
> 2009-06-02 12:20:47,131 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: PacketResponder 0 for block blk_-8510344733259320018_2549 Interrupted.
> 2009-06-02 12:20:47,131 INFO > org.apache.hadoop.hdfs.server.datanode.DataNode: PacketResponder 0 for > block > blk_-8510344733259320018_2549 terminating > 2009-06-02 12:20:47,131 INFO > org.apache.hadoop.hdfs.server.datanode.DataNode: writeBlock > blk_-8510344733259320018_2549 received exception java.net.SocketException: > Connection reset > 2009-06-02 12:20:47,131 ERROR > org.apache.hadoop.hdfs.server.datanode.DataNode: > DatanodeRegistration(192.168.0.175:50010, > storageID=DS-661194258-192.168.0.175-50010-1243890204046, infoPort=50075, > ipcPort=50020):DataXceiver > java.net.SocketException: Connection reset > at java.net.SocketInputStream.read(SocketInputStream.java:168) > at java.io.BufferedInputStream.read1(BufferedInputStream.java:256) > at java.io.BufferedInputStream.read(BufferedInputStream.java:317) > at java.io.DataInputStream.read(DataInputStream.java:132) > at > > org.apache.hadoop.hdfs.server.datanode.BlockReceiver.readToBuf(BlockReceiver.java:261) > at > > org.apache.hadoop.hdfs.server.datanode.BlockReceiver.readNextPacket(BlockReceiver.java:308) > at > > org.apache.hadoop.hdfs.server.datanode.BlockReceiver.receivePacket(BlockReceiver.java:372) > at > > org.apache.hadoop.hdfs.server.datanode.BlockReceiver.receiveBlock(BlockReceiver.java:524) > at > > org.apache.hadoop.hdfs.server.datanode.DataXceiver.writeBlock(DataXceiver.java:357) > at > > org.apache.hadoop.hdfs.server.datanode.DataXceiver.run(DataXceiver.java:103) > at java.lang.Thread.run(Thread.java:619) > 2009-06-02 12:20:47,138 WARN > org.apache.hadoop.hdfs.server.protocol.InterDatanodeProtocol: Failed to > updateBlock (newblock=blk_-8510344733259320018_2551, > datanode=192.168.0.179:50010) > org.apache.hadoop.ipc.RemoteException: java.io.IOException: Block > blk_-8510344733259320018_2551 length is 11177984 does not match block file > length 11899392 > at > > org.apache.hadoop.hdfs.server.datanode.FSDataset.validateBlockMetadata(FSDataset.java:1259) > at > > org.apache.hadoop.hdfs.server.datanode.FSDataset.tryUpdateBlock(FSDataset.java:898) > at > > org.apache.hadoop.hdfs.server.datanode.FSDataset.updateBlock(FSDataset.java:810) > at > > org.apache.hadoop.hdfs.server.datanode.DataNode.updateBlock(DataNode.java:1381) > at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) > at > > sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39) > at > > sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25) > at java.lang.reflect.Method.invoke(Method.java:597) > at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:481) > at org.apache.hadoop.ipc.Server$Handler.run(Server.java:894) > > at org.apache.hadoop.ipc.Client.call(Client.java:697) > at org.apache.hadoop.ipc.RPC$Invoker.invoke(RPC.java:216) > at $Proxy6.updateBlock(Unknown Source) > at > > org.apache.hadoop.hdfs.server.datanode.DataNode.syncBlock(DataNode.java:1510) > at > > org.apache.hadoop.hdfs.server.datanode.DataNode.recoverBlock(DataNode.java:1479) > at > > org.apache.hadoop.hdfs.server.datanode.DataNode.recoverBlock(DataNode.java:1545) > at sun.reflect.GeneratedMethodAccessor3.invoke(Unknown Source) > at > > sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25) > at java.lang.reflect.Method.invoke(Method.java:597) > > at > > sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25) > at java.lang.reflect.Method.invoke(Method.java:597) > at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:481) > at 
org.apache.hadoop.ipc.Server$Handler.run(Server.java:894) > 2009-06-02 12:20:47,142 INFO > org.apache.hadoop.hdfs.server.datanode.DataNode: > oldblock=blk_-8510344733259320018_2549(length=11184128), > newblock=blk_-8510344733259320018_2551(length=11177984), > datanode=192.168.0.175:50010 > 2009-06-02 12:20:47,193 INFO > org.apache.hadoop.hdfs.server.datanode.DataNode: Receiving block > blk_-8510344733259320018_2551 src: /192.168.0.191:37454 dest: > /192.168.0.175:50010 > 2009-06-02 12:20:47,193 INFO > org.apache.hadoop.hdfs.server.datanode.DataNode: Reopen already-open Block > for append blk_-8510344733259320018_2551 > 2009-06-02 12:20:47,237 INFO > org.apache.hadoop.hdfs.server.datanode.DataNode: Changing block file offset > of block blk_-8510344733259320018_2551 from 0 to 11119104 meta file offset > to 86875 > 2009-06-02 12:20:50,326 INFO > org.apache.hadoop.hdfs.server.datanode.DataNode.clienttrace: src: > /192.168.0.191:37454, dest: /192.168.0.175:50010, bytes: 67108864, op: > HDFS_WRITE, cliID: DFSClient_400382585, srvID: > DS-661194258-192.168.0.175-50010-1243890204046, blockid: > blk_-8510344733259320018_2551 > 2009-06-02 12:20:50,534 INFO > org.apache.hadoop.hdfs.server.datanode.DataNode: PacketResponder 0 for > block > blk_-8510344733259320018_2551 terminating > 2009-06-02 12:20:50,724 INFO > org.apache.hadoop.hdfs.server.datanode.DataNode: Receiving block > blk_-5221330074279244492_2551 src: /192.168.0.179:37758 dest: > /192.168.0.175:50010 > 2009-06-02 12:22:15,009 INFO > org.apache.hadoop.hdfs.server.datanode.DataBlockScanner: Verification > succeeded for blk_-3886449064876654030_2199 > 2009-06-02 12:24:01,494 INFO > org.apache.hadoop.hdfs.server.datanode.DataNode: Exception in receiveBlock > for block blk_-2411001974305504436_2549 java.net.SocketException: > Connection > reset > 2009-06-02 12:24:01,495 INFO > org.apache.hadoop.hdfs.server.datanode.DataNode: PacketResponder 0 for > block > blk_-2411001974305504436_2549 Interrupted. 
> 2009-06-02 12:24:01,495 INFO > org.apache.hadoop.hdfs.server.datanode.DataNode: PacketResponder 0 for > block > blk_-2411001974305504436_2549 terminating > 2009-06-02 12:24:01,495 INFO > org.apache.hadoop.hdfs.server.datanode.DataNode: writeBlock > blk_-2411001974305504436_2549 received exception java.net.SocketException: > Connection reset > 2009-06-02 12:24:01,495 ERROR > org.apache.hadoop.hdfs.server.datanode.DataNode: > DatanodeRegistration(192.168.0.175:50010, > storageID=DS-661194258-192.168.0.175-50010-1243890204046, infoPort=50075, > ipcPort=50020):DataXceiver > java.net.SocketException: Connection reset > at java.net.SocketInputStream.read(SocketInputStream.java:168) > at java.io.BufferedInputStream.fill(BufferedInputStream.java:218) > at java.io.BufferedInputStream.read1(BufferedInputStream.java:258) > at java.io.BufferedInputStream.read(BufferedInputStream.java:317) > at java.io.DataInputStream.read(DataInputStream.java:132) > at > > org.apache.hadoop.hdfs.server.datanode.BlockReceiver.readToBuf(BlockReceiver.java:261) > at > > org.apache.hadoop.hdfs.server.datanode.BlockReceiver.readNextPacket(BlockReceiver.java:351) > at > > org.apache.hadoop.hdfs.server.datanode.BlockReceiver.receivePacket(BlockReceiver.java:372) > at > > org.apache.hadoop.hdfs.server.datanode.BlockReceiver.receiveBlock(BlockReceiver.java:524) > at > > org.apache.hadoop.hdfs.server.datanode.DataXceiver.writeBlock(DataXceiver.java:357) > at > > org.apache.hadoop.hdfs.server.datanode.DataXceiver.run(DataXceiver.java:103) > at java.lang.Thread.run(Thread.java:619) > 2009-06-02 12:24:01,501 WARN > org.apache.hadoop.hdfs.server.protocol.InterDatanodeProtocol: Failed to > updateBlock (newblock=blk_-2411001974305504436_2550, > datanode=192.168.0.179:50010) > org.apache.hadoop.ipc.RemoteException: java.io.IOException: Block > blk_-2411001974305504436_2550 length is 3174400 does not match block file > length 3511296 > at > > org.apache.hadoop.hdfs.server.datanode.FSDataset.validateBlockMetadata(FSDataset.java:1259) > at > > org.apache.hadoop.hdfs.server.datanode.FSDataset.tryUpdateBlock(FSDataset.java:898) > at > > org.apache.hadoop.hdfs.server.datanode.FSDataset.updateBlock(FSDataset.java:810) > at > > org.apache.hadoop.hdfs.server.datanode.DataNode.updateBlock(DataNode.java:1381) > at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) > at > > sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39) > at > > sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25) > at java.lang.reflect.Method.invoke(Method.java:597) > at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:481) > at org.apache.hadoop.ipc.Server$Handler.run(Server.java:894) > > at org.apache.hadoop.ipc.Client.call(Client.java:697) > at org.apache.hadoop.ipc.RPC$Invoker.invoke(RPC.java:216) > > at org.apache.hadoop.ipc.Client.call(Client.java:697) > at org.apache.hadoop.ipc.RPC$Invoker.invoke(RPC.java:216) > at $Proxy6.updateBlock(Unknown Source) > at > > org.apache.hadoop.hdfs.server.datanode.DataNode.syncBlock(DataNode.java:1510) > at > > org.apache.hadoop.hdfs.server.datanode.DataNode.recoverBlock(DataNode.java:1479) > at > > org.apache.hadoop.hdfs.server.datanode.DataNode.recoverBlock(DataNode.java:1545) > at sun.reflect.GeneratedMethodAccessor3.invoke(Unknown Source) > at > > sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25) > at java.lang.reflect.Method.invoke(Method.java:597) > at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:481) > at 
org.apache.hadoop.ipc.Server$Handler.run(Server.java:894) > 2009-06-02 12:24:01,505 INFO > org.apache.hadoop.hdfs.server.datanode.DataNode: > oldblock=blk_-2411001974305504436_2549(length=3186176), > newblock=blk_-2411001974305504436_2550(length=3174400), > datanode=192.168.0.175:50010 > 2009-06-02 12:24:01,596 INFO > org.apache.hadoop.hdfs.server.datanode.DataNode: Receiving block > blk_-2411001974305504436_2550 src: /192.168.0.193:53232 dest: > /192.168.0.175:50010 > 2009-06-02 12:24:01,596 INFO > org.apache.hadoop.hdfs.server.datanode.DataNode: Reopen already-open Block > for append blk_-2411001974305504436_2550 > 2009-06-02 12:24:01,615 INFO > org.apache.hadoop.hdfs.server.datanode.DataNode: Changing block file offset > of block blk_-2411001974305504436_2550 from 0 to 3121152 meta file offset > to > 24391 > 2009-06-02 12:24:11,234 INFO > org.apache.hadoop.hdfs.server.datanode.DataNode.clienttrace: src: > /192.168.0.193:53232, dest: /192.168.0.175:50010, bytes: 18848146, op: > HDFS_WRITE, cliID: DFSClient_400382585, srvID: > DS-661194258-192.168.0.175-50010-1243890204046, blockid: > blk_-2411001974305504436_2550 > 2009-06-02 12:24:11,234 INFO > org.apache.hadoop.hdfs.server.datanode.DataNode: PacketResponder 0 for > block > blk_-2411001974305504436_2550 terminating > 2009-06-02 12:24:11,502 INFO > org.apache.hadoop.hdfs.server.datanode.DataNode.clienttrace: src: > /192.168.0.179:37758, dest: /192.168.0.175:50010, bytes: 398267, op: > HDFS_WRITE, cliID: DFSClient_400382585, srvID: > DS-661194258-192.168.0.175-50010-1243890204046, blockid: > blk_-5221330074279244492_2551 > 2009-06-02 12:24:11,503 INFO > org.apache.hadoop.hdfs.server.datanode.DataNode: PacketResponder 0 for > block > blk_-5221330074279244492_2551 terminating > 2009-06-02 12:24:14,656 INFO > org.apache.hadoop.hdfs.server.datanode.DataNode.clienttrace: src: > /192.168.0.175:50010, dest: /192.168.0.191:57293, bytes: 27618, op: > HDFS_READ, cliID: DFSClient_679571661, srvID: > DS-661194258-192.168.0.175-50010-1243890204046, blockid: > blk_1104353063740638283_2506 > 2009-06-02 12:24:15,307 INFO > org.apache.hadoop.hdfs.server.datanode.DataNode: Receiving block > blk_6459724563472879206_2553 src: /192.168.0.191:57304 dest: > /192.168.0.175:50010 > 2009-06-02 12:24:22,967 INFO > org.apache.hadoop.hdfs.server.datanode.DataNode: Receiving block > blk_-4146349943716507582_2556 src: /192.168.0.191:57319 dest: > /192.168.0.175:50010 > 2009-06-02 12:24:22,971 INFO > org.apache.hadoop.hdfs.server.datanode.DataNode.clienttrace: src: > /192.168.0.191:57319, dest: /192.168.0.175:50010, bytes: 9, op: > HDFS_WRITE, > cliID: DFSClient_679571661, srvID: > DS-661194258-192.168.0.175-50010-1243890204046, blockid: > blk_-4146349943716507582_2556 > 2009-06-02 12:24:22,971 INFO > org.apache.hadoop.hdfs.server.datanode.DataNode: PacketResponder 0 for > block > blk_-4146349943716507582_2556 terminating > 2009-06-02 12:24:24,141 INFO > org.apache.hadoop.hdfs.server.datanode.DataNode.clienttrace: src: > /192.168.0.191:57304, dest: /192.168.0.175:50010, bytes: 67108864, op: > HDFS_WRITE, cliID: DFSClient_679571661, srvID: > DS-661194258-192.168.0.175-50010-1243890204046, blockid: > blk_6459724563472879206_2553 > 2009-06-02 12:24:24,141 INFO > org.apache.hadoop.hdfs.server.datanode.DataNode: PacketResponder 0 for > block > blk_6459724563472879206_2553 terminating > 2009-06-02 12:24:30,999 INFO > org.apache.hadoop.hdfs.server.datanode.DataNode.clienttrace: src: > /192.168.0.175:50010, dest: /192.168.0.191:57292, bytes: 148608, op: > HDFS_READ, cliID: 
> DFSClient_679571661, srvID: DS-661194258-192.168.0.175-50010-1243890204046, blockid: blk_3906931575379674856_2506
> 2009-06-02 12:24:31,415 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: Receiving block blk_-4614683533809303164_2559 src: /192.168.0.191:57345 dest: /192.168.0.175:50010
> 2009-06-02 12:24:38,554 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: DatanodeRegistration(192.168.0.175:50010, storageID=DS-661194258-192.168.0.175-50010-1243890204046, infoPort=50075, ipcPort=50020) Starting thread to transfer block blk_-2411001974305504436_2550 to 192.168.0.179:50010, 192.168.0.196:50010
> 2009-06-02 12:24:38,560 WARN org.apache.hadoop.hdfs.server.datanode.DataNode: DatanodeRegistration(192.168.0.175:50010, storageID=DS-661194258-192.168.0.175-50010-1243890204046, infoPort=50075, ipcPort=50020):Failed to transfer blk_-2411001974305504436_2550 to 192.168.0.179:50010 got java.net.SocketException: Connection reset
>         at java.net.SocketOutputStream.socketWrite(SocketOutputStream.java:96)
>         at java.net.SocketOutputStream.write(SocketOutputStream.java:136)
>         at java.io.BufferedOutputStream.write(BufferedOutputStream.java:105)
>         at java.io.DataOutputStream.write(DataOutputStream.java:90)
>         at org.apache.hadoop.hdfs.server.datanode.BlockSender.sendChunks(BlockSender.java:319)
>         at org.apache.hadoop.hdfs.server.datanode.BlockSender.sendBlock(BlockSender.java:400)
>         at org.apache.hadoop.hdfs.server.datanode.DataNode$DataTransfer.run(DataNode.java:1108)
>         at java.lang.Thread.run(Thread.java:619)
> 2009-06-02 12:24:47,562 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: Deleting block blk_-8510344733259320018_2551 file /data/hadoop-datastore/current/blk_-8510344733259320018
> 2009-06-02 12:24:47,562 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: Deleting block blk_-5221330074279244492_2551 file /data/hadoop-datastore/current/blk_-5221330074279244492
> 2009-06-02 12:24:51,350 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: Client calls recoverBlock(block=blk_-4614683533809303164_2559, targets=[192.168.0.175:50010])
> 2009-06-02 12:24:51,353 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: oldblock=blk_-4614683533809303164_2559(length=65804288), newblock=blk_-4614683533809303164_2560(length=65804288), datanode=192.168.0.175:50010
> 2009-06-02 12:25:04,002 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: Exception in receiveBlock for block blk_-4614683533809303164_2559 java.nio.channels.ClosedByInterruptException
> 2009-06-02 12:25:04,003 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: PacketResponder 0 for block blk_-4614683533809303164_2559 Interrupted.
> 2009-06-02 12:25:04,003 INFO > org.apache.hadoop.hdfs.server.datanode.DataNode: PacketResponder 0 for > block > blk_-4614683533809303164_2559 terminating > 2009-06-02 12:25:04,003 INFO > org.apache.hadoop.hdfs.server.datanode.DataNode: writeBlock > blk_-4614683533809303164_2559 received exception > java.nio.channels.ClosedByInterruptException > 2009-06-02 12:25:04,003 ERROR > org.apache.hadoop.hdfs.server.datanode.DataNode: > DatanodeRegistration(192.168.0.175:50010, > storageID=DS-661194258-192.168.0.175-50010-1243890204046, infoPort=50075, > ipcPort=50020):DataXceiver > java.nio.channels.ClosedByInterruptException > at > > java.nio.channels.spi.AbstractInterruptibleChannel.end(AbstractInterruptibleChannel.java:184) > at sun.nio.ch.FileChannelImpl.position(FileChannelImpl.java:271) > at > > org.apache.hadoop.hdfs.server.datanode.FSDataset.getChannelPosition(FSDataset.java:1090) > at > > org.apache.hadoop.hdfs.server.datanode.BlockReceiver.setBlockPosition(BlockReceiver.java:604) > at > > org.apache.hadoop.hdfs.server.datanode.BlockReceiver.receivePacket(BlockReceiver.java:396) > at > > org.apache.hadoop.hdfs.server.datanode.BlockReceiver.receiveBlock(BlockReceiver.java:524) > at > > org.apache.hadoop.hdfs.server.datanode.DataXceiver.writeBlock(DataXceiver.java:357) > at > > org.apache.hadoop.hdfs.server.datanode.DataXceiver.run(DataXceiver.java:103) > at java.lang.Thread.run(Thread.java:619) > 2009-06-02 12:25:04,037 INFO > org.apache.hadoop.hdfs.server.datanode.DataNode: Receiving block > blk_-4614683533809303164_2560 src: /192.168.0.191:57348 dest: > /192.168.0.175:50010 > 2009-06-02 12:25:04,037 INFO > org.apache.hadoop.hdfs.server.datanode.DataNode: Reopen already-open Block > for append blk_-4614683533809303164_2560 > 2009-06-02 12:25:22,924 INFO > org.apache.hadoop.hdfs.server.datanode.DataNode: Changing block file offset > of block blk_-4614683533809303164_2560 from 0 to 65739264 meta file offset > to 513595 > 2009-06-02 12:25:22,928 INFO > org.apache.hadoop.hdfs.server.datanode.DataNode: Exception in receiveBlock > for block blk_-4614683533809303164_2560 java.net.SocketException: > Connection > reset > 2009-06-02 12:25:22,928 INFO > org.apache.hadoop.hdfs.server.datanode.DataNode: PacketResponder 0 for > block > blk_-4614683533809303164_2560 Interrupted. 
> 2009-06-02 12:25:22,928 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: PacketResponder 0 for block blk_-4614683533809303164_2560 terminating
> 2009-06-02 12:25:22,928 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: writeBlock blk_-4614683533809303164_2560 received exception java.net.SocketException: Connection reset
> 2009-06-02 12:25:22,928 ERROR org.apache.hadoop.hdfs.server.datanode.DataNode: DatanodeRegistration(192.168.0.175:50010, storageID=DS-661194258-192.168.0.175-50010-1243890204046, infoPort=50075, ipcPort=50020):DataXceiver
> java.net.SocketException: Connection reset
>         at java.net.SocketInputStream.read(SocketInputStream.java:168)
>         at java.io.BufferedInputStream.read1(BufferedInputStream.java:256)
>         at java.io.BufferedInputStream.read(BufferedInputStream.java:317)
>         at java.io.DataInputStream.read(DataInputStream.java:132)
>         at org.apache.hadoop.hdfs.server.datanode.BlockReceiver.readToBuf(BlockReceiver.java:261)
>         at org.apache.hadoop.hdfs.server.datanode.BlockReceiver.readNextPacket(BlockReceiver.java:351)
>         at org.apache.hadoop.hdfs.server.datanode.BlockReceiver.receivePacket(BlockReceiver.java:372)
>         at org.apache.hadoop.hdfs.server.datanode.BlockReceiver.receiveBlock(BlockReceiver.java:524)
>         at org.apache.hadoop.hdfs.server.datanode.DataXceiver.writeBlock(DataXceiver.java:357)
>         at org.apache.hadoop.hdfs.server.datanode.DataXceiver.run(DataXceiver.java:103)
>         at java.lang.Thread.run(Thread.java:619)
> 2009-06-02 12:32:20,289 INFO org.apache.hadoop.hdfs.server.datanode.DataBlockScanner: Verification succeeded for blk_-4146349943716507582_2556
> 2009-06-02 12:35:42,249 INFO org.apache.hadoop.hdfs.server.datanode.DataBlockScanner: Verification succeeded for blk_6405527929673442526_2329
> 2009-06-02 12:41:01,502 INFO org.apache.hadoop.hdfs.server.datanode.DataBlockScanner: Verification succeeded for blk_8099366148547328126_2340
> =====================================================
>
>
> Andrew Purtell-2 wrote:
> >
> > Thanks for the system info. CPU and RAM resources look good.
> >
> > You are not able to insert enough data into HBase to trigger splits (or
> > they are delayed due to intense write activity), because the HDFS layer
> > underneath is not able to keep up. If you have already tried the
> > HDFS-related suggestions on the HBase troubleshooting page
> > (http://wiki.apache.org/hadoop/Hbase/Troubleshooting) and are still having
> > problems, adding additional HDFS data nodes to spread the load may help,
> > but it depends on where the bottleneck is. What network do you have
> > connecting these boxes? Gigabit Ethernet? Fast (megabit) Ethernet?
> >
>
> Andrew Purtell-2 wrote:
> >
> > Something else to try here is forcing splits early in the upload process.
> > You can use the hbase shell or the HBaseAdmin client API to signal the
> > master to ask the regionserver(s) hosting a table to split it regardless
> > of whether the stores have reached the split threshold or not. This type
> > of split request is advisory, and will not happen if there is no data in
> > a region or if it has just split already and a compaction is pending. The
> > strategy here is to actually use all of the nodes you have deployed, or,
> > if you add more, to use those as well. Force enough splits so there is at
> > least one region of the table hosted by a region server on every node. It
> > could be something like:
> >
> > HTable table = ...;
> > HBaseAdmin admin = ...;
> > int nodes = getAvailableNodes();
> > while (table.getStartKeys().length < nodes) {
> >   // upload 100 rows
> >   // ...
> >   admin.split(table.getTableName());
> >   // give some time for things to settle down
> >   Thread.sleep(30 * 1000); // or 60 * 1000
> > }
> > // upload the rest
> > // ...
> >
> > There is no HBase API to stand in for getAvailableNodes() -- yet... I will
> > make one now, that seems useful -- but if you have co-deployed mapreduce
> > with your region servers, you could use JobClient.getClusterStatus() to
> > programmatically determine the size of the cluster. See
> > http://hadoop.apache.org/core/docs/current/api/org/apache/hadoop/mapred/JobClient.html#getClusterStatus()
> >
>
> I would like to try this, but I'm not running map/reduce yet.
> --
> View this message in context:
> http://www.nabble.com/HBase-looses-regions.-tp23657983p23839676.html
> Sent from the HBase User mailing list archive at Nabble.com.
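Putting the advice above together, a minimal sketch of the forced-split upload might look like the following. It assumes the HBase 0.19-era client classes, takes the table name from the log excerpt, hard-codes the node count to the 8 region servers mentioned (since getAvailableNodes() does not exist and no MapReduce cluster is running here), and uses a hypothetical uploadBatch() placeholder for the caller's own insert logic:

  import org.apache.hadoop.hbase.HBaseConfiguration;
  import org.apache.hadoop.hbase.client.HBaseAdmin;
  import org.apache.hadoop.hbase.client.HTable;

  public class ForcedSplitUpload {
    public static void main(String[] args) throws Exception {
      HBaseConfiguration conf = new HBaseConfiguration();
      HTable table = new HTable(conf, "tableA");   // table name assumed from the log excerpt
      HBaseAdmin admin = new HBaseAdmin(conf);

      // Stand-in for getAvailableNodes(): hard-coded to the 8 region servers
      // described above. With MapReduce co-deployed, the count could instead
      // come from new JobClient(new JobConf(conf)).getClusterStatus().getTaskTrackers().
      int nodes = 8;

      // Force splits until the table has at least one region per node.
      while (table.getStartKeys().length < nodes) {
        uploadBatch(table, 100);             // hypothetical helper: insert ~100 rows
        admin.split(table.getTableName());   // advisory split request
        Thread.sleep(30 * 1000);             // let the split and any compaction settle
      }

      // ... upload the remainder of the data set here ...
    }

    // Placeholder only: the real insert loop (building batch updates and
    // committing them against the table) goes here.
    private static void uploadBatch(HTable table, int rows) {
      // application-specific insert logic
    }
  }

As Andrew notes, the split request is advisory, so the loop may need several passes before the table actually spreads to one region on every node.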
