Thank you for your reply. I am a novice, and I do not quite understand how to check whether scanning dfs.data.dir takes more than eight minutes. My dfs.data.dir setting is:

    <property>
      <name>dfs.data.dir</name>
      <value>/b,/c,/d,/e,/f,/g,/h,/i,/j,/k,/l</value>
    </property>

Each of these directories is a separate disk mount.
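In concrete terms, the check jason suggests below can be as simple as timing a full directory walk over the configured mounts. This is only a sketch, assuming the eleven mount points from the configuration above:

    # Time a full walk of every dfs.data.dir mount; if this takes
    # anywhere near 8-10 minutes, the block report is the likely culprit.
    time find /b /c /d /e /f /g /h /i /j /k /l -type f > /dev/null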
2009/7/17 jason hadoop <[email protected]>

> The most likely problem is that the block report is taking more than 10
> minutes. Because of where the sync blocks sit in the core DataNode code,
> the block report locks out the heartbeat. This can cause the namenode to
> think the datanode has vanished.
>
> A simple way to check is to run a find over the directory set specified
> for dfs.data.dir. If this find takes more than 8 minutes or so, you are
> in trouble.
>
> The only solutions are to add more datanodes (reducing the block count
> per node) or to increase your system I/O speed so that the block report
> can complete in time.
>
> On Fri, Jul 17, 2009 at 6:12 AM, mingyang <[email protected]> wrote:
>
> > I am using Hadoop to store my media files, but once the number of files
> > passes one million, my datanode goes down about 10-20 minutes after
> > Hadoop starts. The namenode log reports a lost heartbeat, yet the
> > datanode looks normal to me: port 50010 accepts a telnet connection,
> > and jps shows the DataNode process still running. From that point on,
> > however, I can no longer put data into Hadoop, so I suspect the
> > datanode service is dead. Does Hadoop not support more than one million
> > files? Which parameters should I adjust? I have already raised the
> > open-file limit to 65535.
> >
> > namenode log:
> >
> > 2009-07-17 18:14:29,330 INFO org.apache.hadoop.hdfs.server.namenode.FSNamesystem.audit: ugi=root,root,bin,daemon,sys,adm,disk,wheel ip=/192.168.1.96 cmd=setPermission src=/hadoop/tmp/mapred/system/jobtracker.info dst=null perm=root:supergroup:rw-------
> > 2009-07-17 18:14:29,336 INFO org.apache.hadoop.hdfs.StateChange: BLOCK* NameSystem.allocateBlock: /hadoop/tmp/mapred/system/jobtracker.info. blk_-2148480138731090754_1403179
> > 2009-07-17 18:14:32,958 INFO org.apache.hadoop.hdfs.StateChange: BLOCK* NameSystem.addStoredBlock: blockMap updated: 192.168.1.97:50010 is added to blk_-2148480138731090754_1403179 size 4
> > 2009-07-17 18:14:33,340 INFO org.apache.hadoop.hdfs.StateChange: DIR* NameSystem.completeFile: file /hadoop/tmp/mapred/system/jobtracker.info is closed by DFSClient_1037557306
> > 2009-07-17 18:16:21,349 INFO org.apache.hadoop.hdfs.server.namenode.FSNamesystem: Roll Edit Log from 192.168.1.96
> > 2009-07-17 18:16:21,349 INFO org.apache.hadoop.hdfs.server.namenode.FSNamesystem: Number of transactions: 7 Total time for transactions(ms): 1 Number of transactions batched in Syncs: 1 Number of syncs: 6 SyncTimes(ms): 9
> > 2009-07-17 18:17:12,171 INFO org.apache.hadoop.hdfs.server.namenode.FSNamesystem: Roll FSImage from 192.168.1.96
> > 2009-07-17 18:17:12,171 INFO org.apache.hadoop.hdfs.server.namenode.FSNamesystem: Number of transactions: 0 Total time for transactions(ms): 0 Number of transactions batched in Syncs: 0 Number of syncs: 1 SyncTimes(ms): 0
> > 2009-07-17 18:51:00,566 INFO org.apache.hadoop.hdfs.StateChange: BLOCK* NameSystem.heartbeatCheck: lost heartbeat from 192.168.1.97:50010
> > 2009-07-17 18:51:25,383 INFO org.apache.hadoop.net.NetworkTopology: Removing a node: /default-rack/192.168.1.97:50010
> > 2009-07-17 19:10:48,564 INFO org.apache.hadoop.hdfs.server.namenode.LeaseManager: Lease [Lease. Holder: DFSClient_-1624377199, pendingcreates: 69] has expired hard limit
> > 2009-07-17 19:10:48,564 INFO org.apache.hadoop.hdfs.server.namenode.FSNamesystem: Recovering lease=[Lease. Holder: DFSClient_-1624377199, pendingcreates: 69], src=/unp/01/video/B3/94/{B394EDB2-0302-34B9-5357-4904FFFEFF36}_100.unp
> >
> > datanode log:
> >
> > 2009-07-17 18:52:40,719 INFO org.apache.hadoop.hdfs.server.datanode.DataBlockScanner: Verification succeeded for blk_6641053880336411514_601647
> > 2009-07-17 18:52:12,421 INFO org.apache.hadoop.hdfs.server.datanode.DataBlockScanner: Verification succeeded for blk_-1074796653589819594_1392025
> > 2009-07-17 18:51:44,074 INFO org.apache.hadoop.hdfs.server.datanode.DataBlockScanner: Verification succeeded for blk_-6350430159380231402_155334
> > 2009-07-17 18:51:12,760 INFO org.apache.hadoop.hdfs.server.datanode.DataBlockScanner: Verification succeeded for blk_4607299987751845359_395290
> > 2009-07-17 18:50:39,977 INFO org.apache.hadoop.hdfs.server.datanode.DataBlockScanner: Verification succeeded for blk_8029183549541139011_474989
> > 2009-07-17 18:50:11,707 INFO org.apache.hadoop.hdfs.server.datanode.DataBlockScanner: Verification succeeded for blk_8468656648119049754_1065465
> > 2009-07-17 18:49:39,421 INFO org.apache.hadoop.hdfs.server.datanode.DataBlockScanner: Verification succeeded for blk_4739535659946158302_532204
> > 2009-07-17 18:49:11,213 INFO org.apache.hadoop.hdfs.server.datanode.DataBlockScanner: Verification succeeded for blk_-1495085838793109024_354553
> >
> > client (DFSClient) output:
> >
> > 09/07/17 18:02:09 WARN hdfs.DFSClient: DFSOutputStream ResponseProcessor exception for block blk_-2536746364442878375_1403164 java.net.SocketTimeoutException: 63000 millis timeout while waiting for channel to be ready for read. ch : java.nio.channels.SocketChannel[connected local=/192.168.1.94:54783 remote=/192.168.1.97:50010]
> > 09/07/17 18:02:12 INFO hdfs.DFSClient: Exception in createBlockOutputStream java.net.SocketTimeoutException: 63000 millis timeout while waiting for channel to be ready for read. ch : java.nio.channels.SocketChannel[connected local=/192.168.1.94:54790 remote=/192.168.1.97:50010]
> > 09/07/17 18:02:12 INFO hdfs.DFSClient: Exception in createBlockOutputStream java.net.SocketTimeoutException: 63000 millis timeout while waiting for channel to be ready for read. ch : java.nio.channels.SocketChannel[connected local=/192.168.1.94:54791 remote=/192.168.1.97:50010]
>
> --
> Pro Hadoop, a book to guide you from beginner to hadoop mastery,
> http://www.amazon.com/dp/1430219424?tag=jewlerymall
> www.prohadoopbook.com, a community for Hadoop Professionals

--
Regards,
王明阳
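A rough way to see how large that block report actually is: count the block files on each data directory. This is only a sketch, and it assumes the standard DataNode on-disk layout, in which block files are named blk_<id> (with companion .meta files) somewhere under each configured directory:

    # Rough per-directory block count, assuming block files are named
    # blk_<id> and metadata files end in .meta under each dfs.data.dir.
    for d in /b /c /d /e /f /g /h /i /j /k /l; do
      n=$(find "$d" -name 'blk_*' ! -name '*.meta' | wc -l)
      echo "$d: $n blocks"
    done

One million stored files means at least one million blocks in the cluster, and every one of them shows up in some datanode's block report; that is exactly the scale at which the report starts crowding out the heartbeat, as described above.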
