Try running fsck.

On Wed, Jun 24, 2015 at 2:54 PM, Ja Sam <[email protected]> wrote:
> I had a running Hadoop cluster (version 2.2.0.2.0.6.0-76 from
> Hortonworks). Yesterday a lot of things happened, and at some point we
> decided to reboot all datanodes one by one. Unfortunately, the operator
> did not monitor the namenode health monitor.
>
> The result of the above operation is that all datanodes show as dead
> nodes, all blocks are lost, ... .
>
> We decided to reboot one datanode once more to see whether it would log
> anything interesting. The log ended with:
>
> INFO ipc.Server (Server.java:run(861)) - IPC Server Responder: starting
> INFO ipc.Server (Server.java:run(688)) - IPC Server listener on 8010: starting
>
> and it hangs there. At the same time, on the namenode I can see only two
> types of messages:
>
> INFO hdfs.StateChange (FSNamesystem.java:completeFile(2805)) - DIR*
> completeFile: [SOME PATH] is closed by DFSClient_NONMAPREDUCE_288661168_33
>
> and a lot of:
>
> WARN blockmanagement.BlockManager
> (PendingReplicationBlocks.java:pendingReplicationCheck(249)) -
> PendingReplicationMonitor timed out blk_1074405820_668233
>
> Today we decided to restart the namenode and all datanodes. After the
> restart, the web page http://[server]:50070/dfshealth.jsp responds VERY
> slowly. I don't see any errors in the logs except five like the one below:
>
> ERROR datanode.DataNode (DataXceiver.java:run(225)) -
> maelhd21:50010:DataXceiver error processing WRITE_BLOCK operation src:
> /node1:33470 dest: /node3:50010
>
> org.apache.hadoop.hdfs.server.datanode.ReplicaAlreadyExistsException:
> Block BP-1037132819-192.168.61.196-1409328081083:blk_1075994366_2257020
> already exists in state FINALIZED and thus cannot be created.
>
> 3 out of 5 nodes show as live, but refreshing the Hadoop status page
> takes more than 10 minutes.
>
> The question of course is: what should I check or do now?
>
>
> P.S. I asked the same question on StackOverflow:
> http://stackoverflow.com/questions/31020877/datanodes-are-cannot-connect-to-namenode
>
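
To make the fsck suggestion at the top of this reply concrete, here is a rough sketch of how the check could be scripted. It assumes the `hdfs` client is on the PATH of a user allowed to read the whole namespace; the helper names `build_fsck_command` and `run_fsck` are made up for illustration and are not part of Hadoop. The options used (`-files`, `-blocks`, `-locations`, `-list-corruptfileblocks`) are standard `hdfs fsck` options.

```python
import shutil
import subprocess


def build_fsck_command(path="/", list_corrupt=False):
    """Return the argv for an `hdfs fsck` run against `path` (hypothetical helper)."""
    cmd = ["hdfs", "fsck", path]
    if list_corrupt:
        # Only list the files that have corrupt or missing blocks.
        cmd.append("-list-corruptfileblocks")
    else:
        # Per-file block report, including the datanodes holding each replica.
        cmd += ["-files", "-blocks", "-locations"]
    return cmd


def run_fsck(path="/", list_corrupt=False):
    """Run fsck and return its combined output, or None if `hdfs` is not on PATH."""
    if shutil.which("hdfs") is None:
        return None
    result = subprocess.run(
        build_fsck_command(path, list_corrupt),
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,
        universal_newlines=True,
    )
    return result.stdout


if __name__ == "__main__":
    report = run_fsck("/")
    print(report if report is not None else "hdfs not found on PATH")
```

On a large namespace the full per-file report can be very long, so starting with `list_corrupt=True` (or a narrower path than `/`) may be the quicker first look.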
