Try running fsck
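
A rough sketch of what that could look like when scripted, in case it helps. It assumes the `hdfs` command is on the PATH of the user running it, and that the fsck report contains a line of the form "The filesystem under path '/' is HEALTHY" (or "... is CORRUPT"). The helper names `run_fsck` and `fsck_status` are made up for this example and are not part of Hadoop.

```python
import re
import subprocess

# Assumption: the fsck report contains a status line such as
#   The filesystem under path '/' is HEALTHY
# or
#   The filesystem under path '/' is CORRUPT
STATUS_RE = re.compile(r"The filesystem under path '(.+)' is (HEALTHY|CORRUPT)")


def fsck_status(report):
    """Return 'HEALTHY', 'CORRUPT', or None if no status line is found."""
    match = STATUS_RE.search(report)
    return match.group(2) if match else None


def run_fsck(path="/"):
    """Run `hdfs fsck` on the given path and return its text report.

    The -files, -blocks and -locations options make fsck list each file,
    its blocks, and the datanodes holding them, which can be slow on a
    large namespace.
    """
    result = subprocess.run(
        ["hdfs", "fsck", path, "-files", "-blocks", "-locations"],
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,
        universal_newlines=True,
    )
    return result.stdout
```

Calling `fsck_status(run_fsck("/"))` on a node with a working client configuration should then tell you whether the namenode considers the namespace healthy. The full report is worth reading as well, since it lists missing, corrupt and under-replicated blocks.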

On Wed, Jun 24, 2015 at 2:54 PM, Ja Sam <[email protected]> wrote:

> I had a running Hadoop cluster (version 2.2.0.2.0.6.0-76 from
> Hortonworks). Yesterday a lot of things happened, and at some point we
> decided to reboot all datanodes one by one. Unfortunately the operator did
> not monitor the namenode health monitor.
>
> The result of the above operation is that all datanodes show as dead nodes,
> all blocks are lost, ...
>
> We decided to reboot one datanode once again to see whether it would log
> anything interesting. The log ended with these messages:
>
> INFO  ipc.Server (Server.java:run(861)) - IPC Server Responder: starting
> INFO  ipc.Server (Server.java:run(688)) - IPC Server listener on 8010: starting
>
> and it hangs there. At the same time, on the namenode I can see only two
> types of messages:
>
> INFO  hdfs.StateChange (FSNamesystem.java:completeFile(2805)) - DIR* 
> completeFile: [SOME PATH] is closed by DFSClient_NONMAPREDUCE_288661168_33
>
> and a lot of:
>
> WARN  blockmanagement.BlockManager 
> (PendingReplicationBlocks.java:pendingReplicationCheck(249)) - 
> PendingReplicationMonitor timed out blk_1074405820_668233
>
> Today we decided to restart the namenode and all datanodes. After the
> restart, the web page http://[server]:50070/dfshealth.jsp responds VERY
> slowly. I don't see any errors in the logs except 5 like the one below:
>
>  ERROR datanode.DataNode (DataXceiver.java:run(225)) - 
> maelhd21:50010:DataXceiver error processing WRITE_BLOCK operation  src: 
> /node1:33470 dest: /node3:50010
>
> org.apache.hadoop.hdfs.server.datanode.ReplicaAlreadyExistsException:
> Block BP-1037132819-192.168.61.196-1409328081083:blk_1075994366_2257020
> already exists in state FINALIZED and thus cannot be created.
>
> 3 out of 5 nodes show as live, but refreshing the Hadoop status page takes
> more than 10 minutes.
>
> The question of course is: what should I check or do now?
>
>
> p.s. I asked the same question on StackOverflow:
> http://stackoverflow.com/questions/31020877/datanodes-are-cannot-connect-to-namenode
>
