Re: how to handle the corrupt block in HDFS?

2013-12-11 Thread Adam Kawa
I have only a 1-node cluster, so I am not able to verify it when the replication factor is bigger than 1. I ran fsck on a file that consists of 3 blocks, where 1 block has a corrupt replica. fsck reported that the system is HEALTHY. When I restarted the DN, then the block scanner (BlockPoolSliceScanner)
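A minimal sketch, assuming the DataNode web UI is on its default port 50075, of how the block scanner's verification results could be inspected without waiting on a restart (datanode-host is a placeholder):

$ curl -s http://datanode-host:50075/blockScannerReport
$ curl -s 'http://datanode-host:50075/blockScannerReport?listblocks'   # per-block scan status, if your version supports it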

Re: how to handle the corrupt block in HDFS?

2013-12-11 Thread ch huang
The alert is from my production env; I will test on my benchmark env, thanks. On Thu, Dec 12, 2013 at 2:33 AM, Adam Kawa kawa.a...@gmail.com wrote: I have only a 1-node cluster, so I am not able to verify it when the replication factor is bigger than 1. I ran fsck on a file that consists of 3 blocks,

Re: how to handle the corrupt block in HDFS?

2013-12-11 Thread ch huang
And does fsck report data from the BlockPoolSliceScanner? It seems to run once every 3 weeks. Can I restart the DNs one by one without interrupting the job which is running? On Thu, Dec 12, 2013 at 2:33 AM, Adam Kawa kawa.a...@gmail.com wrote: I have only a 1-node cluster, so I am not able to verify it when
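A hedged sketch of a rolling, one-at-a-time DataNode restart; the hadoop-hdfs-datanode service name is an assumption (it matches common packaging) and the host is a placeholder:

# on each DataNode in turn, waiting for it to rejoin before moving to the next
$ sudo service hadoop-hdfs-datanode restart
$ sudo -u hdfs hdfs dfsadmin -report | grep -A 2 'datanode-host'   # confirm the node reports as live again

With replication above 1, restarting a single DN at a time should not interrupt running jobs, since reads fail over to the remaining replicas.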

Re: how to handle the corrupt block in HDFS?

2013-12-10 Thread shashwat shriparv
How many nodes do you have? And if fsck is giving you a healthy status, there is no need to worry. With the replication of 10, what I may conclude is that you have 10 listed datanodes, so 10 replicated jar files for the job to run. *Thanks Regards* ∞ Shashwat Shriparv On Tue, Dec 10, 2013 at 3:50 PM,

Re: how to handle the corrupt block in HDFS?

2013-12-10 Thread Patai Sangbutsarakum
The 10 copies of the job.jar and split files are controlled by the mapred.submit.replication property at job-initialization level. On Mon, Dec 9, 2013 at 5:20 PM, ch huang justlo...@gmail.com wrote: More strange: in my HDFS cluster, every block has three replicas, but I find some have ten replicas. Why? #
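A minimal sketch of overriding that property for a single job, assuming the driver goes through ToolRunner so -D properties are honored (my-job.jar and MyJob are hypothetical names; 5 matches the cluster size discussed below):

$ hadoop jar my-job.jar MyJob -D mapred.submit.replication=5 input output

Setting it cluster-wide would instead mean adding a mapred.submit.replication entry to mapred-site.xml.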

Re: how to handle the corrupt block in HDFS?

2013-12-10 Thread ch huang
By default this higher replication level is 10. Can this value be controlled via some option or variable? I only have a 5-worker-node cluster, and I think 5 replicas would be better, because every node can get a local replica. Another question is: why does hdfs fsck report the cluster as healthy and no

RE: how to handle the corrupt block in HDFS?

2013-12-10 Thread Vinayakumar B
Hi ch huang, It may seem strange, but the fact is, CorruptBlocks through JMX or http://NNIP:50070/jmx means Number of blocks with corrupt replicas. It may not be that all replicas are corrupt. You can check the description through jconsole. Whereas corrupt blocks through
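A minimal sketch of narrowing the JMX query to the FSNamesystem bean so the counter can be read in context (NNIP stays a placeholder for the NameNode host, as above; the qry parameter is supported by the Hadoop JMX JSON servlet):

$ curl -s 'http://NNIP:50070/jmx?qry=Hadoop:service=NameNode,name=FSNamesystem' | grep -E 'CorruptBlocks|MissingBlocks'

CorruptBlocks counts blocks with at least one corrupt replica; a block only shows up as corrupt/missing from fsck's point of view when no healthy replica remains.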

Re: how to handle the corrupt block in HDFS?

2013-12-10 Thread ch huang
Thanks for the reply. What I do not know is how I can locate the block which has the corrupt replica (so I can observe how long it takes for the corrupt replica to be removed and a new healthy replica to replace it, because I have been getting a Nagios alert for three days, and I am not sure if it is the same corrupt replica causing the

Re: how to handle the corrupt block in HDFS?

2013-12-10 Thread Adam Kawa
Maybe this can work for you: $ sudo -u hdfs hdfs fsck / -list-corruptfileblocks ? 2013/12/11 ch huang justlo...@gmail.com Thanks for the reply. What I do not know is how I can locate the block which has the corrupt replica (so I can observe how long it takes for the corrupt replica to be removed and a new

Re: how to handle the corrupt block in HDFS?

2013-12-10 Thread Adam Kawa
When you identify a file with corrupt block(s), you can then locate the machines that store its blocks by typing: $ sudo -u hdfs hdfs fsck path-to-file -files -blocks -locations 2013/12/11 Adam Kawa kawa.a...@gmail.com Maybe this can work for you: $ sudo -u hdfs hdfs fsck /
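Putting the two commands from this thread together, the full walk from cluster to machine looks like this (path-to-file stays a placeholder, as in the original):

$ sudo -u hdfs hdfs fsck / -list-corruptfileblocks                 # 1. list files that have corrupt blocks
$ sudo -u hdfs hdfs fsck path-to-file -files -blocks -locations    # 2. map each block of a file to the DataNodes holding its replicas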

Re: how to handle the corrupt block in HDFS?

2013-12-10 Thread ch huang
Thanks for the reply, but if the block has just 1 corrupt replica, hdfs fsck cannot tell you which block of which file has a corrupted replica; fsck is only useful when all of a block's replicas are bad. On Wed, Dec 11, 2013 at 10:01 AM, Adam Kawa kawa.a...@gmail.com wrote: When you identify a file with

Re: how to handle the corrupt block in HDFS?

2013-12-09 Thread ch huang
The strange thing is, when I use the following command I find 1 corrupt block:
# curl -s http://ch11:50070/jmx | grep orrupt
CorruptBlocks : 1,
but when I run hdfs fsck /, I get none; everything seems fine:
# sudo -u hdfs hdfs fsck /
Status:

Re: how to handle the corrupt block in HDFS?

2013-12-09 Thread ch huang
More strange: in my HDFS cluster, every block has three replicas, but I find some have ten replicas. Why?
# sudo -u hdfs hadoop fs -ls /data/hisstage/helen/.staging/job_1385542328307_0915
Found 5 items
-rw-r--r--   3 helen hadoop   7 2013-11-29 14:01
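For what it's worth, if a file's replication ever did need to be brought back down by hand, the standard setrep command could do it (a sketch; -w waits for the change to complete, and the path is the one from the listing above):

$ sudo -u hdfs hadoop fs -setrep -w 3 /data/hisstage/helen/.staging/job_1385542328307_0915

As the replies above explain, though, the 10 copies of the staging files are intentional and controlled by mapred.submit.replication.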