Hi

As I understand it, there is a property for dn's -
dfs.datanode.scan.period.hours - and it defaults to 504 hours. So if I change it to 1 hour, does that mean that if I delete a block file from one server, then after one hour the nn gets informed that the replica is missing and will restore it?
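
For reference, a sketch of how I understand that property would be set in hdfs-site.xml on the datanodes (the 1-hour value is only for testing; 504 hours is the default):

```xml
<!-- hdfs-site.xml on the datanodes: how often the block scanner
     re-verifies block data on disk (default: 504 hours = 3 weeks).
     A 1-hour period is only sensible for a test cluster. -->
<property>
  <name>dfs.datanode.scan.period.hours</name>
  <value>1</value>
</property>
```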

Margus (margusja) Roo
http://margus.roo.ee
skype: margusja
+372 51 480

On 08/01/15 10:52, Margus Roo wrote:
Hi

I have simple HDFS setup: 1nn and 2dn.

I created file and added it into HDFS.
About file:
-bash-4.1$ hdfs fsck -blocks -locations -files /user/margusja/file2.txt
Connecting to namenode via http://nn:50070
FSCK started by hdfs (auth:SIMPLE) from /10.101.9.122 for path /user/margusja/file2.txt at Thu Jan 08 10:34:13 EET 2015
/user/margusja/file2.txt 409600000 bytes, 4 block(s):  OK
0. BP-808850907-10.101.21.132-1420641040354:blk_1073741828_1004 len=134217728 repl=2 [10.87.13.166:50010, 10.85.145.228:50010]
1. BP-808850907-10.101.21.132-1420641040354:blk_1073741829_1005 len=134217728 repl=2 [10.87.13.166:50010, 10.85.145.228:50010]
2. BP-808850907-10.101.21.132-1420641040354:blk_1073741830_1006 len=134217728 repl=2 [10.87.13.166:50010, 10.85.145.228:50010]
3. BP-808850907-10.101.21.132-1420641040354:blk_1073741831_1007 len=6946816 repl=2 [10.87.13.166:50010, 10.85.145.228:50010]

Status: HEALTHY
 Total size:    409600000 B
 Total dirs:    0
 Total files:   1
 Total symlinks:                0
 Total blocks (validated):      4 (avg. block size 102400000 B)
 Minimally replicated blocks:   4 (100.0 %)
 Over-replicated blocks:        0 (0.0 %)
 Under-replicated blocks:       0 (0.0 %)
 Mis-replicated blocks:         0 (0.0 %)
 Default replication factor:    2
 Average block replication:     2.0
 Corrupt blocks:                0
 Missing replicas:              0 (0.0 %)
 Number of data-nodes:          2
 Number of racks:               1
FSCK ended at Thu Jan 08 10:34:13 EET 2015 in 1 milliseconds


The filesystem under path '/user/margusja/file2.txt' is HEALTHY

Now I went into one datanode and simply deleted blk_1073741828, and got the following in the dn's log:

2015-01-08 10:02:00,994 WARN org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl: Removed block 1073741828 from memory with missing block file on the disk
2015-01-08 10:02:00,994 WARN org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl: Deleted a metadata file for the deleted block /grid/hadoop/hdfs/dn/current/BP-808850907-10.101.21.132-1420641040354/current/finalized/blk_1073741828_1004.meta

But hdfs fsck still reports that HDFS is healthy.

I can still download the file from HDFS using hdfs dfs -get /user/margusja/file2.txt, though there are some warnings that a block replica is missing.

Now I went into the second dn and deleted blk_1073741828 there as well.

hdfs fsck on the nn still reports that HDFS is OK.

Of course, now I can't get my file anymore using hdfs dfs -get /user/margusja/file2.txt, because blk_1073741828 no longer exists on dn1 or dn2. But the nn is still happy and thinks that HDFS is OK.

I guess I am testing this the wrong way.
Are there best practices for testing HDFS before going live? For example, steps to verify what happens if a block goes missing or gets corrupted?
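
For what it's worth, here is a sketch of the checks I was planning to use (assuming a standard Hadoop 2.x CLI; my understanding is that a client read verifies checksums and reports bad or missing replicas to the nn, after which fsck should see them):

```shell
# Force a full read of the file; the client verifies checksums per block
# and reports bad/missing replicas to the namenode as it hits them.
hdfs dfs -cat /user/margusja/file2.txt > /dev/null

# Ask the namenode which files currently have corrupt blocks.
hdfs fsck / -list-corruptfileblocks

# Re-check the single file with full block/location detail.
hdfs fsck /user/margusja/file2.txt -files -blocks -locations
```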



