Hi,

With dfs.replication = 3, each block making up a file on HDFS will have three copies on three distinct data nodes. Hence at any point you can lose two data nodes and the name node will still know the location of the third replica, no problem. In that scenario HDFS will replicate the under-replicated blocks again to ensure the dfs.replication factor holds true. The replication pipeline describes the process by which blocks are replicated to comply with the dfs.replication factor.
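For reference, the replication factor is normally set in hdfs-site.xml; a minimal fragment (3 is also the shipped default, so you only need this if you want a different value) looks like:

```xml
<!-- hdfs-site.xml: default replication factor applied to newly created files -->
<property>
  <name>dfs.replication</name>
  <value>3</value>
</property>
```

Note this only affects files created after the change; existing files keep the replication factor they were written with unless you change it explicitly (e.g. with hadoop fs -setrep).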
hadoop fs -ls /hdfs_file

will show you the replication factor of a file in HDFS (it is the second column of the listing). You can view the details of the blocks making up an HDFS file with:

hadoop fsck / -files -blocks | grep hdfs_file_name -A 30

Hope this helps.

Regards
<div>-------- Original message --------</div><div>From: [email protected] </div><div>Date: 18/12/2014 15:40 (GMT+02:00) </div><div>To: user <[email protected]> </div><div>Subject: A quick question about replication factor </div><div> </div>Hi Hadoopers, If I configure the replication factor to be 3 in the configuration file, how many copies of each block are stored? Three or four? [email protected]
