If 64 MB is your HDFS block size, then the 50 MB file won't be split; it will be stored in a single block in HDFS. AFAIK, which data node (or rather, which data nodes) hold the block is decided by the name node. The block is replicated when stored; the default replication factor is 3. So in your case the block should be present on both of your data nodes. When you use MapReduce to process the file, the number of mappers is decided by the number of input splits. In most clusters the MapReduce input split size is the same as the HDFS block size, so in your case just one mapper would be triggered.

Regards
Bejoy K S
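To illustrate the arithmetic above, here is a minimal sketch (plain Python, not Hadoop code) of how the block count, and hence the default mapper count, follows from the file size and block size; the function name is just for illustration:

```python
import math

def num_blocks(file_size_mb, block_size_mb=64):
    """An HDFS file occupies ceil(file_size / block_size) blocks;
    with split size == block size, each block gets one mapper."""
    return max(1, math.ceil(file_size_mb / block_size_mb))

# A 50 MB file with a 64 MB block size fits in a single block,
# so one mapper processes it.
print(num_blocks(50))    # -> 1
# A 130 MB file would span three blocks -> three mappers.
print(num_blocks(130))   # -> 3
```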
-----Original Message-----
From: 程笑 <xiaocheng...@163.com>
Date: Wed, 10 Aug 2011 16:33:18
To: <hdfs-user@hadoop.apache.org>
Reply-To: hdfs-user@hadoop.apache.org
Subject: The problem about block size

Hi,

I have set up a Hadoop cluster with one NameNode and two DataNodes, and I have a question about block size. I set the block size to 64 MB and stored one text file (50 MB) on HDFS. Is this text file split? If not, on which DataNode is the text file stored? When I use MapReduce to process the text file, how many maps will process it? Please forgive my poor English.

Best regards!
Cheng Xiao