Kiyoshi, block size is set by the client, so there is no need to restart the daemons, reformat HDFS, or change the configs on the cluster side.
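If you want it as a default rather than a per-command -D flag, the properties go in the conf/hdfs-site.xml on the machine you run the hadoop command from. A minimal sketch, assuming the standard 0.20 property names (the 512MB value is just the one from your earlier mail):

  <?xml version="1.0"?>
  <configuration>
    <!-- Client-side defaults: read by the hadoop CLI on this machine;
         the namenode never needs to see this file. -->
    <property>
      <name>dfs.block.size</name>
      <value>536870912</value> <!-- 512MB, in bytes -->
    </property>
    <property>
      <name>dfs.replication</name>
      <value>1</value>
    </property>
  </configuration>

Here is the per-command override in action (and a programmatic equivalent is sketched at the end of this mail):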
$ ls -l testfile.txt
-rw-r--r-- 1 knoguchi users 202145 May 1 2009 testfile.txt
$ hadoop dfs -put testfile.txt /user/knoguchi/testfile.txt
$ hadoop dfs -Ddfs.block.size=10240 -put testfile.txt /user/knoguchi/testfile2.txt
$ hadoop fsck /user/knoguchi/testfile.txt | grep "Total blocks"
 Total blocks (validated):      1 (avg. block size 202145 B)
$ hadoop fsck /user/knoguchi/testfile2.txt | grep "Total blocks"
 Total blocks (validated):      20 (avg. block size 10107 B)
$

Koji


On 5/20/10 7:56 AM, "Kiyoshi Mizumaru" <[email protected]> wrote:

> Unfortunately, it does not work as I expected.
>
> Cleaning up the previous Hadoop instance's data by removing all the files
> and directories in dfs.name.dir and dfs.data.dir, and then formatting a new
> HDFS with hadoop namenode -format, gave me a new Hadoop instance as I
> expected.
>
> It seems that changing the configuration files and formatting HDFS (and
> restarting all daemons, of course) are not enough to change the replication
> and block size. Is that correct?
>
>
> On Wed, May 19, 2010 at 2:14 PM, Kiyoshi Mizumaru
> <[email protected]> wrote:
>> Hi Koji,
>>
>> Thank you for your reply.
>> I'll try what you wrote and see if it works as expected.
>>
>> By the way, what does "client-side config" mean?
>> dfs.replication and dfs.block.size are written in conf/hdfs-site.xml.
>> Where should I put them?
>>
>>
>> On Tue, May 18, 2010 at 3:01 AM, Koji Noguchi <[email protected]> wrote:
>>> Hi Kiyoshi,
>>>
>>> In case you haven't received a reply, try
>>>
>>>   hadoop jar hadoop-*-test.jar TestDFSIO -Ddfs.block.size=536870912 -Ddfs.replication=1 ....
>>>
>>> If that works, add them as part of your client-side config.
>>>
>>> Koji
>>>
>>>
>>> On 5/13/10 11:38 PM, "Kiyoshi Mizumaru" <[email protected]> wrote:
>>>
>>>> Hi all, this is my first post to this list; if I'm not posting in the
>>>> appropriate place, please let me know.
>>>>
>>>> I have just created a Hadoop instance, and its HDFS is configured with:
>>>>   dfs.replication = 1
>>>>   dfs.block.size = 536870912 (512MB)
>>>>
>>>> Then I ran the following command to run TestDFSIO against this instance:
>>>>   % hadoop jar hadoop-*-test.jar TestDFSIO -write -nrFiles 1 -fileSize 1024
>>>>
>>>> One 1024MB file should consist of 2 blocks of 512MB each, but the
>>>> filesystem browser shows that /benchmarks/TestDFSIO/io_data/test_io_0
>>>> consists of 16 blocks of 64MB, and its replication is 3, so 48 blocks
>>>> are displayed in total.
>>>>
>>>> This is not what I expected; does anyone know what's wrong?
>>>>
>>>> I'm using Cloudera's Distribution for Hadoop (hadoop-0.20-0.20.2+228-1)
>>>> with Sun Java6 (jdk-6u19-linux-amd64). Thanks in advance, and sorry for
>>>> my poor English; I'm still learning it.
>>>> --
>>>> Kiyoshi
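PS: if you write files from your own code rather than the CLI, the same client-side override can be set on the Configuration, or passed straight to FileSystem.create(). A minimal sketch against the 0.20 API; the path, buffer size, and file contents below are made up for illustration:

  import org.apache.hadoop.conf.Configuration;
  import org.apache.hadoop.fs.FSDataOutputStream;
  import org.apache.hadoop.fs.FileSystem;
  import org.apache.hadoop.fs.Path;

  public class ClientBlockSize {
      public static void main(String[] args) throws Exception {
          // Client-side settings; no namenode restart or reformat needed.
          Configuration conf = new Configuration();
          conf.setLong("dfs.block.size", 536870912L); // 512MB, in bytes
          conf.setInt("dfs.replication", 1);
          FileSystem fs = FileSystem.get(conf);

          // Or set them per file via the long form of create():
          // create(path, overwrite, bufferSize, replication, blockSize)
          Path p = new Path("/user/knoguchi/example.dat"); // hypothetical path
          FSDataOutputStream out = fs.create(p, true, 4096, (short) 1, 536870912L);
          out.writeBytes("hello, hdfs\n");
          out.close();
      }
  }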
