Thank you for your reply. In that case, I wonder why it does not work as I expected on our Hadoop. As far as I have tested, the options did not take effect with TestDFSIO as you explained. I'll check whether they work with the dfs -put command on our Hadoop instance next week.
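For reference, my reading of Koji's "client-side config" suggestion is a sketch like the following: the same dfs.replication and dfs.block.size properties, but placed in the configuration read by the machine the client commands run from, not only on the NameNode/DataNodes. The file path is an assumption based on a typical 0.20 layout (on hadoop-0.20 the properties may also live in hadoop-site.xml):

```
<!-- conf/hdfs-site.xml on the *client* machine (path is an assumption;
     on hadoop-0.20 the same properties can also go in hadoop-site.xml) -->
<configuration>
  <property>
    <name>dfs.replication</name>
    <value>1</value>          <!-- replication requested by this client -->
  </property>
  <property>
    <name>dfs.block.size</name>
    <value>536870912</value>  <!-- 512MB, in bytes -->
  </property>
</configuration>
```

Since both values are applied by the client at file-creation time, this should have the same effect as passing -Ddfs.block.size / -Ddfs.replication on every command.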
On Fri, May 21, 2010 at 12:37 AM, Koji Noguchi <[email protected]> wrote:
> Kiyoshi,
>
> Block size is set by the client, so no need to restart nor format nor
> changing the configs.
>
> $ ls -l testfile.txt
> -rw-r--r-- 1 knoguchi users 202145 May 1 2009 testfile.txt
> $ hadoop dfs -put testfile.txt /user/knoguchi/testfile.txt
> $ hadoop dfs -Ddfs.block.size=10240 -put testfile.txt /user/knoguchi/testfile2.txt
> $ hadoop fsck /user/knoguchi/testfile.txt | grep "Total blocks"
>  Total blocks (validated): 1 (avg. block size 202145 B)
> $ hadoop fsck /user/knoguchi/testfile2.txt | grep "Total blocks"
>  Total blocks (validated): 20 (avg. block size 10107 B)
> $
>
> Koji
>
>
> On 5/20/10 7:56 AM, "Kiyoshi Mizumaru" <[email protected]> wrote:
>
>> Unfortunately it does not work as I expected.
>>
>> Cleaned up previous Hadoop instance data by removing all the files/directories
>> which exist in dfs.name.dir and dfs.data.dir, and formatted new HDFS with
>> hadoop namenode -format gave me a new Hadoop instance as I expected.
>>
>> It seems that changing configuration files and formatting HDFS (and restarting
>> all daemons, of course) are not enough to change replication and block size,
>> is it correct?
>>
>>
>> On Wed, May 19, 2010 at 2:14 PM, Kiyoshi Mizumaru <[email protected]> wrote:
>>> Hi Koji,
>>>
>>> Thank you for your reply.
>>> I'll try what you wrote and see if it works as expected.
>>>
>>> By the way, what does the `client-side config' mean?
>>> dfs.replication and dfs.block.size are written in conf/hdfs-site.xml.
>>> Where should I put them into?
>>>
>>>
>>> On Tue, May 18, 2010 at 3:01 AM, Koji Noguchi <[email protected]> wrote:
>>>> Hi Kiyoshi,
>>>>
>>>> In case you haven't received a reply, try
>>>>
>>>> hadoop jar hadoop-*-test.jar TestDFSIO -Ddfs.block.size=536870912 -D dfs.replication=1 ....
>>>>
>>>> If that works, add them as part of your client-side config.
>>>>
>>>> Koji
>>>>
>>>>
>>>> On 5/13/10 11:38 PM, "Kiyoshi Mizumaru" <[email protected]> wrote:
>>>>
>>>>> Hi all, this is my first post to this list; if I'm not in the
>>>>> appropriate place, please let me know.
>>>>>
>>>>>
>>>>> I have just created a Hadoop instance and its HDFS is configured as:
>>>>>   dfs.replication = 1
>>>>>   dfs.block.size = 536870912 (512MB)
>>>>>
>>>>> Then I typed the following command to run TestDFSIO against this instance:
>>>>>   % hadoop jar hadoop-*-test.jar TestDFSIO -write -nrFiles 1 -fileSize 1024
>>>>>
>>>>> One file of 1024MB should consist of 2 blocks of size 512MB,
>>>>> but the filesystem browser shows that /benchmarks/TestDFSIO/io_data/test_io_0
>>>>> consists of 16 blocks of size 64MB, and its replication is 3, so 48 blocks
>>>>> are displayed in total.
>>>>>
>>>>> This is not what I expected; does anyone know what's wrong?
>>>>>
>>>>> I'm using Cloudera's Distribution for Hadoop (hadoop-0.20-0.20.2+228-1)
>>>>> with Sun Java6 (jdk-6u19-linux-amd64). Thanks in advance and sorry for
>>>>> my poor English, I'm still learning it.
>>>>> --
>>>>> Kiyoshi
