Thank you for your reply.

In that case, I wonder why it does not work as I expected on our Hadoop cluster.
As far as I have tested, the options did not take effect with TestDFSIO even
when passed on the command line as you explained.
I'll check whether they work with the dfs -put command on our Hadoop instance
next week; a sketch of what I plan to run is below, and I have appended our
current hdfs-site.xml settings at the end of this message for reference.
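
Concretely, I plan to run something like the following, mirroring your
example (the file name and path are placeholders for our setup):

  $ hadoop dfs -Ddfs.block.size=536870912 -Ddfs.replication=1 \
      -put testfile.txt /user/kiyoshi/testfile.txt
  $ hadoop fsck /user/kiyoshi/testfile.txt | grep "Total blocks"

If the reported block count matches the file size divided by 512MB, the
client-side -D options are taking effect for dfs -put on our cluster.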


On Fri, May 21, 2010 at 12:37 AM, Koji Noguchi <[email protected]> wrote:
> Kiyoshi,
>
> Block size is set by the client, so there is no need to restart, reformat,
> or change the configs.
>
> $ ls -l testfile.txt
> -rw-r--r-- 1 knoguchi users 202145 May  1  2009 testfile.txt
> $ hadoop dfs -put testfile.txt /user/knoguchi/testfile.txt
> $ hadoop dfs -Ddfs.block.size=10240 -put testfile.txt /user/knoguchi/testfile2.txt
> $ hadoop fsck /user/knoguchi/testfile.txt | grep "Total blocks"
>  Total blocks (validated):      1 (avg. block size 202145 B)
> $ hadoop fsck /user/knoguchi/testfile2.txt | grep "Total blocks"
>  Total blocks (validated):      20 (avg. block size 10107 B)
> $
>
> Koji
>
>
> On 5/20/10 7:56 AM, "Kiyoshi Mizumaru" <[email protected]> wrote:
>
>> Unfortunately, it does not work as I expected.
>>
>> Cleaning up the previous Hadoop instance by removing all files and
>> directories under dfs.name.dir and dfs.data.dir, and then formatting a
>> new HDFS with hadoop namenode -format, gave me a fresh instance as
>> expected.
>>
>> However, it seems that changing the configuration files and formatting
>> HDFS (and restarting all daemons, of course) is not enough to change the
>> replication and block size. Is that correct?
>>
>>
>> On Wed, May 19, 2010 at 2:14 PM, Kiyoshi Mizumaru
>> <[email protected]> wrote:
>>> Hi Koji,
>>>
>>> Thank you for your reply.
>>> I'll try what you wrote and see if it works as expected.
>>>
>>> By the way, what does `client-side config' mean?
>>> dfs.replication and dfs.block.size are written in conf/hdfs-site.xml.
>>> Where should I put them?
>>>
>>>
>>> On Tue, May 18, 2010 at 3:01 AM, Koji Noguchi <[email protected]> 
>>> wrote:
>>>> Hi Kiyoshi,
>>>>
>>>> In case you haven't received a reply, try
>>>>
>>>> hadoop jar hadoop-*-test.jar TestDFSIO -Ddfs.block.size=536870912 -D dfs.replication=1 ...
>>>>
>>>> If that works, add them as part of your client-side config.
>>>>
>>>> Koji
>>>>
>>>>
>>>> On 5/13/10 11:38 PM, "Kiyoshi Mizumaru" <[email protected]> wrote:
>>>>
>>>>> Hi all, this is my first post to this list; if this isn't the
>>>>> appropriate place, please let me know.
>>>>>
>>>>>
>>>>> I have just created a Hadoop instance and its HDFS is configured as:
>>>>>   dfs.replication = 1
>>>>>   dfs.block.size = 536870912 (512MB)
>>>>>
>>>>> Then I typed the following command to run TestDFSIO against this instance:
>>>>>   % hadoop jar hadoop-*-test.jar TestDFSIO -write -nrFiles 1 -fileSize 1024
>>>>>
>>>>> A 1024MB file should consist of 2 blocks of 512MB each, but the
>>>>> filesystem browser shows that /benchmarks/TestDFSIO/io_data/test_io_0
>>>>> consists of 16 blocks of 64MB, and its replication is 3, so 48 blocks
>>>>> are displayed in total.
>>>>>
>>>>> This is not what I expected; does anyone know what's wrong?
>>>>>
>>>>> I'm using Cloudera's Distribution for Hadoop (hadoop-0.20-0.20.2+228-1)
>>>>> with Sun Java 6 (jdk-6u19-linux-amd64).  Thanks in advance, and sorry
>>>>> for my poor English; I'm still learning it.
>>>>> --
>>>>> Kiyoshi
>>>>
>>>>
>>>
>
>
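
P.S. For reference, here is how the two values are set in conf/hdfs-site.xml
on our cluster (a minimal sketch; other properties in the file are omitted).
If I understand your reply correctly, the same properties would also have to
appear in the client-side config, or be passed with -D, to take effect:

  <configuration>
    <property>
      <name>dfs.replication</name>
      <value>1</value>
    </property>
    <property>
      <name>dfs.block.size</name>
      <value>536870912</value> <!-- 512MB -->
    </property>
  </configuration>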
