Please open an issue on JIRA at: https://issues.apache.org/jira/browse/WHIRR

Describe the steps needed to reproduce.

Thanks,

-- Andrei Savu / axemblr.com

On Thu, Oct 11, 2012 at 8:45 AM, 刘景琛 <[email protected]> wrote:

> Dear Andrei and everyone
>
> I think I have found the cause of my problem.
> I used an Ubuntu image to build my Hadoop cluster, and that image was an
> 'ebs'-backed one.
> The 'ebs' root device type is what caused the problem.
> If the image is changed to an 'instance store' type, everything works
> fine, and the 'Configured Capacity' becomes 3.91 TB.
> Maybe the prepare_all_disks.sh script doesn't work well on an EBS-backed
> instance.
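>
> A quick way to see the difference (just what I would check, not a
> definitive diagnosis) is to compare a running instance of each image type:
>
>     df -h          # is /mnt backed by a large ephemeral disk, or missing?
>     ls -l /data0   # does the symlink point at /mnt?
>     # EC2 metadata shows which block devices the image actually attaches:
>     curl -s http://169.254.169.254/latest/meta-data/block-device-mapping/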
>
> Best regards,
> Jingchen LIU
>
> 2012/10/11 刘景琛 <[email protected]>
>
>> Dear Andrei
>>
>> Thank you for the reply! I still have a question:
>> I launched a cluster of 13 c1.medium instances and opened the Namenode
>> web UI (http://NAMENODE_PUBLIC_DNS:50070/dfshealth.jsp).
>> It shows that the 'Configured Capacity' is 82.49 GB.
>> I logged into the master instance and ran 'df -h'; it shows that
>> /dev/sda1 is 7.9 GB and /dev/sdb is 335 GB.
>> If /data0 is linked to /mnt, shouldn't the 'Configured Capacity' be
>> about 335*12 GB?
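>>
>> As a sanity check: 82.49 GB is roughly 12 * 7 GB, which looks as if the
>> data dirs ended up on the small /dev/sda1 volume instead of /mnt. The
>> same figure can also be checked with the standard dfsadmin report, which
>> lists configured capacity per datanode:
>>
>>     hadoop dfsadmin -report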
>>
>> Thank you very much!
>>
>> Best regards
>> Jingchen Liu
>>
>> 2012/10/10 Andrei Savu <[email protected]>
>>
>>> /data0 is just a symlink to /mnt
>>>
>>> You should have enough space. See the following file for details:
>>>
>>> https://github.com/apache/whirr/blob/trunk/services/hadoop/src/main/resources/functions/prepare_all_disks.sh
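>>>
>>> Roughly, the idea boils down to this (simplified sketch, not the exact
>>> script):
>>>
>>>     # make the big ephemeral disk reachable under the /dataN convention
>>>     ln -s /mnt /data0    # so /data0/hadoop/... lands on /dev/sdb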
>>>
>>> -- Andrei Savu / axemblr.com
>>>
>>>
>>> On Wed, Oct 10, 2012 at 2:15 PM, 刘景琛 <[email protected]> wrote:
>>>
>>>> Dear Whirr Developers and Users
>>>>
>>>> I'm using Whirr to run a Hadoop cluster on AWS EC2, and I ran into a
>>>> problem.
>>>> After launching the cluster, I logged into the master instance using
>>>> PuTTY, went to /usr/local/hadoop/conf, and checked hdfs-site.xml. I
>>>> found that 'dfs.data.dir' was set to '/data0/hadoop/hdfs/data'.
>>>>
>>>> The /data0 folder is on the instance's /dev/sda1 device, which is very
>>>> small. If Hadoop stores the HDFS data there, there will not be enough
>>>> space for my data.
>>>> The EC2 instances have another device, /dev/sdb, which is much bigger
>>>> and is mounted at /mnt.
>>>> So I think 'dfs.data.dir' should point to a folder under /mnt (and
>>>> likewise some other properties such as 'hadoop.tmp.dir' and
>>>> 'mapred.local.dir').
>>>> Could anyone tell me how to do this using Whirr?
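>>>>
>>>> For example, is there a way to override these properties in the
>>>> cluster properties file? I am only guessing at the syntax, something
>>>> like:
>>>>
>>>>     # hypothetical overrides in the Whirr recipe (.properties) file
>>>>     hadoop-hdfs.dfs.data.dir=/mnt/hadoop/hdfs/data
>>>>     hadoop-common.hadoop.tmp.dir=/mnt/hadoop/tmp
>>>>     hadoop-mapreduce.mapred.local.dir=/mnt/hadoop/mapred/local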
>>>>
>>>> Thank you very much!
>>>>
>>>> Best regards,
>>>>
>>>> Jingchen LIU
>>>>
>>>
>>>
>>
>
