Dear Andrei,

I created the issue: https://issues.apache.org/jira/browse/WHIRR-668
Best regards,
Jingchen Liu

2012/10/11 Andrei Savu <[email protected]>

> Please open an issue on JIRA at: https://issues.apache.org/jira/browse/WHIRR
>
> Describe the steps needed to reproduce.
>
> Thanks,
>
> -- Andrei Savu / axemblr.com
>
> On Thu, Oct 11, 2012 at 8:45 AM, 刘景琛 <[email protected]> wrote:
>
>> Dear Andrei and everyone,
>>
>> I think I have found the cause of my problem.
>> I used an Ubuntu image to build my Hadoop cluster, and it is an
>> EBS-backed image. The EBS root device is what caused this problem.
>> If the image is changed to an instance-store one, everything works
>> fine: the 'Configured Capacity' became 3.91 TB.
>> Maybe the prepare_all_disks.sh script doesn't work well on an
>> EBS-backed instance.
>>
>> Best regards,
>> Jingchen Liu
>>
>> 2012/10/11 刘景琛 <[email protected]>
>>
>>> Dear Andrei,
>>>
>>> Thank you for the reply! I still have a question:
>>> I launched a cluster of 13 c1.medium instances and opened the NameNode
>>> web UI (http://NAMENODE_PUBLIC_DNS:50070/dfshealth.jsp).
>>> It shows that the 'Configured Capacity' is 82.49 GB.
>>> I logged into the master instance and ran 'df -h'; it shows that
>>> /dev/sda1 is 7.9 GB and /dev/sdb is 335 GB.
>>> If /data0 is linked to /mnt, shouldn't the 'Configured Capacity' be
>>> about 335 GB * 12 datanodes, i.e. roughly 4 TB?
>>>
>>> Thank you very much!
>>>
>>> Best regards,
>>> Jingchen Liu
>>>
>>> 2012/10/10 Andrei Savu <[email protected]>
>>>
>>>> /data0 is just a symlink to /mnt.
>>>>
>>>> You should have enough space. See the following file for details:
>>>>
>>>> https://github.com/apache/whirr/blob/trunk/services/hadoop/src/main/resources/functions/prepare_all_disks.sh
>>>>
>>>> -- Andrei Savu / axemblr.com
>>>>
>>>> On Wed, Oct 10, 2012 at 2:15 PM, 刘景琛 <[email protected]> wrote:
>>>>
>>>>> Dear Whirr Developers and Users,
>>>>>
>>>>> I'm using Whirr to run a Hadoop cluster on AWS EC2, and I ran into a
>>>>> problem.
>>>>> After launching the cluster, I logged into the master instance using
>>>>> PuTTY, went to /usr/local/hadoop/conf, and checked hdfs-site.xml.
>>>>> I found that 'dfs.data.dir' was set to '/data0/hadoop/hdfs/data'.
>>>>>
>>>>> The /data0 folder is on the instance's /dev/sda1 device, which is
>>>>> very small. If Hadoop stores the HDFS data there, there will not be
>>>>> enough space for my data.
>>>>> The EC2 instances have another device, /dev/sdb, which is much
>>>>> bigger and is mounted at /mnt.
>>>>> So I think 'dfs.data.dir' should point to a folder under /mnt (and
>>>>> likewise some other properties such as 'hadoop.tmp.dir',
>>>>> 'mapred.local.dir', etc.).
>>>>> Could anyone tell me how to do this using Whirr?
>>>>>
>>>>> Thank you very much!
>>>>>
>>>>> Best regards,
>>>>> Jingchen Liu
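For the original question at the bottom of the thread (how to point
'dfs.data.dir' at /mnt through Whirr): Whirr's Hadoop service can forward
prefixed keys from the cluster properties file into the generated Hadoop
config files. A minimal sketch, assuming the hadoop-hdfs.* /
hadoop-mapreduce.* / hadoop-common.* prefix convention holds for your
Whirr version (worth verifying against the Whirr configuration guide), and
using a made-up recipe file name:

    # Append illustrative overrides to the Whirr recipe.
    # hadoop-ec2.properties is a hypothetical name; use your own recipe file.
    # If the prefix convention applies, each key should land in the
    # matching file: hadoop-hdfs.* -> hdfs-site.xml,
    # hadoop-mapreduce.* -> mapred-site.xml, hadoop-common.* -> core-site.xml.
    {
      echo 'hadoop-hdfs.dfs.data.dir=/mnt/hadoop/hdfs/data'
      echo 'hadoop-mapreduce.mapred.local.dir=/mnt/hadoop/mapred/local'
      echo 'hadoop-common.hadoop.tmp.dir=/mnt/hadoop/tmp'
    } >> hadoop-ec2.properties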
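On the capacity arithmetic in the thread: 12 datanodes * 335 GB of
ephemeral storage each is roughly 4.0 TB, so the 3.91 TB 'Configured
Capacity' on the instance-store cluster is about what you would expect
(HDFS subtracts a small reserved amount per volume), while 82.49 GB means
the datanodes were writing to the small root device instead. A quick way
to confirm from the master node; the dfsadmin command is standard in
Hadoop 1.x, and the paths match the thread:

    # Totals as the NameNode sees them:
    hadoop dfsadmin -report | grep 'Configured Capacity'
    # Which filesystem /data0 actually resolves to, and the symlink itself:
    df -h /data0 /mnt
    ls -ld /data0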
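On the EBS vs. instance-store finding: on an EBS-backed AMI the ephemeral
volume may not be attached or mounted at all, in which case /data0 can end
up on the small root filesystem, which would explain both the 82.49 GB
capacity and why prepare_all_disks.sh appeared to misbehave. One way to
check from inside a running instance, using the standard EC2 instance
metadata endpoint:

    # List the block-device mapping the instance was launched with:
    curl -s http://169.254.169.254/latest/meta-data/block-device-mapping/
    # An 'ephemeral0' entry names the instance-store volume, if one exists:
    curl -s http://169.254.169.254/latest/meta-data/block-device-mapping/ephemeral0
    # Confirm what is actually mounted at /mnt:
    df -h /mnt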
