Keep in mind that the VM is not really designed for running real production data at scale. Why on earth would you need that many disks on a VM?
 
Thanks,
James
 
17.11.2017, 00:02, "Aaron Harris" <[email protected]>:

Yes, I did mean adding more disks to your VM, but if you already have free space on the VM then there are a couple of parameters to look at.

 

Firstly, the dfs.datanode.data.dir parameter I mentioned below controls where HDFS stores its data on the filesystem. How are the disks allocated and mounted on your VM? Is it a single large disk, or do you have them split?

 

On my test datanode VM I have an OS disk and 4 data disks, configured as below:

 

/                  - 50 GB OS disk
/data_vg1/         - 200 GB data disk 1
/data_vg2/         - 200 GB data disk 2
/data_vg3/         - 200 GB data disk 3
/data_vg4/         - 200 GB data disk 4

 

In my example I have dfs.datanode.data.dir set to “/data_vg1,/data_vg2,/data_vg3,/data_vg4”, which allows the datanode to write to all 4 data disks but not to the OS disk.
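For reference, if you manage the config files by hand rather than through Ambari, the equivalent hdfs-site.xml entry would look roughly like this (a sketch using the mount points from my example):

  <property>
    <name>dfs.datanode.data.dir</name>
    <value>/data_vg1,/data_vg2,/data_vg3,/data_vg4</value>
  </property>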

 

Another parameter that controls how much of the disk HDFS can use is dfs.datanode.du.reserved; this ensures HDFS leaves that many bytes free on each volume for non-HDFS usage. In the above example, where HDFS has dedicated data disks and no other processes (e.g. Kafka, the namenode) write to them, I can set this parameter quite low to give HDFS as much of each disk as possible.
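The value is in bytes, so as a rough sketch, reserving 1 GB per volume in hdfs-site.xml would be:

  <property>
    <name>dfs.datanode.du.reserved</name>
    <value>1073741824</value> <!-- 1 GB (1024^3 bytes) kept free on each volume -->
  </property>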

 

If you are still having problems, could you send the output of “df -h” from your VM, along with your dfs.datanode.data.dir and dfs.datanode.du.reserved settings?
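Something like the following should capture all of that (assuming the hdfs client is on your PATH; hdfs getconf simply prints the effective value of a config key):

  df -h
  hdfs getconf -confKey dfs.datanode.data.dir
  hdfs getconf -confKey dfs.datanode.du.reserved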

 

 

Hope this helps.

 


Regards,

Aaron

 

From: [email protected] [mailto:[email protected]]
Sent: 16 November 2017 15:00
To: [email protected]
Subject: Re: HDFS Size

 

I think he means to allocate more disk space to your VM.  Assuming the VM has space, you can follow Aaron's steps to expand the HDFS capacity.

 

On Thu, Nov 16, 2017 at 2:30 AM Syed Hammad Tahir <[email protected]> wrote:

Sorry for this noobish question. I didn't understand "If you can add more disks to your node then do so". You mean add a physical drive to the machine? I already have plenty of free space; I just don't know how to expand HDFS over it.

 

 

On Thu, Nov 16, 2017 at 12:03 PM, Aaron Harris <[email protected]> wrote:

Syed,
 

Check what you have set for the dfs.datanode.data.dir parameter in HDFS config. 

If you can add more disks to your node then do so and update the above parameter so it references them.
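For example, if you mounted a new disk at /data2 (a hypothetical mount point) and your current value is the HDP default of /hadoop/hdfs/data (an assumption; check what yours is set to first), the parameter becomes a comma-separated list:

  dfs.datanode.data.dir = /hadoop/hdfs/data,/data2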

Your other option is to add a completely new node, then install the datanode service on it through Ambari.
 

Regards,

Aaron


From: Syed Hammad Tahir <[email protected]>
Sent: Thursday, November 16, 2017 5:47:49 AM
To: [email protected]
Subject: HDFS Size

 

Hi,

Is there any way I could allocate more space to HDFS? I am redeploying a single-node Ambari-based Metron cluster.

Regards.

 

--

Jon

 
 
------------------- 
Thank you,
 
James Sirota
PMC- Apache Metron
jsirota AT apache DOT org
 
