dfs.datanode.data.dir = /hadoop/hdfs/data,/hdfs/data
Data node 1:
Filesystem               Size  Used Avail Use% Mounted on
/dev/mapper/centos-root   50G   12G   39G  23% /
devtmpfs                  16G     0   16G   0% /dev
tmpfs                     16G     0   16G   0% /dev/shm
tmpfs                     16G  1.4G   15G   9% /run
tmpfs                     16G     0   16G   0% /sys/fs/cgroup
/dev/sda2                494M  123M  372M  25% /boot
/dev/mapper/centos-home  2.7T   33M  2.7T   1% /home
Data node 2:
Filesystem               Size  Used Avail Use% Mounted on
/dev/mapper/centos-root   50G   24G   27G  48% /
devtmpfs                  16G     0   16G   0% /dev
tmpfs                     16G   24K   16G   1% /dev/shm
tmpfs                     16G   97M   16G   1% /run
tmpfs                     16G     0   16G   0% /sys/fs/cgroup
/dev/sda2                494M  124M  370M  26% /boot
/dev/mapper/centos-home  2.7T   33M  2.7T   1% /home
Adaryl "Bob" Wakefield, MBA
Principal
Mass Street Analytics, LLC
913.938.6685
www.linkedin.com/in/bobwakefieldmba
Twitter: @BobLovesData
From: iain wright
Sent: Thursday, November 05, 2015 7:56 PM
To: [email protected]
Subject: Re: hadoop not using whole disk for HDFS
Please post:
- output of df -h from every datanode in your cluster
- what dfs.datanode.data.dir is currently set to
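For example, one quick way to grab both on each datanode (a sketch; assumes the Hadoop client scripts are on the PATH):

df -h
hdfs getconf -confKey dfs.datanode.data.dir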
--
Iain Wright
On Thu, Nov 5, 2015 at 5:24 PM, Adaryl "Bob" Wakefield, MBA
<[email protected]> wrote:
Is there a maximum amount of disk space that HDFS will use? Is 100GB the
max? When we’re supposed to be dealing with “big data”, why is the amount of
data that can be held on any one box such a small number when you’ve got
terabytes available?
Adaryl "Bob" Wakefield, MBA
Principal
Mass Street Analytics, LLC
913.938.6685
www.linkedin.com/in/bobwakefieldmba
Twitter: @BobLovesData
From: Adaryl "Bob" Wakefield, MBA
Sent: Wednesday, November 04, 2015 4:38 PM
To: [email protected]
Subject: Re: hadoop not using whole disk for HDFS
This is an experimental cluster and there isn’t anything I can’t lose. I ran
into some issues. I’m running the Hortonworks distro and am managing things
through Ambari.
1. I wasn’t able to set the config to /home/hdfs/data. I got an error that
told me I’m not allowed to set that config to the /home directory. So I made it
/hdfs/data.
2. When I restarted, the space available increased by a whopping 100GB.
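Presumably that lines up with the two ~50GB centos-root partitions, since /hdfs/data sits under "/". A quick way to confirm which filesystem actually backs the directory on each node (sketch):

df -h /hdfs/data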
Adaryl "Bob" Wakefield, MBA
Principal
Mass Street Analytics, LLC
913.938.6685
www.linkedin.com/in/bobwakefieldmba
Twitter: @BobLovesData
From: Naganarasimha G R (Naga)
Sent: Wednesday, November 04, 2015 4:26 PM
To: [email protected]
Subject: RE: hadoop not using whole disk for HDFS
If the amount of data is comparatively small, it would be better to stop the
daemons, copy the data from /hadoop/hdfs/data to /home/hdfs/data, reconfigure
dfs.datanode.data.dir to /home/hdfs/data, and then start the daemons.
Ensure you have a backup if you have any critical data!
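Roughly, on each DataNode it might look like this (a sketch only; with Ambari you would stop and start the DataNode from the UI, and users, groups, and script locations vary by install):

# Stop the DataNode (or do it from Ambari); assumes the Hadoop sbin scripts are on the PATH
sudo -u hdfs hadoop-daemon.sh stop datanode
# Copy the existing block data across, preserving ownership and permissions
sudo mkdir -p /home/hdfs/data
sudo cp -a /hadoop/hdfs/data/. /home/hdfs/data/
sudo chown -R hdfs:hadoop /home/hdfs/data
# Point dfs.datanode.data.dir at /home/hdfs/data (Ambari or hdfs-site.xml), then restart
sudo -u hdfs hadoop-daemon.sh start datanode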
Regards,
+ Naga
------------------------------------------------------------------------------
From: Adaryl "Bob" Wakefield, MBA [[email protected]]
Sent: Thursday, November 05, 2015 03:40
To: [email protected]
Subject: Re: hadoop not using whole disk for HDFS
So I can just create a new folder in the home directory, like:
/home/hdfs/data
and then set dfs.datanode.data.dir to:
/hadoop/hdfs/data,/home/hdfs/data
Restart the node and that should do it, correct?
Adaryl "Bob" Wakefield, MBA
Principal
Mass Street Analytics, LLC
913.938.6685
www.linkedin.com/in/bobwakefieldmba
Twitter: @BobLovesData
From: Naganarasimha G R (Naga)
Sent: Wednesday, November 04, 2015 3:59 PM
To: [email protected]
Subject: RE: hadoop not using whole disk for HDFS
Hi Bob,
It seems you have configured the data dir to be somewhere other than a folder
in /home. If so, try creating another folder and adding it to
"dfs.datanode.data.dir", separated by a comma, instead of trying to reset the
default.
It is also advisable not to use the root partition "/" for the HDFS data dir;
if the directory usage hits its maximum, the OS might fail to function
properly.
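For instance (illustrative only; the directory name is up to you, and the final value is applied through Ambari or hdfs-site.xml):

# Create a second storage directory on the large /home partition, owned by the hdfs user
sudo mkdir -p /home/hdfs/data
sudo chown -R hdfs:hadoop /home/hdfs/data
# Then set dfs.datanode.data.dir to the comma-separated list
#   /hadoop/hdfs/data,/home/hdfs/data
# and restart the DataNodes so the new volume is picked up.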
Regards,
+ Naga
------------------------------------------------------------------------------
From: P lva [[email protected]]
Sent: Thursday, November 05, 2015 03:11
To: [email protected]
Subject: Re: hadoop not using whole disk for HDFS
What does your dfs.datanode.data.dir point to?
On Wed, Nov 4, 2015 at 4:14 PM, Adaryl "Bob" Wakefield, MBA
<[email protected]> wrote:
Filesystem               Size  Used Avail Use% Mounted on
/dev/mapper/centos-root   50G   12G   39G  23% /
devtmpfs                  16G     0   16G   0% /dev
tmpfs                     16G     0   16G   0% /dev/shm
tmpfs                     16G  1.4G   15G   9% /run
tmpfs                     16G     0   16G   0% /sys/fs/cgroup
/dev/sda2                494M  123M  372M  25% /boot
/dev/mapper/centos-home  2.7T   33M  2.7T   1% /home
That’s from one datanode. The second one is nearly identical. I discovered
that 50GB is actually a default. That seems really weird. Disk space is cheap.
Why would you not just use most of the disk, and why is it so hard to reset
the default?
Adaryl "Bob" Wakefield, MBA
Principal
Mass Street Analytics, LLC
913.938.6685
www.linkedin.com/in/bobwakefieldmba
Twitter: @BobLovesData
From: Chris Nauroth
Sent: Wednesday, November 04, 2015 12:16 PM
To: [email protected]
Subject: Re: hadoop not using whole disk for HDFS
How are those drives partitioned? Is it possible that the directories
pointed to by the dfs.datanode.data.dir property in hdfs-site.xml reside on
partitions that are sized to only 100 GB? Running commands like df would be a
good way to check this at the OS level, independently of Hadoop.
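For example (a sketch; substitute whatever directories dfs.datanode.data.dir actually points to):

# Reports the size and mount point of the filesystem backing a given data directory
df -h /hadoop/hdfs/data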
--Chris Nauroth
From: MBA <[email protected]>
Reply-To: "[email protected]" <[email protected]>
Date: Tuesday, November 3, 2015 at 11:16 AM
To: "[email protected]" <[email protected]>
Subject: Re: hadoop not using whole disk for HDFS
Yeah. Its current value is 1073741824, which is about 1.07 GB (i.e., 1 GiB).
B.
From: Chris Nauroth
Sent: Tuesday, November 03, 2015 11:57 AM
To: [email protected]
Subject: Re: hadoop not using whole disk for HDFS
Hi Bob,
Does the hdfs-site.xml configuration file contain the property
dfs.datanode.du.reserved? If this is defined, then the DataNode intentionally
will not use this space for storage of replicas.
<property>
  <name>dfs.datanode.du.reserved</name>
  <value>0</value>
  <description>Reserved space in bytes per volume. Always leave this much
  space free for non dfs use.
  </description>
</property>
--Chris Nauroth
From: MBA <[email protected]>
Reply-To: "[email protected]" <[email protected]>
Date: Tuesday, November 3, 2015 at 10:51 AM
To: "[email protected]" <[email protected]>
Subject: hadoop not using whole disk for HDFS
I’ve got the Hortonworks distro running on a three-node cluster. For some
reason the disk available for HDFS is MUCH less than the total disk space. Both
of my data nodes have 3TB hard drives. Only 100GB of that is being used for
HDFS. Is it possible that I have a setting wrong somewhere?
B.