This should probably go to the [email protected] mailing list, since it's an 
HDFS-specific question and not Ambari-related.

http://hadoop.apache.org/docs/current/hadoop-project-dist/hadoop-hdfs/HdfsDesign.html

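That being said, they are two different settings. The namenode name dir 
(dfs.namenode.name.dir) just stores metadata about the filesystem. The 
datanode data dir (dfs.datanode.data.dir) stores the actual HDFS blocks. If 
you have three 100 GB directories, you will have 300 GB of DFS space on that 
node, but by default all blocks are replicated 3 times across datanodes, so 
the cluster holds roughly a third of its raw capacity in unique data. You 
don't want to use LVM or RAID, just raw disks given as a comma-separated list.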
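
As a rough sketch of what that could look like in hdfs-site.xml (or the 
equivalent fields in Ambari), assuming the three mount points from your 
message: the subdirectory names under /data1, /data2 and /data3 are only 
illustrative, so pick whatever layout suits your hosts.

```xml
<!-- Sketch only: the hdfs/data and hdfs/namenode subdirectories are assumed names. -->
<configuration>
  <property>
    <!-- Blocks are spread across all listed directories, so capacity adds up. -->
    <name>dfs.datanode.data.dir</name>
    <value>/data1/hdfs/data,/data2/hdfs/data,/data3/hdfs/data</value>
  </property>
  <property>
    <!-- Each listed directory holds a full copy of the name table (redundancy only). -->
    <name>dfs.namenode.name.dir</name>
    <value>/data1/hdfs/namenode,/data2/hdfs/namenode</value>
  </property>
</configuration>
```

With a layout like this the datanode would report about 300 GB of raw 
capacity, while the extra name dir entries add safety, not space.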

> On Jun 9, 2016, at 6:06 PM, rammohan ganapavarapu <[email protected]> 
> wrote:
> 
> Hi,
> 
> I am trying to understand these two properties if i use multiple disks/mount 
> points, 
> 
> For example i have a server with 3 100gb disk mounted on /data1,/data2,/data3 
> and if i use them for both data.dir and name.dir do i get total ~300gb disk 
> space for the data or i only get 100gb and other two disks are for redundant 
> purpose only?
> 
> This is the description i got from hadoop docs:
> dfs.namenode.name.dir:
> 
> Determines where on the local filesystem the DFS name node should store the 
> name table(fsimage). If this is a comma-delimited list of directories then 
> the name table is replicated in all of the directories, for redundancy.
> 
> dfs.datanode.data.dir:
> 
> Determines where on the local filesystem an DFS data node should store its 
> blocks. If this is a comma-delimited list of directories, then data will be 
> stored in all named directories, typically on different devices. The 
> directories should be tagged with corresponding storage types 
> ([SSD]/[DISK]/[ARCHIVE]/[RAM_DISK]) for HDFS storage policies. The default 
> storage type will be DISK if the directory does not have a storage type 
> tagged explicitly. Directories that do not exist will be created if local 
> filesystem permission allows.
> 
> From the above description i understand only namenode table will get 
> replicated in 3 disks but not sure how it works if i have multiple disks for 
> data dir.
> 
> I wanted to use all available disk (3:300gb) in a server for data, so can i 
> just use comma seperated dir list or should i do raid or lvm to combine those 
> disks?
> 
> Thanks,
> Ram
> 
> 
> 
