Hi,

I am trying to understand how these two properties behave when I use multiple disks/mount points.

For example, I have a server with three 100 GB disks mounted on /data1, /data2 and /data3. If I list all three for both dfs.datanode.data.dir and dfs.namenode.name.dir, do I get a total of ~300 GB of disk space for data, or do I only get 100 GB, with the other two disks used purely for redundancy?

This is the description I got from the Hadoop docs:

dfs.namenode.name.dir: Determines where on the local filesystem the DFS name node should store the name table (fsimage). If this is a comma-delimited list of directories then the name table is replicated in all of the directories, for redundancy.

dfs.datanode.data.dir: Determines where on the local filesystem an DFS data node should store its blocks. If this is a comma-delimited list of directories, then data will be stored in all named directories, typically on different devices. The directories should be tagged with corresponding storage types ([SSD]/[DISK]/[ARCHIVE]/[RAM_DISK]) for HDFS storage policies. The default storage type will be DISK if the directory does not have a storage type tagged explicitly. Directories that do not exist will be created if local filesystem permission allows.

From the above description, I understand that only the namenode name table is replicated across the three disks, but I am not sure how it works when I list multiple disks for the data dir.

I want to use all the available disks in a server (3 disks, ~300 GB in total) for data. Can I just use a comma-separated list of directories, or should I use RAID or LVM to combine the disks?

Thanks,
Ram
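
P.S. To make the question concrete, here is a rough sketch of the values I would end up setting for the two properties. It only builds the comma-delimited strings; the "name" and "data" subdirectories and the helper function are names I made up for illustration, not anything from the docs.

```python
# Mount points from the example above.
MOUNT_POINTS = ["/data1", "/data2", "/data3"]


def comma_list(mounts, subdir):
    """Join one subdirectory per mount point into a comma-delimited list.

    Hypothetical helper; the subdirectory names are assumptions.
    """
    return ",".join(f"{mount}/{subdir}" for mount in mounts)


# Values I would put in hdfs-site.xml for the two properties.
properties = {
    "dfs.namenode.name.dir": comma_list(MOUNT_POINTS, "name"),
    "dfs.datanode.data.dir": comma_list(MOUNT_POINTS, "data"),
}

for key, value in properties.items():
    print(f"{key} = {value}")
```

So each property would get one directory per disk, and my question is what each of those lists means for usable capacity.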
