[ http://issues.apache.org/jira/browse/HADOOP-64?page=comments#action_12426366 ]

Yoram Arnon commented on HADOOP-64:
-----------------------------------

dfs.data.dir is currently used to specify the location of temporary files written by the dfs client (data is written to disk, then an entire dfs block is streamed to the datanodes). Rather than trying to support multiple-volume behaviour there too, let's separate the client config from the datanode config by using 'client.tempdata.dir'. We should try to make the change backwards compatible.

Read-only drives are hard to maintain except by ignoring them entirely, since data cannot be deleted from them. If a file is deleted and its blockid is then reclaimed for another file, bad things might happen if that blockid is still served by some read-only volume. If it's the last copy of a block, *and* the volume is read-only and on its way to being dead, then that block is unfortunately lost.

Round robin is a bit harsh as an allocation scheme. Allocation proportional to free space would work better IMO.

> DataNode should be capable of managing multiple volumes
> -------------------------------------------------------
>
>                 Key: HADOOP-64
>                 URL: http://issues.apache.org/jira/browse/HADOOP-64
>             Project: Hadoop
>          Issue Type: Improvement
>          Components: dfs
>    Affects Versions: 0.2.0
>            Reporter: Sameer Paranjpye
>         Assigned To: Milind Bhandarkar
>            Priority: Minor
>             Fix For: 0.6.0
>
>
> The dfs Datanode can only store data on a single filesystem volume. When a
> node runs its disks JBOD this means running a Datanode per disk on the
> machine. While the scheme works reasonably well on small clusters, on larger
> installations (several 100 nodes) it implies a very large number of Datanodes
> with associated management overhead in the Namenode.
> The Datanode should be enhanced to be able to handle multiple volumes on a
> single machine.
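
To make the "allocation proportional to free space" suggestion from the comment above concrete, here is a rough sketch of one way it could work. It is only an illustration, not the patch attached to this issue: the class name FreeSpaceVolumeChooser, the choose() method, and the rule of skipping volumes that cannot hold a whole block are all assumptions made for the example.

```java
import java.util.Random;

/** Hypothetical helper for illustration only; not part of Hadoop. */
final class FreeSpaceVolumeChooser {
    private final Random random;

    FreeSpaceVolumeChooser(Random random) {
        this.random = random;
    }

    /**
     * Picks a volume index with probability proportional to its free space.
     * Volumes with fewer than blockSize free bytes are never chosen
     * (an assumption of this sketch). Returns -1 if no volume can hold the block.
     * Assumes the sum of free bytes fits in a long.
     */
    int choose(long[] freeBytes, long blockSize) {
        long total = 0;
        int lastEligible = -1;
        for (int i = 0; i < freeBytes.length; i++) {
            if (freeBytes[i] >= blockSize) {
                total += freeBytes[i];
                lastEligible = i;
            }
        }
        if (lastEligible < 0 || total <= 0) {
            return -1;
        }
        // Draw a point in [0, total) and find the volume whose range contains it.
        long point = (long) (random.nextDouble() * total);
        for (int i = 0; i < freeBytes.length; i++) {
            if (freeBytes[i] < blockSize) {
                continue; // too full for this block, skip it
            }
            if (point < freeBytes[i]) {
                return i;
            }
            point -= freeBytes[i];
        }
        // Only reachable through floating-point rounding at the upper edge.
        return lastEligible;
    }
}
```

With free space of 100, 0 and 300 bytes and a 10-byte block, the second volume is never picked and the third is picked roughly three times as often as the first. Round robin, by contrast, would keep sending a third of the blocks to each volume regardless of how full it is.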
