Just specify multiple directories (where the different local partitions are mounted) for dfs.data.dir (HDFS data) in hdfs-site.xml, and for mapred.local.dir (intermediate data) in mapred-site.xml. Data should then be striped across the different partitions/disks.
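For example, the configuration might look like this (the mount points below are illustrative; use whatever directories your partitions are actually mounted on):

```xml
<!-- hdfs-site.xml: one directory per physical disk/partition -->
<property>
  <name>dfs.data.dir</name>
  <value>/mnt/disk1/hdfs/data,/mnt/disk2/hdfs/data,/mnt/disk3/hdfs/data</value>
</property>
```

```xml
<!-- mapred-site.xml: spread intermediate map output the same way -->
<property>
  <name>mapred.local.dir</name>
  <value>/mnt/disk1/mapred/local,/mnt/disk2/mapred/local,/mnt/disk3/mapred/local</value>
</property>
```

Both properties take a comma-separated list of directories.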
See here: http://bit.ly/fbUkr

Dali

On Mon, Sep 14, 2009 at 11:50 AM, Stas Oskin <[email protected]> wrote:
> Hi.
>
> Thanks for the explanation.
>
> Any idea if I can re-use this round-robin mechanism for local disk writing?
> Or is it DFS-only?
>
> Regards.
>
> 2009/9/14 Jason Venner <[email protected]>
>
> > When you have multiple partitions specified for HDFS storage, they are
> > used for block storage in a round-robin fashion.
> > If a partition has insufficient space, it is dropped from the set used
> > for storing new blocks.
> >
> > On Sun, Sep 13, 2009 at 3:01 AM, Stas Oskin <[email protected]> wrote:
> >
> > > Hi.
> > >
> > > When I specify multiple disks for DFS, does Hadoop distribute
> > > concurrent writes over the multiple disks?
> > >
> > > I mean, to prevent over-utilization of a single disk?
> > >
> > > Thanks for any info on the subject.
> >
> > --
> > Pro Hadoop, a book to guide you from beginner to Hadoop mastery,
> > http://www.amazon.com/dp/1430219424?tag=jewlerymall
> > www.prohadoopbook.com a community for Hadoop Professionals

--
Dali Kilani
===========
Phone : (650) 492-5921 (Google Voice)
E-Fax : (775) 552-2982
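The round-robin behavior Jason describes can be sketched roughly as follows. This is a simplified illustration of the idea, not Hadoop's actual DataNode code; the class name and the free-space callback are assumptions made for the example:

```python
class RoundRobinVolumeChooser:
    """Sketch of round-robin block placement across data directories.

    Volumes are tried in rotation; a volume with insufficient free
    space is skipped (dropped from the set used for new blocks)
    rather than failing the write.
    """

    def __init__(self, volumes):
        self.volumes = list(volumes)  # e.g. the dfs.data.dir entries
        self.next_index = 0           # rotation cursor

    def choose_volume(self, block_size, free_bytes):
        # free_bytes: callback mapping a volume path to its available space
        for _ in range(len(self.volumes)):
            vol = self.volumes[self.next_index]
            self.next_index = (self.next_index + 1) % len(self.volumes)
            if free_bytes(vol) >= block_size:
                return vol
        raise IOError("no volume has enough free space for the block")


# Usage: /data1 is nearly full, so writes rotate over /data2 and /data3.
free = {"/data1": 10, "/data2": 100, "/data3": 100}
chooser = RoundRobinVolumeChooser(["/data1", "/data2", "/data3"])
print(chooser.choose_volume(50, lambda v: free[v]))  # /data2
print(chooser.choose_volume(50, lambda v: free[v]))  # /data3
```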
