I usually set mapred.local.dir to share disk space with DFS, since some MapReduce jobs need a lot of temporary space.
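
As a rough sketch of what I mean, assuming the two DFS drives are mounted at /data1 and /data2 (hypothetical mount points, adjust to your layout), the relevant site configuration properties might look something like this:

```xml
<!-- Sketch only: /data1 and /data2 are assumed mount points for the two DFS drives. -->
<configuration>
  <!-- DFS block storage, one directory per drive -->
  <property>
    <name>dfs.data.dir</name>
    <value>/data1/dfs/data,/data2/dfs/data</value>
  </property>
  <!-- Task tracker temporary files on the same drives, spread across both -->
  <property>
    <name>mapred.local.dir</name>
    <value>/data1/mapred/local,/data2/mapred/local</value>
  </property>
  <!-- Bytes per volume that DFS leaves free for non-DFS use (10 GB here, an arbitrary example value) -->
  <property>
    <name>dfs.datanode.du.reserved</name>
    <value>10737418240</value>
  </property>
</configuration>
```

Note that dfs.datanode.du.reserved only stops DFS from filling the space set aside for other uses. It does not cap how much a job writes under mapred.local.dir, so a run-away job can still use up space that DFS would otherwise have.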

On Fri, Apr 3, 2009 at 8:36 PM, Craig Macdonald <[email protected]> wrote:
> Hello all,
>
> Following recent hardware discussions, I thought I'd ask a related
> question. Our cluster nodes have 3 drives: 1x 160GB system/scratch and 2x
> 500GB DFS drives.
>
> The 160GB system drive is partitioned such that 100GB is for job
> mapred.local space. However, we find that for our application, mapred.local
> free space for map output space is the limiting parameter on the number of
> reducers we can have (our application prefers less reducers).
>
> How do people normally work for dfs vs mapred.local space. Do you (a) share
> the DFS drives with the task tracker temporary files, Or do you (b) keep
> them on separate partitions or drives?
>
> We originally went with (b) because it prevented a run-away job from eating
> all the DFS space on the machine, however, I'm beginning to realise the
> disadvantages.
>
> Any comments?
>
> Thanks
>
> Craig
