Re: Treating large numbers of slaves with scheduled downtime

2006-07-24 Thread Doug Cutting
The easiest way would be to use only your reliable machines as datanodes. Alternatively, for better performance, you could run two DFS systems, one on all machines and one on just the reliable machines, and back one up to the other before you shut down the unreliable nodes each

Re: Task type priorities during scheduling ?

2006-07-24 Thread Paul Sutter
Doug, it doesn't matter how the code is structured. What does matter is that the reduce phase and shuffle phase have very different timelines and resource requirements, and should not both be charged against the number of reduce tasks permitted. It should be possible to have lots of tasks in the