John Clarke wrote:
> Hi,
> I am working on a project that is suited to Hadoop and so want to create a
> small cluster (only 5 machines!) on our servers. The servers are however
> used during the day and (mostly) idle at night.
> So, I want Hadoop to run at full throttle at night and either scale back or
> suspend itself during certain times.

You could add and remove task trackers on idle systems, but:
* you don't want to take away datanodes, as there's a risk that data
will become unavailable.
* there's nothing in the scheduler to warn that machines will go away at
a certain time.
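
The task-tracker-only approach could be sketched roughly as below. This assumes the `hadoop-daemon.sh` helper from Hadoop's bin/ directory (as shipped in the 0.x/1.x releases) is on each node's PATH; the function names and the dry-run flag are made up for illustration. The sketch only ever touches the tasktracker, so the datanode keeps serving HDFS blocks.

```python
import subprocess

# Assumption: hadoop-daemon.sh is reachable under this name on the node.
HADOOP_DAEMON = "hadoop-daemon.sh"


def tasktracker_command(action):
    """Build the command that starts or stops only the tasktracker.

    The datanode is deliberately never touched, so HDFS data stays available.
    """
    if action not in ("start", "stop"):
        raise ValueError("action must be 'start' or 'stop'")
    return [HADOOP_DAEMON, action, "tasktracker"]


def toggle_tasktracker(action, dry_run=True):
    """Run the command, or only return it when dry_run is set."""
    cmd = tasktracker_command(action)
    if not dry_run:
        subprocess.check_call(cmd)
    return cmd
```

Something like `toggle_tasktracker("stop", dry_run=False)` could be run from cron on each machine before the working day starts. Tasks still running on that tracker would be lost and have to be rerun elsewhere, since the scheduler gets no advance notice.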
If you only want to run the cluster at night, I'd just configure the
entire cluster to go up and down on a schedule.
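
A rough sketch of that whole-cluster schedule follows. The 20:00 to 06:00 window is an assumption (no hours were given), and the helper names are hypothetical. `start-all.sh` and `stop-all.sh` are the control scripts in Hadoop's bin/ directory; the code only picks which one applies at a given time of day.

```python
from datetime import time

# Assumed night window; adjust to when the servers are actually idle.
NIGHT_START = time(20, 0)
NIGHT_END = time(6, 0)


def cluster_should_run(now, start=NIGHT_START, end=NIGHT_END):
    """Return True if `now` (a datetime.time) falls inside the window.

    Handles windows that wrap past midnight (start later than end).
    """
    if start <= end:
        return start <= now < end
    return now >= start or now < end


def cluster_script(now):
    """Pick the Hadoop control script to run at this time of day."""
    return "start-all.sh" if cluster_should_run(now) else "stop-all.sh"
```

In practice two cron entries on the master (one calling `start-all.sh` in the evening, one calling `stop-all.sh` in the morning) would do the same job. A function like this is mainly useful if a single periodic check is preferred.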