I'm looking for feedback/ideas/suggestions/warnings on the following subject:
We really need a mechanism for foreground and background jobs. We have jobs that take 12 to 48 hours to complete, and we also have several developers who need to run and test reasonably sized jobs throughout the day.

Here's what we're considering. We're thinking of having two accounts, blue and yellow, each with a complete copy of Hadoop, and each able to start its own jobtracker on a different port. Each would run tasktrackers on all nodes (again on different ports) and operate independently, but they would share one instance of DFS, because that's where all of our data is (maybe we'd run DFS out of a third account, let's call it dfs). The only difference is that tasks run under yellow would be nice'd (hence the names: Ganglia shows nice CPU as yellow and normal CPU as blue).

This would let us make code changes anywhere in Hadoop, or in our own code, and stop and restart either cluster independently, without blue interfering with yellow or vice versa. We have enough RAM on the cluster to make this work.

Any problems with this? Any better ideas?

Thanks!
Paul Sutter
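
P.S. For concreteness, here's roughly the kind of hadoop-site.xml we're picturing for the yellow account. The hostnames, ports, and paths are just illustrative, and the property names may not match every version; blue would get its own ports and local directories, and whatever tasktracker/web-UI port settings your version exposes would also need to differ between the two accounts:

  <configuration>
    <!-- shared DFS, run out of the dfs account -->
    <property>
      <name>fs.default.name</name>
      <value>namenode-host:9000</value>
    </property>

    <!-- yellow's own jobtracker on its own port -->
    <property>
      <name>mapred.job.tracker</name>
      <value>jobtracker-host:9011</value>
    </property>

    <!-- keep yellow's local scratch space separate from blue's -->
    <property>
      <name>mapred.local.dir</name>
      <value>/data/yellow/mapred/local</value>
    </property>

    <!-- separate control-file area inside the shared DFS -->
    <property>
      <name>mapred.system.dir</name>
      <value>/yellow/mapred/system</value>
    </property>
  </configuration>

The yellow tasktrackers would simply be started under nice so the child task JVMs inherit the niceness, something like:

  nice -n 19 bin/hadoop-daemon.sh start tasktracker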
