On Sep 15, 2011, at 2:26 AM, Steve Loughran wrote:

> These are all good ideas. The other trick -which has been discussed recently
> in the context of the Platform Scheduler- is to run HDFS across all nodes,
> but switch the workload of the cluster between Hadoop jobs (MR, Graph,
> Hamster), and other work (Grid jobs). That way the filesystem is just a very
> large FS for anything. If some grid jobs don't use the HDFS, the nodes can
> still serve up their data.

Or, one can port other work to run on MR2 ;)

Arun