On 6/19/09 3:49 AM, "Harish Mallipeddi" <harish.mallipe...@gmail.com> wrote:
> Why do you want to do this in the first place? It seems like you want
> cluster1 to be a plain HDFS cluster and cluster2 to be a mapred cluster.
> Doing something like that will be disastrous - Hadoop is all about sending
> computation closer to your data. If you don't want that, you need not even
> use hadoop.

Given some of the limitations with HDFS (quota operability, security), I can easily see why it would be desirable to have static data coming from one grid while sending computation, intermediate outputs, and real output to another. Using performance as your sole metric of viability is a bigger disaster waiting to happen. "Sure, we crashed the file system, but look how fast it went down in flames!"