Hi,

Is there a way to partition HDFS (replication factor, say 3) or route requests to specific RegionServer (RS) nodes so that one set of nodes serves operations like put and get, while another set of nodes runs MapReduce over the same replicated data set, and the two sets do not share any nodes?

In other words, if we are replicating and are not worried about keeping every replica equally consistent, can we assign different jobs to different replicas based on each replica's consistency tuning? I understand that HDFS interleaves replicated blocks across nodes, so we don't have cleanly isolated, cookie-cutter replicas; that makes the question more interesting. :)

An underlying question is how a particular node, out of the 3 nodes holding a block's replicas, gets chosen for a specific request (put/get) or for an MR job.

Thanks,
Abhishek
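As context for that last question: HDFS's default block placement policy is rack-aware. A simplified sketch of how the 3 replica locations are typically chosen on write (the rack/node names and the `choose_replica_nodes` helper are hypothetical illustrations, not the actual Hadoop API):

```python
import random

def choose_replica_nodes(racks, writer_node, replication=3):
    """Sketch of HDFS's default rack-aware placement for 3 replicas:
    - 1st replica on the writer's own node (if it is a DataNode),
    - 2nd replica on a node in a different rack,
    - 3rd replica on a different node in the same rack as the 2nd.
    `racks` maps rack name -> list of node names."""
    node_rack = {n: r for r, nodes in racks.items() for n in nodes}
    first = writer_node
    # Pick a rack other than the writer's rack for the 2nd replica.
    remote_racks = [r for r in racks if r != node_rack[first]]
    second_rack = random.choice(remote_racks)
    second = random.choice(racks[second_rack])
    # 3rd replica: same remote rack, different node.
    third = random.choice([n for n in racks[second_rack] if n != second])
    return [first, second, third]
```

For reads, the NameNode returns the replica locations sorted by network distance from the client, so a read (or an MR task, via locality-aware scheduling) normally goes to the nearest replica rather than to a dedicated subset of nodes.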
