On Mon, Jun 18, 2012 at 11:34 AM, IGZ Nick <[email protected]> wrote:
> Hi Jean,
>
> Thank you for your reply. So RS is a completely different entity when
> compared to the datanode?

Totally.

> How does RS serve the data?

That's HBase 101. I recommend you read the guide
http://hbase.apache.org/book/book.html or the book
http://ofps.oreilly.com/titles/9781449396107/ or the Bigtable paper.

> I can view the
> region directories in HDFS. So the same region must be on 3 datanodes,
> right?

Yep.

> Then which regionserver gets to serve that region?

HBase 101, but in short the master decides that.

> Is it a
> completely random regionserver?

The master uses a few heuristics.

> And if I ask that region server for all
> keys from that region, will it have to come from the same HDFS datanode?

It depends on whether the data is local. If it is, it will be served
locally; otherwise it will be fetched from another datanode. It doesn't
really matter to the region server, since the HDFS client handles it
transparently.

> As
> far as I understand, in HDFS, if I stream a file, then I get the data from
> a single datanode (the one closest to the client, usually). So, in HBase, I
> ask for all keys in region reg1, then I get all the keys from the datanode
> that is closest to the client?

Yep

J-D
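
To illustrate the local-versus-remote read described above, here is a toy Python sketch. It is not HBase or HDFS code: the function `pick_replica`, the host names, and the block IDs are all hypothetical, and the fallback rule (first remote host in sorted order) is a stand-in for whatever ordering the real HDFS client applies.

```python
# Toy model only: these names are hypothetical, not HBase or HDFS APIs.

def pick_replica(reader_host, replica_hosts):
    """Return (host, is_local) for the replica a reader on reader_host would use."""
    if not replica_hosts:
        raise ValueError("block has no replicas")
    if reader_host in replica_hosts:
        # A replica lives on the reader's own host: read it locally.
        return reader_host, True
    # No local replica: fall back to a remote one. Sorted order is an
    # assumption made for determinism, not the real HDFS selection rule.
    return sorted(replica_hosts)[0], False


# Hypothetical layout: each block of a region's file is stored on 3 datanodes.
block_replicas = {
    "blk_1": {"dn1", "dn2", "dn3"},
    "blk_2": {"dn2", "dn3", "dn4"},
}

# The region server (acting as the HDFS client) runs on host dn1.
reads = {blk: pick_replica("dn1", hosts) for blk, hosts in block_replicas.items()}
```

In this sketch the region server on `dn1` reads `blk_1` locally and `blk_2` from a remote datanode, and the caller never has to distinguish the two cases.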
