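A region is served by only one region server at a time. HDFS replication applies to the blocks of the files underneath, not to regions, so a replication factor of 3 does not mean 3 region servers host reg1. And since HBase goes through the HDFS client, it has no view of the block layout. HBase currently doesn't even know about replication: it asks to read a file and gets data back from somewhere, and that somewhere is determined by HDFS.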
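To make that concrete, here is a rough sketch in plain Python (not the HBase client API) of what the lookup amounts to. The region table, the server names rs1 to rs3, and the function regions_for_scan are all made up for illustration, using the key ranges from your example. The point is only that each region maps to one server, so each part of the scan has exactly one place to go.

```python
from bisect import bisect_right

# Hypothetical snapshot of what a .META. lookup would return:
# (region start key, region name, the ONE region server hosting it),
# sorted by start key. Names are illustrative only.
REGIONS = [
    ("A", "reg1", "rs1"),
    ("A1", "reg2", "rs2"),
    ("A2", "reg3", "rs3"),
]


def regions_for_scan(start_row, stop_row):
    """Return the (region, server) pairs a scan of [start_row, stop_row) visits, in order."""
    starts = [start for start, _, _ in REGIONS]
    # The region containing start_row is the last one whose start key <= start_row.
    first = max(bisect_right(starts, start_row) - 1, 0)
    plan = []
    for start, name, server in REGIONS[first:]:
        if start >= stop_row:
            break  # this region begins at or after the end of the scan
        plan.append((name, server))
    return plan


if __name__ == "__main__":
    for name, server in regions_for_scan("A", "B"):
        print(name, "is read from", server)
```

A scan confined to a single region therefore talks to a single region server for all of its rows. Which datanode the bytes actually come from is decided below that, inside HDFS.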
Hope this helps,

J-D

On Mon, Jun 18, 2012 at 11:16 AM, IGZ Nick <[email protected]> wrote:
> Hi folks,
>
> Here is how I understand the scan flow (A regular sequential scan from key
> A to key B):
> - Zookeeper is contacted for the RegionServer that has the -ROOT- regions.
> - The -ROOT- RS is contacted and it gets you the RS for .META.
> - The .META. is contacted, and it will give you all regions for keys from A
> to B - e.g, A to A1 resides in reg1, A1 to A2 in reg2, A2 to B in reg3.
>
> Now if HDFS replication is set to 3, there must be 3 RS which will have
> reg1, and likewise for reg2 and reg3. So how does the client figure out
> which RS to go to? Or am I completely wrong here?
> As a follow up, if reg3 is present in RS1, RS2 and RS3, then does the
> client get all the data from A1 to A2 from a single RS or is there some
> sort of splitting like A1 to A11 can come from RS1, A11 to A12 from RS2 and
> A12 to A2 from RS3. That would be faster, right? Put another way, if my
> scan consists of only one region, which is hosted on three RegionServers,
> does the data come in from all 3 RS's or just one of them?
>
> Thanks a lot,
> Nick
