Thanks Jean-Daniel. I did go through the documentation, but I found no clear
answer on how puts interleaved across two or more row keys are stored, or
whether there is a way to reserve contiguous blocks per row key. I made some
deductions, but clearly I was incorrect in some of them, as you pointed out.
The questions were partly for validation and partly to clear up doubts. :)

Thanks
Abhishek 

Sent from my iPad with iMstakes

On Aug 23, 2012, at 17:19, "Jean-Daniel Cryans" <[email protected]> wrote:

> Inline. In general I'd recommend you read the documentation more
> closely and/or get the book.
> 
> J-D
> 
> On Thu, Aug 23, 2012 at 4:21 PM, Pamecha, Abhishek <[email protected]> wrote:
>> 1. Can there be multiple row keys per block, and per HFile? Or is
>> a block or HFile dedicated to a single row key?
> 
> Multiple row keys per HFile block. Read
> http://hbase.apache.org/book.html#hfilev2
> 
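
To illustrate "multiple row keys per HFile block", here is a toy sketch in
Python. It is not HBase code: pack_into_blocks, the cell layout, and the tiny
64-byte block size are all hypothetical, chosen only to show that sorted cells
are cut into blocks by size, so a narrow row can share a block with the start
of a wide row.

```python
def pack_into_blocks(cells, block_size):
    """Group sorted (row, qualifier, value) cells into blocks of about block_size bytes.

    Toy model only: a block is closed once its accumulated size reaches
    block_size, regardless of which row the cells belong to.
    """
    blocks, current, current_bytes = [], [], 0
    for row, qualifier, value in sorted(cells):
        current.append((row, qualifier, value))
        current_bytes += len(row) + len(qualifier) + len(value)
        if current_bytes >= block_size:
            blocks.append(current)
            current, current_bytes = [], 0
    if current:
        blocks.append(current)
    return blocks


# One narrow row (rowN, 1 cell) and one wide row (rowW, 20 cells).
cells = [(b"rowN", b"c1", b"x" * 10)]
cells += [(b"rowW", b"c%03d" % i, b"x" * 10) for i in range(20)]

blocks = pack_into_blocks(cells, block_size=64)
rows_per_block = [sorted({row for row, _, _ in block}) for block in blocks]
```

In this sketch the first block holds cells from both rowN and rowW, and the
remaining blocks hold only rowW cells.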
>> I have a scenario where, for the same column family, some row keys will
>> have very wide rows, say row key W, and some will have very narrow rows,
>> say row key N. In my case, puts for row keys W and N are interleaved with a
>> ratio of, say, 90 puts for W to 10 puts for N. On the get side, my app
>> fetches data for a single row key at a time.
>> Does that mean the entries for row key N will be scattered across regions
>> on that same region server, given the interleaved puts? Or is there a
>> way I can enforce contiguous writes to a region/HFile reserved for row key
>> N? That way, I can leverage the block cache and have all or most of
>> row key N fit in there for that session.
> 
> The row keys are sorted according to their lexicographical order. See
> http://hbase.apache.org/book.html#row
> 
> If you don't want the big rows coexisting with the small rows, put
> them in different column families or different tables.
> 
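
A minimal Python sketch of the sorting point, under stated assumptions: the
interleaved_puts helper and the (row, qualifier) tuples are hypothetical
stand-ins for real puts, and plain sorted() stands in for the ordering HBase
applies within each store file. It shows that the arrival order of the puts
does not decide where a row's cells end up.

```python
def interleaved_puts(rounds):
    """Yield (row, qualifier) puts: 9 for row W, then 1 for row N, per round."""
    seq = 0
    for _ in range(rounds):
        for _ in range(9):
            yield (b"W", b"q%06d" % seq)
            seq += 1
        yield (b"N", b"q%06d" % seq)
        seq += 1


arrival_order = list(interleaved_puts(rounds=3))

# Tuples of bytes compare lexicographically, row key first.
stored_order = sorted(arrival_order)

# Positions of row N's cells before and after sorting.
n_arrival = [i for i, (row, _) in enumerate(arrival_order) if row == b"N"]
n_stored = [i for i, (row, _) in enumerate(stored_order) if row == b"N"]
```

Row N's cells arrive at scattered positions but sit next to each other once
sorted. Note that this holds within one store file; cells written in separate
flushes stay in separate files until a compaction merges them.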
>> 2.       Is there a limit on number of HFiles that can exist per region?
> 
> I think a misunderstanding of how HFiles work prompted this question,
> and my previous answers probably mean you no longer need this one,
> but here it is just in case:
> 
> The HFiles are compacted when their number reaches
> hbase.hstore.compactionThreshold (default of 3) per family, and you
> can have no more than hbase.hstore.blockingStoreFiles (default of 7).
> 
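
The two settings can be pictured with a toy Python model. The store_state
function is hypothetical, and the exact boundary comparisons (>= versus >) are
an assumption made for illustration; only the property names and the defaults
of 3 and 7 come from the reply above.

```python
COMPACTION_THRESHOLD = 3  # hbase.hstore.compactionThreshold, default quoted above
BLOCKING_STORE_FILES = 7  # hbase.hstore.blockingStoreFiles, default quoted above


def store_state(num_store_files):
    """Classify one column family's store by its HFile count (toy model)."""
    if num_store_files > BLOCKING_STORE_FILES:
        # Assumed boundary: above the limit, updates wait for a compaction.
        return "updates blocked"
    if num_store_files >= COMPACTION_THRESHOLD:
        # Assumed boundary: at the threshold, a compaction becomes eligible.
        return "compaction eligible"
    return "ok"


states = {n: store_state(n) for n in (2, 3, 7, 8)}
```

The count is tracked per column family, so a table with two families has two
independent counts per region.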
>> Basically, on what criteria does a row key's data get split across two
>> regions [on the same region server]? I am assuming there can be many
>> regions per region server, and that multiple regions for the same table
>> can belong to the same region server.
> 
> A row key only lives in a single region since the regions are split
> based on row keys.
> 
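
As a rough sketch of why a row key maps to exactly one region: each region
covers a contiguous range of row keys, so finding the region is a search over
sorted start keys. The region_for helper and the start keys below are
hypothetical, not HBase's client code.

```python
import bisect


def region_for(row_key, region_start_keys):
    """Return the index of the region whose [start, next_start) range holds row_key.

    region_start_keys must be sorted and begin with b"" (the first region
    has an empty start key).
    """
    return bisect.bisect_right(region_start_keys, row_key) - 1


# Three hypothetical regions: [b"", b"g"), [b"g", b"p"), [b"p", end of table).
starts = [b"", b"g", b"p"]

lookups = {key: region_for(key, starts) for key in (b"apple", b"g", b"zebra")}
```

Every cell of a given row shares the same row key, so all of them resolve to
the same region no matter how wide the row is.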
>> 3. Also, is there a limit on the number of blocks that are created per
>> HFile?
> 
> No.
> 
>> What determines whether a split is required?
> 
> hbase.hregion.max.filesize, also see
> http://hbase.apache.org/book.html#disable.splitting if you want to
> change that.
