> If you could answer any of the
> following questions, I would be very grateful.

People usually give me beer in exchange for quick help. Let me know if
that works for you ;)

>
> 1. For cell size, why should it not be larger than 20 MB in general?

General answer: cells that big push HBase into all of its corner cases.
You have to change a lot of default configs to keep any sort of
efficiency.

>
> 2. What is the block size if the cell is 20 MB? Can a cell cover multiple
> blocks?

No, one HFile block per cell (KeyValue) in this case: a cell is never
split across blocks, so a cell that big gets a block to itself. It
basically gives you a perfect index.
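
To make that concrete, here is a toy model in plain Python (not HBase
code) of a writer that treats the block size as a soft limit: a block is
closed once it has reached the target size, and a cell is never split.
The 64 KB figure is HBase's default block size; pack_blocks, block_index
and the sample keys are made up for the illustration.

```python
BLOCK_SIZE = 64 * 1024  # HBase's default HFile block size, in bytes


def pack_blocks(cells, block_size=BLOCK_SIZE):
    """Group (key, value_size) pairs, already sorted by key, into blocks.

    The block size is a soft limit: a block is closed once it has reached
    the limit, and a cell is never split across two blocks.
    """
    blocks, current, current_size = [], [], 0
    for key, size in cells:
        if current and current_size >= block_size:
            blocks.append(current)
            current, current_size = [], 0
        current.append(key)
        current_size += size
    if current:
        blocks.append(current)
    return blocks


def block_index(blocks):
    """One index entry per block: the first key stored in that block."""
    return [block[0] for block in blocks]


# 200 cells of 1 KB each: many cells share a block.
small = [(f"row{i:03d}", 1024) for i in range(200)]
# 5 cells of 20 MB each: every cell ends up alone in its block.
large = [(f"row{i:03d}", 20 * 1024 * 1024) for i in range(5)]

small_blocks = pack_blocks(small)
large_blocks = pack_blocks(large)
```

With the small cells the index only has an entry for every 64th key.
With the 20 MB cells the index lists every key, which is what I mean by
a perfect index.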

>
> 3. For a single-cell column family (it has only one cell), does it share
> the same size limit as a cell? In other words, should a single-cell
> column family be smaller than 20 MB?

It's the same to me.

>
> 4. Is there any advantage to putting rows close together in HBase, if
> these rows have a high chance of being queried together?

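If you do Scans, then you want your rows together, right? Rows are
stored sorted by row key, so rows with adjacent keys are read
sequentially by a single Scan.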
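
A small sketch of why key layout matters for Scans, in plain Python
rather than the HBase client API. It assumes only that rows are kept
sorted by row key and that a scan returns the keys from a start row
(inclusive) up to a stop row (exclusive). The scan function and the
"<user>#<sequence>" key layout are hypothetical.

```python
from bisect import bisect_left


def scan(sorted_keys, start_row, stop_row):
    """Return the keys in [start_row, stop_row) from a sorted key list."""
    lo = bisect_left(sorted_keys, start_row)
    hi = bisect_left(sorted_keys, stop_row)
    return sorted_keys[lo:hi]


events = [("alice", 3), ("bob", 1), ("alice", 1), ("carol", 2), ("alice", 2)]

# Layout A: "<user>#<sequence>", so one user's rows sort next to each other.
by_user = sorted(f"{user}#{seq:04d}".encode() for user, seq in events)

# Layout B: "<sequence>#<user>", so one user's rows are spread across the table.
by_seq = sorted(f"{seq:04d}#{user}".encode() for user, seq in events)

# '$' is the byte right after '#', so this range covers every "alice#..." key.
alice_rows = scan(by_user, b"alice#", b"alice$")

# Positions of alice's rows under layout B, to show they are not contiguous.
alice_positions = [i for i, key in enumerate(by_seq) if key.endswith(b"#alice")]
```

With layout A one short range scan returns all of alice's rows. With
layout B you would have to read past other users' rows, or do several
separate lookups, to collect them.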

>
> 5. Any general rule for row size?

Try not to go into the MBs. HBase is currently missing some
optimizations that would make this use case work perfectly.

>
> 6. Where does the HRegion keep the row keys: in the HFile or in other files?

In the block index of the HFile. Not all the row keys are there if a
single block fits more than one row.
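
Roughly, a lookup through such an index goes like this. This is a toy
model in plain Python that ignores the details of the real file format;
the blocks, find_block and contains are invented for the example.

```python
from bisect import bisect_right

# Toy file: three blocks, each holding a sorted run of row keys.
blocks = [
    ["apple", "banana", "cherry"],
    ["grape", "kiwi"],
    ["mango", "peach", "plum"],
]

# The index keeps only the first key of each block, not every row key.
index = [block[0] for block in blocks]


def find_block(index, row):
    """Return the position of the only block that could hold `row`, or None."""
    pos = bisect_right(index, row) - 1
    return pos if pos >= 0 else None


def contains(blocks, index, row):
    """Use the index to pick one block, then look inside that block only."""
    pos = find_block(index, row)
    return pos is not None and row in blocks[pos]
```

"banana" is not in the index, but the index still points at the one
block that can contain it, so only that block has to be read.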

J-D
