> If you could answer any of these
> following questions, I would be greatly grateful for that.

People usually give me beer in exchange for quick help, let me know if
that works for you ;)

> 1. For cell size, why should it not be larger than 20 MB in general?

General answer: it pokes HBase in all the corner cases. You have to
change a lot of default configs in order to keep some sort of
efficiency.

> 2. What is the block size if the cell is 20 MB? Can a cell cover multiple
> blocks?

No, one HFile block per cell (KeyValue) in this case. It basically gives
you a perfect index.

> 3. For a single-cell column family (it has only one cell), does it share
> the same size limit as a cell? In other words, should a single-cell
> column family be smaller than 20 MB?

It's the same to me.

> 4. Is there any advantage to putting rows close together in HBase, if
> these rows have a high chance of being queried together?

If you do Scans, then you want your rows together, right?

> 5. Any general rule for row size?

Try not to go into the MBs. HBase is currently missing some
optimizations that would make this use case work perfectly.

> 6. Where does the HRegion host the row keys, in the HFile or in other
> files?

In the block index in the HFile. Not all the row keys are there if a
single block fits more than one row.

J-D
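
P.S. A rough illustration of the answers to questions 2 and 6. This is a simplified Python model, not HBase's actual writer code. The function name `build_block_index` and the rule "start a new block once the current one has reached the block size" are assumptions made for the sketch; the 64 KB figure is the default HFile block size.

```python
# Simplified model of an HFile block index (illustrative only, not HBase code).
# Assumption: a block is closed once its accumulated size has reached
# block_size, and the index records only the first row key of each block.

DEFAULT_BLOCK_SIZE = 64 * 1024  # default HFile block size, in bytes


def build_block_index(cells, block_size=DEFAULT_BLOCK_SIZE):
    """Return the first row key of every block.

    cells: iterable of (row_key, size_in_bytes), already sorted by row key.
    """
    index = []
    current_size = 0
    block_open = False
    for row_key, size in cells:
        # Open a new block for the first cell, or when the current block is full.
        if not block_open or current_size >= block_size:
            index.append(row_key)
            current_size = 0
            block_open = True
        current_size += size
    return index


# 200 rows of 1 KB each: many rows share a block, so most keys are not indexed.
small = [("row%04d" % i, 1024) for i in range(200)]
small_index = build_block_index(small)

# 5 rows of 20 MB each: every cell fills its own block, so every key is indexed.
large = [("big%d" % i, 20 * 1024 * 1024) for i in range(5)]
large_index = build_block_index(large)

print(len(small_index), len(large_index))
```

With small cells only a fraction of the row keys end up in the index, while with 20 MB cells there is one index entry per cell, which is the "perfect index" mentioned above.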
