Hi,

Chapter 9 of "HBase: The Definitive Guide" discusses tall-narrow vs.
flat-wide tables. (http://ofps.oreilly.com/titles/9781449396107/advanced.html)

It seems to propose that tall-narrow tables (more rows, fewer columns) are the
better design.  One of the issues it raises with flat-wide tables (fewer rows,
more columns) is:
...
In addition, HBase can only split at row boundaries, which also enforces the 
recommendation to go with tall-narrow tables. Imagine you have all emails of a 
user in a single row. This will work for the majority of users, but there will 
be outliers that will have magnitudes of emails more in their inbox. So much so 
that a single row could outgrow the maximum file/region size and work against 
the region split facility.
...

So my question is: is it a bad idea to have a table like the one in the example
above, where emails are stored by adding columns to a single row?  I have a
similar table in my application, with a region size of 1GB and cell values of
about 10KB.  Will I run into the region-split issue mentioned above once a
single row reaches roughly 100,000 columns (1GB / 10KB = 100,000)?
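
To make my arithmetic concrete, here is a rough sketch of the estimate, together with what I understand a tall-narrow alternative to look like. It counts cell values only and ignores per-cell key overhead, compression and multiple column families, so the real limit may differ. The function names and the "userId-messageId" row key layout are my own assumptions for illustration, not something taken from the book or from the HBase API.

```python
# Back-of-the-envelope estimate only: counts cell values and ignores
# per-cell key overhead, compression, and multiple column families.
REGION_SIZE_BYTES = 1 * 1024**3  # 1GB region size (my setup)
CELL_VALUE_BYTES = 10 * 1024     # 10KB per cell value (my setup)


def cells_per_region(region_bytes: int, cell_bytes: int) -> int:
    """Number of cell values that fit in one region (values only)."""
    return region_bytes // cell_bytes


def tall_narrow_key(user_id: str, message_id: str) -> str:
    """Hypothetical composite row key: one row per email rather than
    one column per email, so a large inbox spans many rows."""
    return f"{user_id}-{message_id}"


if __name__ == "__main__":
    # With binary units this comes out slightly above my 100,000 figure.
    print(cells_per_region(REGION_SIZE_BYTES, CELL_VALUE_BYTES))
    print(tall_narrow_key("user123", "msg000042"))
```

With a key like this, all emails of one user would still sort together, but they would be spread over many rows, which (if I read the book correctly) is what lets a region split between them.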

Regards,
Srikanth
