Hello HBase users,
I'm trying to decide what database schema is better, more precisely how
to partition my data.
Is it better to have few tables with a lot of keys
OR
a lot of tables with fewer keys?
For instance, suppose I want to store articles written by users.
Say we have U users (U=100000) and each user 'u' writes A_u articles (on
average 10000 articles per user).
Is it better to create a table per user
e.g. table "articles_for_user_1"
and in that table a column family 'article'
with keys based on the date of the article: YYYYMMDDHHmmss
or one large table called "articles"
with one column family 'article'
and keys based on user ID + date: user1_YYYYMMDDHHmmss
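
To make the single-table option concrete, here is a rough sketch of how I
imagine the composite keys would be built. It is plain Python with no HBase
client involved. The function names are made up for illustration, and I am
assuming a 14-digit timestamp (YYYYMMDDHHmmss) and that rows are kept sorted
by key, so one user's articles would form a contiguous key range.

```python
from datetime import datetime

def article_row_key(user_id, written_at):
    """Composite key: user ID, underscore, then a 14-digit timestamp."""
    return f"{user_id}_{written_at.strftime('%Y%m%d%H%M%S')}"

def user_scan_range(user_id):
    """Start (inclusive) and stop (exclusive) keys covering one user's rows."""
    prefix = f"{user_id}_"
    # Bumping the last character of the prefix gives the first key past it.
    stop = prefix[:-1] + chr(ord(prefix[-1]) + 1)
    return prefix, stop

# Hypothetical sample data, sorted the way I assume the table would store it.
keys = sorted([
    article_row_key("user2", datetime(2010, 1, 5, 9, 30, 0)),
    article_row_key("user1", datetime(2010, 3, 1, 12, 0, 0)),
    article_row_key("user1", datetime(2009, 12, 31, 23, 59, 59)),
])

start, stop = user_scan_range("user1")
user1_keys = [k for k in keys if start <= k < stop]
```

Keeping the underscore in the prefix is deliberate: without it, a range for
"user1" would also pick up "user10", "user11" and so on.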
If I make one table per user, do I risk hitting the nodes'
memory limits if the number of users grows?
If I just have one big table, will search by key be too slow?
I think this question reveals my lack of knowledge of how HBase
stores the data...
Thanks
Alex