Beyond the physical limitations (cost constraints), there's also a logical 
limitation in terms of design. 

I just gave a talk at CHUG on schema design, and the key point was 
understanding how and why one should use column families. 

From a logical design perspective, you want to limit the data within a CF to 
data that you grab all at once. That is, when you do your scan / get, you 
want to minimize the number of column families that you have to hit. 

So you need to think about how you approach organizing your data. 

The best example of this is an order entry system where the column 
families are broken out into Order Entry, Pick Slips, Shipping and Invoices. 

While they all use the same key (customer number | order number), the data for 
each stage, from order entry through fulfillment, is accessed separately. 

So even in this example, you have 4 column families in use for this one table. 
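
To make the access pattern concrete, here is a rough sketch of that table as a 
toy in-memory model in Python. It is not HBase client code: ToyTable, row_key, 
the family names and the sample values are all made up for illustration. The 
only thing it models is that each column family is kept in its own store, so a 
get that names one family only has to touch that one store.

```python
from collections import defaultdict

# Hypothetical family names, taken from the order entry example above.
FAMILIES = ("order_entry", "pick_slips", "shipping", "invoices")


def row_key(customer_number: str, order_number: str) -> str:
    """Composite key shared by all families: customer number | order number."""
    return f"{customer_number}|{order_number}"


class ToyTable:
    """In-memory stand-in for a table with one store per column family."""

    def __init__(self, families):
        # family -> row key -> {qualifier: value}
        self.stores = {f: defaultdict(dict) for f in families}
        # family -> number of times its store was read
        self.reads = {f: 0 for f in families}

    def put(self, key, family, qualifier, value):
        self.stores[family][key][qualifier] = value

    def get(self, key, families=None):
        """Read one row; only the requested families' stores are consulted."""
        wanted = families if families is not None else tuple(self.stores)
        result = {}
        for f in wanted:
            self.reads[f] += 1
            result[f] = dict(self.stores[f].get(key, {}))
        return result


table = ToyTable(FAMILIES)
key = row_key("C1001", "O42")
table.put(key, "order_entry", "sku", "widget")
table.put(key, "shipping", "carrier", "UPS")

# The shipping step only needs the shipping family.
shipping_only = table.get(key, families=("shipping",))
stores_touched = sum(1 for n in table.reads.values() if n > 0)
print(shipping_only)   # {'shipping': {'carrier': 'UPS'}}
print(stores_touched)  # 1
```

A get without a family list would read all four stores here, which is the 
cost you are trying to avoid by grouping data that is accessed together.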

HTH

-Mike

On Jun 28, 2013, at 7:27 AM, Ted Yu <yuzhih...@gmail.com> wrote:

> http://search-hadoop.com/m/qOx8l15Z1q42/column+families+fb&subj=Re+HBase+Column+Family+Limit+Reasoning
