@Wilm, 

Let me put it a different way… 

Think of a sales invoice. 

You can have columns for invoice_id, customer_id, customer_name, 
customer_billing_address (nested structure), customer_contact# (nested 
structure), ship_to (nested structure)… 
And that’s the header information. 

Add to that the actual invoice line items… (row#, SKU#, description, qty, 
unit_price, line_price, tax-code) … [Note: this is also nested]

How do you have a single column family to handle all of that? 

Again, when you look at designs with respect to a real use case, you start to 
see where they fall apart. 

If we take a long look at what HBase is, and is not, we can start to see how we 
would want to model the data and how to better organize the data. 

I don’t want to morph this thread into a more theoretical discussion on 
design, but this isn’t a new thing. 
Informix had project Arrowhead back in the late ’90s that got killed when Janet 
Perna bought them.  Had that project not been killed, the landscape would be 
very different. 
(And that’s again another story. ;-) 

But I digress. 

The point I’m trying to make is that when you start to look at the data, 
wherever you would have a Master/Slave relationship in terms of the data, you 
can replace it with some sort of array/list structure in a single column, since 
everything is a blob.   (And again, there are areas where you can impose more 
constraints on HBase and push it toward either a relational model or a 
hierarchical model, but that would again be a different discussion.)
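
To make that a bit more concrete, here is a rough Python sketch of the idea. 
It is not HBase client code: the "table" is just a dict of row key to 
{column -> bytes}, and the names (put_invoice, get_line_items, 
customers_who_ordered, the "d:" column family prefix, JSON as the blob 
encoding) are all made up for illustration. 

```python
import json

# Stand-in for a table: row key -> {column -> bytes}.
# A plain dict, NOT the HBase API; "d" is a hypothetical single column family.
table = {}

def put_invoice(table, invoice_id, header, line_items):
    """Store the header fields and the whole line-item list in one row."""
    row = {}
    for name, value in header.items():
        # Nested header structures (addresses, contacts) become one blob each.
        row["d:" + name] = json.dumps(value).encode("utf-8")
    # The detail records collapse into a single column holding a list.
    row["d:line_items"] = json.dumps(line_items).encode("utf-8")
    table[invoice_id] = row

def get_line_items(table, invoice_id):
    """One invoice's items: a single-row lookup plus one decode."""
    return json.loads(table[invoice_id]["d:line_items"].decode("utf-8"))

def customers_who_ordered(table, sku):
    """Cross-invoice question: every row has to be read and decoded."""
    customers = set()
    for row in table.values():
        items = json.loads(row["d:line_items"].decode("utf-8"))
        if any(item["sku"] == sku for item in items):
            customers.add(json.loads(row["d:customer_id"].decode("utf-8")))
    return customers

put_invoice(
    table, "inv-0001",
    {"customer_id": "c-17", "customer_name": "Acme",
     "ship_to": {"street": "1 Main St", "city": "Chicago"}},
    [{"row": 1, "sku": "WIDGET-RED", "qty": 2, "unit_price": 5.0},
     {"row": 2, "sku": "GADGET", "qty": 1, "unit_price": 9.5}],
)
put_invoice(
    table, "inv-0002",
    {"customer_id": "c-42", "customer_name": "Bolt",
     "ship_to": {"street": "9 Elm Ave", "city": "Dayton"}},
    [{"row": 1, "sku": "GADGET", "qty": 4, "unit_price": 9.5}],
)

print(get_line_items(table, "inv-0001"))
print(customers_who_ordered(table, "WIDGET-RED"))
```

Pulling one invoice is a single get. The "who ordered that widget in red" 
question walks every row, which is the full scan / map-reduce trade-off 
mentioned further down in the thread. 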

HTH

-Mike

On Sep 10, 2014, at 10:25 PM, Wilm Schumacher wrote:

> 
> 
> On 10.09.2014 at 22:25, Michael Segel wrote:
>> Ok, but here’s the thing… you extrapolate the design out… each column
>> with a subordinate record will get its own CF.
> I disagree. Not by the proposed design. You could do it with one CF.
> 
>> Simple examples can go
>> very bad when you move to real life.
> I agree.
> 
>> Again you need to look at hierarchical databases and not think in
>> terms of relational. To give you a really good example… look at a
>> point of sale system in Pick/Revelation/U2 …
>> 
>> You are great at finding a specific customer’s order and what they
>> ordered. You suck at telling me how many customers ordered that
>> widget in red during the past month’s promotion. (You’ll need to
>> do a map/reduce for that.)
> Correct, that's the downside of the suggestion. If you want to query
> something like that ("give all 'toplevel columns' that have this
> and that!"), you would have to run a map/reduce. Or you need something
> like an index. But that's a question only the thread owner can answer
> because we don't know what he's trying to accomplish. If there is a
> chance that he wants to query something like that, my suggestion would be
> a bad plan.
> 
> I think the thread owner now has 3 ideas for how to do what he was asking
> for, with upsides and downsides. Now he has to decide what's the best plan
> for the future.
> 
> Best wishes,
> 
> Wilm
> 
