On Mon, Oct 12, 2009 at 1:46 AM, Dan Harvey <[email protected]> wrote:
>
> A second question is more about trying to understand how to use
> HBase.
> If we have documents that have many authors, which themselves have a
> varying amount of metadata, what is a good approach to storing this? From
> reading about HBase I see it could be done using a column family on the
> document for, say, author_name:, author_email:, but if there is an unknown
> number of author properties this probably isn't the best way. Would using
> a separate table be better to store the author data in?
>
>
I think something like this would likely work best for you:
column family: author
column qualifier: the row key of the author in the authors table
value: the denormalized data about the author that you are likely to need
when you read the document, encoded any way you like (comma-separated
values, JSON, etc.)
So you have:
column: author:1234
value: {name:"John Smith",email:"[email protected]"}
column: author:1268
value: {name:"Dan Harvey",email:"[email protected]",webpage:"..."}
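
As a rough illustration of the layout above, here is a minimal Python sketch. It models a document row as a plain dict of column name to value rather than going through any HBase client, so nothing here is real HBase API. The helper names author_columns and read_authors are hypothetical, and JSON is just one of the encodings mentioned above.

```python
import json

FAMILY = b"author:"  # column family prefix, as in "author:1234"


def author_columns(authors):
    """Build {column: value} cells for one document row.

    `authors` maps an author's row key in the authors table to the
    subset of that author's properties worth denormalizing.
    """
    return {
        FAMILY + author_id.encode(): json.dumps(props, sort_keys=True).encode()
        for author_id, props in authors.items()
    }


def read_authors(row):
    """Decode every author:* cell of a row back into {author_id: props}."""
    return {
        column[len(FAMILY):].decode(): json.loads(value)
        for column, value in row.items()
        if column.startswith(FAMILY)
    }


# Two authors with different sets of properties share one column family.
row = author_columns({
    "1234": {"name": "John Smith"},
    "1268": {"name": "Dan Harvey", "webpage": "..."},
})
authors = read_authors(row)
```

Because each author's properties live inside a single encoded value, a new property only changes that value and needs no new column or schema change. The qualifier still lets you look the author up in the authors table when you need the full record.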