I thought about protobufs.  That's probably the most straightforward
option, since they are easily converted to and from byte arrays.  I guess
what I'm thinking about is a "standard" serialization mechanism so I can
put a pretty face on the HBase data access API without having to do
the serialization and deserialization myself.  I'm probably just
being lazy :)
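
To make that concrete, here is a rough sketch of the kind of client-side
wrapper I mean, in Python for brevity.  Everything in it is made up for
illustration: `Codec`, `TypedTable`, and the column names are hypothetical,
and a plain dict stands in for the table.  None of this is HBase API; it
only shows one way a codec could be chosen per column while the store
itself sees nothing but bytes.

```python
import json
from dataclasses import dataclass
from typing import Any, Callable, Dict, Tuple


@dataclass(frozen=True)
class Codec:
    """A pair of functions converting a value to and from bytes."""
    encode: Callable[[Any], bytes]
    decode: Callable[[bytes], Any]


# Two example codecs; any format that round-trips through bytes
# (protobufs, for instance) could be plugged in the same way.
JSON_CODEC = Codec(
    encode=lambda value: json.dumps(value, sort_keys=True).encode("utf-8"),
    decode=lambda raw: json.loads(raw.decode("utf-8")),
)
UTF8_CODEC = Codec(
    encode=lambda value: value.encode("utf-8"),
    decode=lambda raw: raw.decode("utf-8"),
)


class TypedTable:
    """Hypothetical wrapper: one codec per column, only bytes are stored."""

    def __init__(self, codecs: Dict[str, Codec]) -> None:
        self._codecs = codecs
        # Stand-in for the real store: (row, column) -> raw bytes.
        self._cells: Dict[Tuple[str, str], bytes] = {}

    def put(self, row: str, column: str, value: Any) -> None:
        # Serialization happens entirely on the client side.
        self._cells[(row, column)] = self._codecs[column].encode(value)

    def get(self, row: str, column: str) -> Any:
        return self._codecs[column].decode(self._cells[(row, column)])

    def raw(self, row: str, column: str) -> bytes:
        """Return the stored bytes exactly as the store would see them."""
        return self._cells[(row, column)]


table = TypedTable({"info:name": UTF8_CODEC, "info:profile": JSON_CODEC})
table.put("row1", "info:name", "Tom")
table.put("row1", "info:profile", {"langs": ["java", "python"]})
print(table.get("row1", "info:profile"))
```

The point is just that callers deal in typed values and never touch the
byte arrays themselves; the codec map is the "pluggable" part.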

On Fri, Apr 17, 2009 at 1:05 PM, Jonathan Gray <[email protected]> wrote:
> Tom,
>
> HBase is certainly capable of doing something like this.  And I'm
> currently doing things like it in production.
>
> As a binary store, you can use any kind of serialized type you want (we
> store everything from json and protobufs to serialized java, python, and
> erlang data structures).  That often includes enforcement of type,
> required/optional fields, length, etc...
>
> What you're asking is if there is a way to integrate this more directly,
> with custom serializers/deserializers that would do enforcement?
>
> What I will say is the new design for 0.20 that we are currently testing
> includes a rework of the client/server protocol towards something more
> language-agnostic (likely not fully there for 0.20 but soon after
> hopefully).  Even for 0.20 though, for PUTS, the actual binary that will
> eventually be stored into HFiles (called a KeyValue) is being built
> client-side and sent to the server.  GETS will return the same thing.  In
> both cases, what you basically send between the client and server are
> lists of KeyValues; these can then be built into existing structures like
> RowResult/Cell or interpreted in any way you'd like.
>
> That basically means you can do anything you want as far as the
> serialization/deserialization goes of what you're storing in HBase.
>
> I've not really found a need to further integrate typed information... But
> I also have no problem adding complexity to the app level, that's a
> decision made long ago and it's what has allowed us to do so much with
> HBase.
>
> Putting flexibility into the hands of the client seems like a good way to
> go, keeping HBase as simple as possible ("just" a KeyValue store).
>
> JG
>
>
> On Fri, April 17, 2009 8:21 am, Tom Nichols wrote:
>> Hi,
>>
>>
>> I've been using HBase and now I'm looking at Cassandra.  What's
>> particularly interesting about Cassandra is its typed data model.
>> Apparently it involves JSON, but what matters the most to me is that
>> it makes storage of complex data types much easier.  It is described here:
>> http://project-voldemort.com/design.php about half-way down the
>> page.  Obviously JSON serialization & deserialization adds overhead but the
>> ability to choose a strongly-typed storage format seems nice.
>>
>> Any thoughts of this functionality in the HBase API?  Not necessarily
>> JSON in particular, but a pluggable serialization/deserialization
>> mechanism?  I imagine this could be done completely on the client, but
>> having something standard so every user doesn't have to roll their own
>> (and having the same functionality in HB/MapReduce) would be nice.
>>
>>
>> Thanks.
>> -Tom
>>
>>
>>
>
>
