Hello, everyone.
I'd love to hear folks' input on using the "natural" data model of Accumulo
("BigTable" style) vs more of a Document Model. I'll try to succinctly
describe with a contrived example.
Let's say I have one domain object I'd like to model, "SensorReadings". A
single entry might look
To: user@accumulo.apache.org
Sent: Friday, September 4, 2015 11:42:20 AM
Subject: Accumulo: "BigTable" vs. "Document Model"
Hello, everyone.
I'd love to hear folks' input on using the "natural" data model of Accumulo
("BigTable" style) vs more of a Document
You could use a server-side iterator that does the filtering on the server,
and returns a protobuf value for matching rows.
-Eric
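
A minimal sketch of the idea above, written against plain Java collections rather than the Accumulo iterator API. The `scan` method plays the role of the server-side iterator: it walks rows in sorted order, keeps only the rows that match a predicate, and emits one rolled-up value per matching row. The `encode` method is a hypothetical stand-in for protobuf serialization, and the row ids and the `temp`/`unit` attributes are made up for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.function.Predicate;

public class RowFilterSketch {

    // Hypothetical stand-in for a protobuf message: one rolled-up value per row.
    public static String encode(String row, Map<String, String> columns) {
        StringBuilder sb = new StringBuilder(row);
        for (Map.Entry<String, String> e : columns.entrySet()) {
            sb.append('|').append(e.getKey()).append('=').append(e.getValue());
        }
        return sb.toString();
    }

    // Plays the role of the server-side iterator: filter rows where the data
    // lives, then return a single encoded value for each row that matches.
    public static List<String> scan(TreeMap<String, TreeMap<String, String>> table,
                                    Predicate<Map<String, String>> rowMatches) {
        List<String> out = new ArrayList<>();
        for (Map.Entry<String, TreeMap<String, String>> row : table.entrySet()) {
            if (rowMatches.test(row.getValue())) {
                out.add(encode(row.getKey(), row.getValue()));
            }
        }
        return out;
    }

    // Made-up sample data: row id -> (column qualifier -> value), kept sorted.
    public static TreeMap<String, TreeMap<String, String>> sampleTable() {
        TreeMap<String, TreeMap<String, String>> t = new TreeMap<>();
        put(t, "reading-001", "temp", "21.5");
        put(t, "reading-001", "unit", "C");
        put(t, "reading-002", "temp", "30.1");
        put(t, "reading-002", "unit", "C");
        put(t, "reading-003", "temp", "27.0");
        put(t, "reading-003", "unit", "C");
        return t;
    }

    private static void put(TreeMap<String, TreeMap<String, String>> t,
                            String row, String cq, String value) {
        t.computeIfAbsent(row, r -> new TreeMap<>()).put(cq, value);
    }

    public static void main(String[] args) {
        // Only rows with temp above 25.0 leave the "server".
        List<String> hits = scan(sampleTable(),
                cols -> Double.parseDouble(cols.get("temp")) > 25.0);
        for (String hit : hits) {
            System.out.println(hit);
        }
    }
}
```

The point of the design is that the attributes stay in separate columns on disk, and the roll-up into a document happens only for rows that pass the filter.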
On Fri, Sep 4, 2015 at 11:42 AM, Michael Moss wrote:
> Hello, everyone.
>
> I'd love to hear folks' input on using the "natural" data model
Sqrrl uses a hybrid approach. For records that are relatively static, we use
a compacted form, but for maintaining aggregates and for making updates to
the compacted-form documents, we use a more explicit form. This is done
mostly through iterators and a fairly complex type system. The big
trade-off
These days, I tend to lean towards breaking out each attribute in a
record into discrete columns.
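
A rough illustration of the two layouts, using plain Java objects instead of real Accumulo keys. The `Entry` class, the `meta`/`value`/`doc` column families, and the attribute names are all hypothetical. The `fetchColumnFamily` method here is a local helper that mimics restricting a scan to one column family: it compares keys only and never parses a value.

```java
import java.util.ArrayList;
import java.util.List;

public class DiscreteColumnsSketch {

    /** Simplified key/value entry: row, column family, column qualifier, value. */
    public static final class Entry {
        public final String row, cf, cq, value;

        public Entry(String row, String cf, String cq, String value) {
            this.row = row;
            this.cf = cf;
            this.cq = cq;
            this.value = value;
        }
    }

    /** One record broken out into discrete columns (names are made up). */
    public static List<Entry> discrete() {
        List<Entry> e = new ArrayList<>();
        e.add(new Entry("reading-001", "meta", "sensorId", "s-17"));
        e.add(new Entry("reading-001", "meta", "unit", "C"));
        e.add(new Entry("reading-001", "value", "temp", "21.5"));
        return e;
    }

    /** The same record rolled up into a single opaque value. */
    public static List<Entry> rolledUp() {
        List<Entry> e = new ArrayList<>();
        e.add(new Entry("reading-001", "doc", "", "sensorId=s-17;unit=C;temp=21.5"));
        return e;
    }

    /** Keeps entries in the given column family; a key comparison, no value parsing. */
    public static List<Entry> fetchColumnFamily(List<Entry> entries, String cf) {
        List<Entry> out = new ArrayList<>();
        for (Entry e : entries) {
            if (e.cf.equals(cf)) {
                out.add(e);
            }
        }
        return out;
    }

    public static void main(String[] args) {
        // Discrete layout: the column restriction alone isolates the temperature.
        System.out.println(fetchColumnFamily(discrete(), "value").size());  // 1
        // Rolled-up layout: no "value" family exists, so the whole document
        // must be fetched and parsed to reach the same attribute.
        System.out.println(fetchColumnFamily(rolledUp(), "value").size());  // 0
    }
}
```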
When you roll up multiple columns into a single value, you lose the
ability to use the native column filtering (cf or cf+cq) that's built
into Accumulo. Same goes for column visibilities (at
+1 for Eric's suggestion. I used this technique. It seemed to work nicely.
When storing ProtoBuf, JSON, or any other 'document', remember to factor in
the parsing needed during iteration. This affects both CPU and memory
requirements on the tservers.
On Fri, Sep 4, 2015 at 11:53 AM, Eric Newton