Accumulo: "BigTable" vs. "Document Model"

2015-09-04 Thread Michael Moss
Hello, everyone. I'd love to hear folks' input on using the "natural" data model of Accumulo ("BigTable" style) vs more of a Document Model. I'll try to succinctly describe with a contrived example. Let's say I have one domain object I'd like to model, "SensorReadings". A single entry might look

Re: Accumulo: "BigTable" vs. "Document Model"

2015-09-04 Thread dlmarion
To: user@accumulo.apache.org
Sent: Friday, September 4, 2015 11:42:20 AM
Subject: Accumulo: "BigTable" vs. "Document Model"

Hello, everyone. I'd love to hear folks' input on using the "natural" data model of Accumulo ("BigTable" style) vs more of a Document

Re: Accumulo: "BigTable" vs. "Document Model"

2015-09-04 Thread Eric Newton
You could use a server-side iterator that does the filtering on the server, and returns a protobuf value for matching rows.

-Eric

On Fri, Sep 4, 2015 at 11:42 AM, Michael Moss wrote:
> Hello, everyone.
>
> I'd love to hear folks' input on using the "natural" data model
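
To make Eric's suggestion concrete, here is a minimal pure-Python sketch of the idea, not Accumulo code. Everything in it is an assumption for illustration: the `TABLE` dict stands in for a table of row ID to serialized document, JSON stands in for protobuf, and `server_side_filter` is a hypothetical function that plays the role of a filtering iterator. The field names (`sensor`, `temp`) are invented, since the original example is cut off.

```python
import json

# Hypothetical table: row ID -> serialized document.
# JSON is used here only as a stand-in for a protobuf-encoded value.
TABLE = {
    "reading001": json.dumps({"sensor": "sensor1", "temp": 71.5}),
    "reading002": json.dumps({"sensor": "sensor2", "temp": 80.2}),
    "reading003": json.dumps({"sensor": "sensor1", "temp": 68.0}),
}


def server_side_filter(table, predicate):
    """Mimic a filtering iterator: walk rows in sorted order, deserialize
    each value where the data lives, and yield only the matching rows
    with their serialized value untouched."""
    for row in sorted(table):
        doc = json.loads(table[row])  # parsing cost is paid per row scanned
        if predicate(doc):
            yield row, table[row]


# Only matching documents would cross the wire to the client.
matches = list(server_side_filter(TABLE, lambda doc: doc["temp"] > 75))
```

Note that every scanned value is deserialized even when it does not match, so the filtering saves network transfer at the price of server-side parsing.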

Re: Accumulo: "BigTable" vs. "Document Model"

2015-09-04 Thread Adam Fuchs
Sqrrl uses a hybrid approach. For records that are relatively static we use a compacted form, but for maintaining aggregates and for making updates to the compacted-form documents we use a more explicit form. This is done mostly through iterators and a fairly complex type system. The big trade-off

Re: Accumulo: "BigTable" vs. "Document Model"

2015-09-04 Thread Josh Elser
These days, I tend to lean towards breaking out each attribute in a record into discrete columns. When you roll up multiple columns into a single value, you lose the ability to use the native column filtering (cf or cf+cq) that's built into Accumulo. Same goes for column visibilities (at
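
As a rough illustration of the column filtering point, here is a small pure-Python sketch, not the Accumulo API. The `ENTRIES` list, the `scan` function, and the `meta`/`metric` families are all hypothetical; in the real client the comparable calls are `fetchColumnFamily` and `fetchColumn` on a scanner. The idea is that with one attribute per column, a scan can select by cf or cf+cq without deserializing any value.

```python
# Each entry mimics a key/value pair:
# (row, column family, column qualifier, value). All names are invented.
ENTRIES = [
    ("reading001", "meta", "sensor", "sensor1"),
    ("reading001", "metric", "humidity", "40"),
    ("reading001", "metric", "temp", "71.5"),
    ("reading002", "meta", "sensor", "sensor2"),
    ("reading002", "metric", "humidity", "35"),
    ("reading002", "metric", "temp", "80.2"),
]


def scan(entries, cf=None, cq=None):
    """Return entries matching a column family, or a family plus qualifier.
    No value is parsed; selection uses the key alone."""
    return [
        entry
        for entry in entries
        if (cf is None or entry[1] == cf) and (cq is None or entry[2] == cq)
    ]


temps = scan(ENTRIES, cf="metric", cq="temp")  # cf+cq filtering
metrics = scan(ENTRIES, cf="metric")           # cf-only filtering
```

If the same attributes were rolled into one serialized value per row, neither selection would be possible without reading and parsing every document.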

Re: Accumulo: "BigTable" vs. "Document Model"

2015-09-04 Thread David Medinets
+1 for Eric's suggestion. I used this technique, and it seemed to work nicely. When storing ProtoBuf, JSON, or any other 'document', remember to factor in the parsing needed during iteration. This affects both CPU and memory requirements on the tservers.

On Fri, Sep 4, 2015 at 11:53 AM, Eric Newton