You could use a server-side iterator that does the filtering on the server
and returns the protobuf value only for matching rows.
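
As a rough sketch of the filtering logic only (plain Java, no Accumulo
dependency): in Accumulo this check would typically live in the
accept(Key, Value) method of a Filter subclass. Everything named below is
hypothetical, including the SensorReading class and the fixed-width
ByteBuffer encoding, which stands in for the protobuf message so the example
is self-contained. The thresholds are the ones from the compound query in
the original message.

```java
import java.nio.ByteBuffer;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

final class SensorReadingFilterSketch {

    /** Hypothetical stand-in for the protobuf message discussed in the thread. */
    static final class SensorReading {
        final long timestamp;
        final double temperature;
        final double humidity;
        final double barometer;

        SensorReading(long timestamp, double temperature, double humidity, double barometer) {
            this.timestamp = timestamp;
            this.temperature = temperature;
            this.humidity = humidity;
            this.barometer = barometer;
        }
    }

    /** Assumed encoding: one long followed by three doubles (not real protobuf). */
    static byte[] encode(SensorReading r) {
        return ByteBuffer.allocate(Long.BYTES + 3 * Double.BYTES)
                .putLong(r.timestamp)
                .putDouble(r.temperature)
                .putDouble(r.humidity)
                .putDouble(r.barometer)
                .array();
    }

    static SensorReading decode(byte[] value) {
        ByteBuffer buf = ByteBuffer.wrap(value);
        return new SensorReading(buf.getLong(), buf.getDouble(), buf.getDouble(), buf.getDouble());
    }

    /** Compound predicate: Temperature > 90, Humidity < 40, Barometer > 31. */
    static boolean accept(byte[] value) {
        SensorReading r = decode(value);
        return r.temperature > 90 && r.humidity < 40 && r.barometer > 31;
    }

    /** Applies the predicate to each (ReadingID, value) entry, keeping input order. */
    static List<String> matchingReadingIds(Map<String, byte[]> entriesByReadingId) {
        List<String> matches = new ArrayList<>();
        for (Map.Entry<String, byte[]> e : entriesByReadingId.entrySet()) {
            if (accept(e.getValue())) {
                matches.add(e.getKey());
            }
        }
        return matches;
    }

    public static void main(String[] args) {
        Map<String, byte[]> entries = new LinkedHashMap<>();
        entries.put("1234", encode(new SensorReading(123L, 80.4, 40.2, 29.1)));
        entries.put("1235", encode(new SensorReading(124L, 95.0, 35.0, 31.5)));
        System.out.println(matchingReadingIds(entries));
    }
}
```

Because each reading is a single value, the whole compound condition can be
checked per entry, so nothing like the WholeRowIterator is needed to see all
the fields together.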

-Eric


On Fri, Sep 4, 2015 at 11:42 AM, Michael Moss <michael.m...@gmail.com>
wrote:

> Hello, everyone.
>
> I'd love to hear folks' input on using the "natural" data model of
> Accumulo ("BigTable" style) vs more of a Document Model. I'll try to
> succinctly describe with a contrived example.
>
> Let's say I have one domain object I'd like to model, "SensorReadings". A
> single entry might look something like the following with 4 distinct CF, CQ
> pairs.
>
> RowKey: DeviceID-YYYYMMDD-ReadingID (e.g., 1-20150101-1234)
> CF: "Meta", CQ: "Timestamp", Value: <Some timestamp>
> CF: "Sensor", CQ: "Temperature", Value: 80.4
> CF: "Sensor", CQ: "Humidity", Value: 40.2
> CF: "Sensor", CQ: "Barometer", Value: 29.1
>
> I might do queries like "get me all SensorReadings for 2015 for DeviceID =
> 1" and if I wanted to operate on each SensorReading as a single unit (and
> not as the 4 'rows' it returns for each one), I'd either have to aggregate
> the 4 CF, CQ pairs for each RowKey client side, or use something like the
> WholeRowIterator.
>
> In addition, if I wanted to write a query like, "for DeviceID = 1 in 2015,
> return me SensorReadings where Temperature > 90, Humidity < 40, Barometer >
> 31", I'd again have to either use the WholeRowIterator to 'see' each entire
> SensorReading in memory on the server for the compound query, or I could
> take the intersection of the results of 3 parallel, independent queries on
> the client side.
>
> Where I am going with this is, what are the thoughts around creating a
> Java, Protobuf, Avro (etc) object with these 4 CF, CQ pairs as fields and
> storing each SensorReading as a single 'Document'?
>
> RowKey: DeviceID-YYYYMMDD
> CF: ReadingID Value: Protobuf(Timestamp=123, Temperature=80.4,
> Humidity=40.2, Barometer=29.1)
>
> This way you avoid having to use the WholeRowIterator, and unless you often
> have queries that only look at a tiny subset of your fields (let's say just
> "Temperature"), the serialization costs seem similar, since Value is just
> bytes anyway.
>
> Appreciate folks' experience and wisdom here. Hope this makes sense, happy
> to clarify.
>
> Best.
>
> -Mike
