Hi Bryce,

If each session ID is unique, then even with multiple writers it is unlikely that two of them will write to the same key at the same time. That being the case, you could store each event as a JSON object in a JSON array, and when the array reaches a threshold (say 2000, for the sake of example) you could move those entries to another key, so your JSON would look like:

session12345 {
  next : 10,
  entries : [
     ...,...,...
  ]
}

session12345-10 {
  next : 9,
  entries : [
     ...,...,...
  ]
}

...
...
...

session12345-1 {
  entries : [
     ...,...,...
  ]
}

The reason why next is decrementing is that entries are sorted descending by time (newest events first) and balanced keys are inserted in between. So initially you have only session12345, and when you get to 2001 events you have session12345 with next pointing to 1 (or 0 if you count from zero) and session12345-1. next is null if that session hasn't been balanced.
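A minimal Python sketch of the scheme described above, assuming a plain dict stands in for the Riak bucket (no Riak client calls are shown). The names append_event, read_events, THRESHOLD and SEP are hypothetical, and the overflow keys carry next as None rather than omitting it:

```python
THRESHOLD = 2000  # example threshold from the message above
SEP = "-"         # separator convention; any separator works


def append_event(store, session_id, event, threshold=THRESHOLD):
    """Add an event to the head key, newest first, spilling when it is full."""
    head = store.get(session_id) or {"next": None, "entries": []}
    if len(head["entries"]) >= threshold:
        # Move the full array to a new overflow key. The new key takes the
        # head's old pointer, and the head now points at the new key.
        n = (head["next"] or 0) + 1
        store[f"{session_id}{SEP}{n}"] = {
            "next": head["next"],
            "entries": head["entries"],
        }
        head = {"next": n, "entries": []}
    head["entries"].insert(0, event)
    store[session_id] = head


def read_events(store, session_id):
    """Yield every event, newest first, by following the next pointers."""
    obj = store.get(session_id)
    while obj is not None:
        yield from obj["entries"]
        nxt = obj["next"]
        obj = store.get(f"{session_id}{SEP}{nxt}") if nxt is not None else None
```

With a real cluster each store access would be a fetch or a put, so the move to the overflow key and the update of the head are two separate writes and need to be ordered with that in mind.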

I'm using the hyphen as a convention; it can be any separator you want.

Hope that helps,

Guido.

On 24/04/14 04:29, Jason Campbell wrote:
Hi Bryce,

I have code that does something similar to this, and it works well.

In my case, the value is a JSON array, with a JSON object per event.

Siblings are easily resolved by merging the two arrays.
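As a rough illustration of that merge, here is a Python sketch that takes the decoded sibling values (each a list of event dicts) and returns one array. The function name merge_siblings and the "ts" timestamp field are assumptions made for the example, not part of any Riak API:

```python
import json


def merge_siblings(siblings):
    """Union the events of all sibling arrays, dropping exact duplicates."""
    seen = {}
    for events in siblings:
        for event in events:
            # Sorted-key JSON gives equal events an identical string,
            # which makes them usable as a dictionary key.
            key = json.dumps(event, sort_keys=True, separators=(",", ":"))
            seen.setdefault(key, event)
    # Sort by the assumed "ts" field, then by the encoded string, so the
    # result does not depend on the order the siblings arrive in.
    ordered = sorted(seen.items(), key=lambda kv: (kv[1]["ts"], kv[0]))
    return [event for _, event in ordered]
```

The merged array would then be written back using the causal context from the fetch so that the siblings collapse into a single value.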

In Riak 2.0, sets using JSON-encoded strings would probably do this
automatically and more cleanly than manually resolving siblings.

I like sorted JSON, but any data format that produces identical strings
would work.  If there is a chance of duplicate submissions into Riak,
you need to ensure the data format always produces identical output to
allow Riak to recognize and eliminate duplicates.
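To make the point about identical strings concrete, here is a small Python sketch. A Python set stands in for the Riak set, and the helper name canonical and the event fields are made up for the example:

```python
import json


def canonical(event):
    """Encode an event as JSON with sorted keys and fixed separators."""
    return json.dumps(event, sort_keys=True, separators=(",", ":"))


# The same event submitted twice with its fields in a different order.
a = canonical({"ts": 1, "path": "/index.html"})
b = canonical({"path": "/index.html", "ts": 1})

# Identical strings collapse into a single set member.
members = {a, b}
```

Without sort_keys the two submissions would encode to different strings and both would be kept as separate members.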

The other thing I would worry about is how long-lived your sessions are
and how many events they can generate.  Riak starts having performance
issues with values over a few MB and you should probably consider another
data model at that point (maybe storing references instead of the data
itself).

Good luck with your project,
Jason

----- Original Message -----
From: "Bryce" <[email protected]>
To: "riak-users" <[email protected]>
Sent: Thursday, 24 April, 2014 1:22:12 PM
Subject: Riak as log aggregator

Hi All,

I'm interested in using Riak for a log aggregation project. These are
basically Apache logs that I would like to correlate together based on
their session IDs. These session IDs would make sense as the key, but
it's the "value" part of this that confuses me. There will be multiple
lines within these logs that have the same session ID, thus I will be
creating siblings. Now, is there a CRDT that will allow me to combine
all of these siblings into a single value or will I need to write my own
solution to do so? Any and all pointers are welcomed. Also, if Riak is a
bad fit for this, please let me know.

Warm regards,
Bryce


_______________________________________________
riak-users mailing list
[email protected]
http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com
