Re: model for a time index on realtime data

Alexander Sicular Mon, 12 Apr 2010 13:19:06 -0700

Hi Colin,

You present an interesting use case whose exploration would help many people on 
this list. I'll comment from my own experiences with the Riak HTTP interface, 
which is where I primarily work and no inside knowledge of what is on the Riak 
roadmap. Note that everything you can do in HTTP land will perform at least 
marginally better to possibly significantly better in native Erlang. The way I 
see it, here are some of my (non-exhaustive) considerations. Key size, which 
includes both data size and header size, listing keys in a bucket, searching 
for keys, M/R and backend persistent storage. I'll review them in order, then I 
will move on to a potential solution. .

Key size is a consideration because Riak will not retrieve partial keys or only 
links. They seem to be housed in the same internal data structure. So this is a 
consideration when thinking about how to retrieve/update/save your data. Also 
every update is a wipe. So you can't update just some of the links, you need to 
update all of them. You can't update some of your data, you need to update all 
of it. Not that this is not a bad thing per-se, it's just that one should be 
aware of these limitations when designing. While we are on links, you 
specifically mention "time capsule buckets". Links are manifested at the key 
level and not at the bucket level. Riak links link keys to other keys 
regardless of bucket membership. 

A significant deficiency that I have found with Riak is the costly list keys 
function which practically necessitates the need for an external index, re. 
Redis. If you have a real-time bucket where all new records are being written 
to and a periodic data flush that takes some past time slice of data and puts 
it somewhere else for archival then you most definitely need an external 
indexing mechanism to GET those keys if you want anything close to real-time 
performance. Also, remember there is no "MOVE" function there is only a GET, 
PUT/POST and DELETE function, so you'll kinda need to roll your own 
transactions. I've also done this without Redis and just using a specific 
"index" key in a separate stats bucket in RIak. So I would have a bucket called 
"real-time-data" and another bucket called "real-time-data-stats". The list 
keys function completion time increases (linearly or worse depending on number 
of nodes, possibly mitigated by streaming keys?) at some function related to 
total number of keys in a bucket. 

A separate inconvenience is key retrieval. You can not retrieve a key or range 
of keys from Riak with a regex. You actually need the full key name. It is 
harder to deform keys in the primary retrieval mechanism (Riak) but that does 
not stop you from using descriptive keys like YYYYMMDDHHMMSS to your advantage 
in post processing like m/r. Note that Redis does allow you to do such key 
retrieval. Also a consideration is that keys are not stored (retrieved) in any 
order. This is of specific concern when dealing with time series data, or any 
non idempotent data.

AFAIK, Riak will accept only buckets or key lists as input for m/r functions. 
The former will simply run it's own list keys function to generate all the keys 
it will iterate over. I think the more performant way to do this would be to 
create an m/r from your own key list based off of a key index you maintain in 
Redis with periodic flush to it's own Riak bucket index bucket with key names 
based on your own time frequency. A Riak m/r itself can be stored in a Riak key 
so once you generate it you can keep it somewhere and retrieve it as easily you 
would a key. If these are M/R jobs that you will be constantly executing you 
should look into storing them in the specific M/R javascript script directory. 
This will give you the added performance advantage of pre-caching those 
functions. 

Something to consider is that actual backend you will use. In my testing with 
the innostore backend you need to be aware of file growth. Search the Riak list 
for commentary on file growth with innostore, it may be a wiki writeup now, not 
sure. In it the author talks about how you can calculate how many files will be 
created on disk for every bucket that exists in Riak. This should definitely be 
a consideration when thinking about the file system you use to format your 
disks but more importantly the number of file descriptors your deployment 
operating system can maintain and how you can change those settings. The 
innostore backend configuration allows for a maximum number of fd's. Before I 
got that information I managed to halt a lot of machines during my testing 
phase.

Now lets talk solutions. For instance, if you want 1 second granularity you 
could write out one key with a name of YYYYMMDDHHMMSS in your stats bucket. 
This key would be a dump of the identical key in Redis. This would give you 60 
* 60 keys per bucket if you roll your keys up into hourly stats buckets and 
3600 keys would not terribly blow out a list keys function on any hourly 
bucket. In your case, if you do decide to go with descriptive key names I'm not 
sure what advantage link walking will get you. If your primary axis is time, 
then you could walk the tree in your application code without having to rely on 
links. An added advantage of not having one megalithic bucket is that 
replication and other Riak goodies are set on a per bucket basis. You could 
possibly tweak those settings as your data ages.  Lastly, if you are using the 
HTTP interface to Riak, might I suggest taking a look at the ElasticSearch 
project, http://github.com/elasticsearch/elasticsearch, which will index data 
you pass to it over HTTP as well. This could help you with deep internal data 
searching other than just time. Whatever you decide, do keep us updated on your 
progress.

I think the combination of Riak as a persistent  distributed data store and 
Redis as a volatile (not really due to it's VM/background save and Append only 
disk writing), blazingly fast and flexible cache mechanism make a very 
compelling solution to your specific use case. I would go so far as to say that 
perhaps Basho should consider embedding Redis as it's internal caching layer in 
Riak for just this type of problem. Riak could internally manage all the 
indexing and whatnot via triggers or some internal mechanic.

-Alexander

On Apr 12, 2010, at 11:56 AM, Colin wrote:

> Hi,
> 
> I am trying to figure the best model in Riak for my application. I
> have read & reread the docs, wiki and list threads but haven't put any
> possible solution to the test yet. I'd like your feedback to help me
> avoid dead end solutions if possible!
> 
> The basic idea is that I want to aggregate a large amount of realtime
> data and I want to easily retrieve this data using some time
> constraints (for example, all data for the past hour, day, etc). When
> I say large amount of data I mean in the order of hundreds items per
> second and each item should be stored individually. For this I will
> have a realtime items bucket.
> 
> Now, to create my "time" index I have a few ideas:
> 
> - create a "time capsules" buckets, with a new capsule created at some
> given interval. My application would use a periodic data flusher to
> accumulate and flush the data for my time capsule period, every second
> for example. These time capsules would only contain links to all items
> for this interval. I was actually thinking of two way links so for any
> realtime items I could also retrieve its corresponding time capsule.
> each time capsule would also be doubly-linked to allow walking forward
> or backward in time. New time capsules buckets could be created every
> day for example.
> 
> The problem I foresee with this structure is the possible number of
> links for each time capsules. In this example, I am thinking one
> capsule per second, with hundreds of realtime items resulting in
> hundreds of links in each capsule. From what I have read, it might not
> be the best of idea to deal with that many links and it will not scale
> if the rate grows to thousands of items per second.
> 
> - another idea would be to directly store my realtime items keys
> within each capsule instead of using links but still use links from
> realtime items to capsule and between capsules. This structure would
> allow me to gather all realtime items for a give timeframe by fetching
> the keys list within each capsule.
> 
> - yet another solution would be to use another external storage
> engine, like Redis, which supports sorted sets and store references to
> my realtime items. The problem with this is that I will not be able to
> easily launch map-reduce jobs to crunch data within a specific
> timeframe.
> 
> Finally, the end goal is for me to be able to run map-reduce jobs on
> my realtime data within a give timeframe to create other external
> indexes to my data, for fulltext search for example.
> 
> Any comments, hints, pointers will be appreciated!
> 
> Thanks,
> Colin
> 
> _______________________________________________
> riak-users mailing list
> [email protected]
> http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com

_______________________________________________
riak-users mailing list
[email protected]
http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com

Re: model for a time index on realtime data

Reply via email to