Hi Riak Users,

I can think of three data structures for my use case, which is a huge
hash table (a bucket with n entries) with a read-to-write ratio of
about 10:1. Every hash in that table has three other kinds of indices
by which I want to access it:

1) one index which is the user owning the hash (and the content stored
under that hash)
2) 10 indices which are attributes out of ~500 total attributes ->
accessing one of those attributes will lead to a huge result set (n/50 keys)
3) a date after which the element has to be deleted (but there are some
actions which have to be performed before it is deleted)

Index 1) is accessed about once a week per user, indices 2) are accessed
even less frequently, and index 3) is checked daily.

So these indices will lead to big result sets [ 3) is expected to be
rather "small" ] which are rarely accessed.

Option 1: use 2i indices on every element in that hash table
Option 2: create a separate bucket (with property allow_mult=true) that
uses the values of indices 1)-3) as keys, each holding a link to the
corresponding hash (so for every hash we add 13 entries to that bucket
-> the size of that bucket will be 13*n)
Option 3: use the new map data type
Option 4: ??
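
To make the sizes concrete, here is a rough back-of-the-envelope sketch
in Python. The value of n is purely an assumed example, and the n/50
figure assumes the attributes are spread evenly over the hashes.

```python
# Back-of-the-envelope sizes for the index options.
# Assumption: attributes are evenly distributed over all hashes.
ATTRIBUTES_TOTAL = 500      # ~500 attributes exist in total
ATTRIBUTES_PER_HASH = 10    # each hash carries 10 of them
ENTRIES_PER_HASH = 13       # index entries per hash in option 2, as stated above


def attribute_result_size(n: int) -> int:
    """Average number of keys returned for one attribute: n * 10 / 500 = n / 50."""
    return n * ATTRIBUTES_PER_HASH // ATTRIBUTES_TOTAL


def index_bucket_size(n: int) -> int:
    """Number of entries in the separate index bucket of option 2."""
    return ENTRIES_PER_HASH * n


n = 1_000_000  # hypothetical number of hashes in the main bucket
print(attribute_result_size(n))  # 20000
print(index_bucket_size(n))      # 13000000
```

So with a million hashes, a single attribute lookup would return roughly
20,000 keys, and the index bucket of option 2 would hold 13 million
entries.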

Storage-wise, option 1 sounds most compelling, but what about the
performance? (In my use case the performance of accessing those keys is
not so important [they will only be accessed by background processes],
but the stress this access puts on the cluster should be as small as
possible.)

Which option would you recommend? (And why?)

Thanks in advance,
Ralf



_______________________________________________
riak-users mailing list
[email protected]
http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com
