Hi Riak Users,

There are three data structures I can think of for my use case, which is a huge hash table (a bucket with n entries) that gets roughly 10 reads for every write. Every hash in that table has 3 other types of indices I want to access it by:
1) one index, which is the user owning the hash (and the content stored under that hash)
2) 10 indices, which are attributes out of ~500 total attributes -> accessing one of those attributes will lead to a huge result set (n/50 keys)
3) a date after which the element has to be deleted (but some actions have to be performed before it is deleted)

Index 1) is accessed about once a week per user, indices 2) are accessed even less frequently, and index 3) is checked daily. So these indices will lead to big result sets [ 3) is expected to be rather "small" ] which are rarely accessed.

Option 1: use 2i indices on every element in that hash table
Option 2: create a bucket (with property allow_mult=true) with the indices 1)-3) as keys and a link to the corresponding hash (so for every hash we add 13 entries to that bucket -> size of that bucket will be 13*n)
Option 3: use the new map data type
Option 4: ??

Storage-wise, option 1 sounds most compelling, but what about the performance? (In my use case the performance of accessing those keys is not so important [they will only be accessed by background processes], but the stress this access puts on the cluster should be as small as possible.)

Which option would you recommend? (And why?)

Thanks in advance,
Ralf
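
P.S. To make option 2 more concrete, here is a minimal sketch of the layout I have in mind. The key formats (user:..., attr:..., expires:...) and the function names are made up for illustration, and a plain dict of sets only stands in for the index bucket and its allow_mult siblings; no Riak client calls are involved.

```python
from collections import defaultdict
from datetime import date


def index_keys(user_id, attributes, expires_on):
    """Return the index-bucket keys for one hash (key formats are hypothetical)."""
    keys = [f"user:{user_id}"]                        # index 1): owning user
    keys += [f"attr:{name}" for name in attributes]   # indices 2): attributes
    keys.append(f"expires:{expires_on.isoformat()}")  # index 3): deletion date
    return keys


def add_to_index(index, hash_key, user_id, attributes, expires_on):
    """Link hash_key under each of its index keys.

    The set stands in for the siblings an allow_mult=true bucket would keep.
    """
    for key in index_keys(user_id, attributes, expires_on):
        index[key].add(hash_key)


# Two example hashes with made-up ids, attributes and dates.
index = defaultdict(set)
add_to_index(index, "hash-a", "u1", ["red", "large"], date(2020, 1, 31))
add_to_index(index, "hash-b", "u1", ["red"], date(2020, 2, 29))

# Background jobs would then read one index key at a time:
print(sorted(index["user:u1"]))
print(sorted(index["attr:red"]))
print(sorted(index["expires:2020-01-31"]))
```

Every write to the hash table then also has to touch each of its index keys, which is the extra storage and write cost of option 2 compared to 2i.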
