Hi Izhar,

Fredrik's suggestion of using 2I sounds like a good place to start and will keep your Riak install as simple as possible. If you name your keys as <userid>:<subkey> and store them in the same bucket, you can use 2I to retrieve all of a user's primary keys as a range from <userid>: to <userid>; (the colon/semicolon is just a trick to make the ranging work nicely: ';' is the character immediately after ':' in ASCII, so the range covers every key that starts with <userid>:). 2I queries do not work across buckets, and to use 2I you are currently restricted to eleveldb as a backend.
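As a rough illustration of why the <userid>: to <userid>; range picks out exactly one user's keys, here is a minimal Python sketch. It only simulates a byte-wise range comparison with inclusive bounds (an assumption of this sketch) and does not talk to Riak; `keys_for_user` is a hypothetical helper, not part of any client library.

```python
def keys_for_user(all_keys, user_id):
    """Simulate an inclusive range query over primary keys for one user.

    Hypothetical helper for illustration only; it does not call Riak.
    """
    start = user_id.encode() + b":"  # lower bound, e.g. b"k1:"
    end = user_id.encode() + b";"    # upper bound; ';' (0x3B) directly follows ':' (0x3A)
    # Python compares bytes lexicographically, byte by byte.
    return sorted(k for k in all_keys if start <= k <= end)


stored = [b"k1:s1", b"k1:s2", b"k2:s1", b"k10:s1"]
print(keys_for_user(stored, "k1"))   # [b'k1:s1', b'k1:s2']
print(keys_for_user(stored, "k10"))  # [b'k10:s1']
```

Note that user k10 does not leak into the k1 range: the separator is part of both bounds, so a key must start with exactly "k1:" to fall inside it.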
Here's a small example using the Riak console (apologies for not having a small client-library snippet). It creates three objects, two for user k1 and one for user k2, and finds the k1 objects using 2I on the primary key.

    ([email protected])1> {ok,C}=riak:local_client().
    {ok,{riak_client,'[email protected]',undefined}}
    ([email protected])14> C:put(riak_object:new(<<"b">>,<<"k1:s1">>,<<"key 1 subkey 1">>)).
    ok
    ([email protected])13> C:put(riak_object:new(<<"b">>,<<"k1:s2">>,<<"key 1 subkey 2">>)).
    ok
    ([email protected])15> C:put(riak_object:new(<<"b">>,<<"k2:s1">>,<<"key 2 subkey 1">>)).
    ok
    ([email protected])16> C:get_index(<<"b">>, {range, <<"$key">>, <<"k1:">>, <<"k1;">>}).
    {ok,[<<"k1:s1">>,<<"k1:s2">>]}

Here's a little more info on the million-bucket option.

In 1.0, list_keys no longer blocks vnodes (unless you have async_vnodes disabled).

As Kresten said, bucket properties are only stored in the ring once they are modified from the default. Buckets that use the default properties take up no space in the ring. It is possible to modify the default bucket properties, if you need to change them for all buckets, by setting default_bucket_props in the riak_core section of app.config, but that does require more thought/testing when doing upgrades.

It is up to the backend implementation to decide how to handle bucket/key splits. Bitcask and eleveldb include the bucket as part of the key and store all bucket/keys for a partition together. Innostore creates a separate table per bucket, so although it can restrict how many file handles it uses, performance would suffer from constantly opening and closing tables.

For listing keys, bitcask has to traverse all the keys (internally it uses a hashtable for bucket/keys), but eleveldb stores them in order and so can do a quick retrieval (similar to how we do it with 2I).

So, if you don't override bucket properties, the eleveldb backend is the best choice, but 200M buckets is many orders of magnitude more than we have tested, so your mileage may vary.
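To make the list-keys difference between a hashed and an ordered key layout concrete, here is a toy Python model. It is only a sketch of the general idea and is not how bitcask or eleveldb are implemented; the function names and sample data are made up for illustration.

```python
import bisect


def list_keys_hashed(table, bucket):
    """Unordered (hashtable-like) layout: every entry must be examined."""
    return sorted(key for (b, key) in table if b == bucket)


def list_keys_ordered(sorted_pairs, bucket):
    """Ordered layout: seek to the bucket's first entry, stop when the bucket changes."""
    i = bisect.bisect_left(sorted_pairs, (bucket, b""))
    keys = []
    while i < len(sorted_pairs) and sorted_pairs[i][0] == bucket:
        keys.append(sorted_pairs[i][1])
        i += 1
    return keys


# The bucket is part of the stored key, as (bucket, key) pairs.
table = {
    (b"u1", b"s1"): b"v",
    (b"u2", b"s1"): b"v",
    (b"u2", b"s2"): b"v",
    (b"u3", b"s1"): b"v",
}
ordered = sorted(table)  # what an ordered store would keep on disk

print(list_keys_hashed(table, b"u2"))     # scans all 4 entries
print(list_keys_ordered(ordered, b"u2"))  # touches only the u2 entries
```

Both calls return the same keys; the difference is that the scan costs grow with the total number of keys in the partition, while the ordered seek costs grow with the size of the one bucket being listed.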
BR, Jon

On Tue, Nov 15, 2011 at 10:34 AM, Alexander Sicular <[email protected]> wrote:

> Refresh my memory, does leveldb open new files for each bucket? I'm
> thinking there may be some file descriptor penalty for this many buckets.
>
> Otherwise you could set your default bucket properties to what you would
> want for these user buckets and then change properties if need be for a
> handful of other buckets you may need in your application.
>
> Cheers,
>
> Alexander Sicular
> @siculars
> http://sicuars.posterous.com
>
> On Tuesday, November 15, 2011 at 12:17 PM, Kresten Krab Thorup wrote:
>
> On Nov 15, 2011, at 5:07 PM, Izhar Ravid wrote:
>
> Assuming I wish to store user information for some 200M users, and create
> a bucket per user. Each user bucket will contain several dozen objects.
> - Will list-keys on such a user bucket be a reasonable action?
>
> When using leveldb as the backend, list-keys is isolated to a bucket. As
> far as I can see, list_keys is a "blocking operation" meaning that a vnode
> doing list_keys can not respond to other requests while it is processing.
> So; in this case ... using leveldb and ~dozens of keys per bucket it does
> sound reasonable.
>
> - Is list-keys isolated to a bucket?
> - Is it reasonable to expect Riak to hold 200M buckets?
>
> I'd say that it depends on your need to configure bucket properties.
>
> Riak stores bucket properties in "the ring", which is a riak cluster's
> distributed state management.
>
> The ring is not designed to have 200M entries as far as I know; and if you
> need to set bucket properties then the ring needs to hold those values (if
> you use the default properties it requires no state).
>
> So, ... the answer is probably no in this case; 200M buckets is not
> reasonable, because you will likely eventually want to define properties
> for these buckets.
>
> Kresten
>
> Kind regards,
> Izhar.
>
> _______________________________________________
> riak-users mailing list
> [email protected]
> http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com

--
Jon Meredith
Platform Engineering Manager
Basho Technologies, Inc.
[email protected]
