So I get that Riak (with the bitcask backend, at least) is not bucket aware. When you pass a bucket as an input to an m/r job and Riak sifts through all the keys, how does it isolate the keys that belong to that bucket? Are keys stored internally as /bucket/key, with a string comparison on split(key, '/')? Or is there something else going on?
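
To make the question concrete, here is a rough Python sketch of what I imagine "not bucket aware" means. It is not Riak's or bitcask's actual code: the class names are made up, and storing each entry as a (bucket, key) pair rather than a "/bucket/key" string is my assumption. The first store keeps one flat map, so listing a bucket has to look at every pair; the second keeps a hypothetical keys-by-bucket index, along the lines Greg suggested.

```python
from collections import defaultdict


class FlatKeydir:
    """Toy store with no notion of buckets: one flat map of (bucket, key) -> value."""

    def __init__(self):
        self._entries = {}

    def put(self, bucket, key, value):
        self._entries[(bucket, key)] = value

    def list_keys(self, bucket):
        # Not bucket aware: every stored (bucket, key) pair is examined,
        # so the cost grows with the total number of keys in the store.
        return sorted(k for (b, k) in self._entries if b == bucket)


class BucketIndexedKeydir(FlatKeydir):
    """Same store plus a hypothetical keys-by-bucket index."""

    def __init__(self):
        super().__init__()
        self._by_bucket = defaultdict(set)

    def put(self, bucket, key, value):
        super().put(bucket, key, value)
        self._by_bucket[bucket].add(key)

    def list_keys(self, bucket):
        # Bucket aware: only the keys of the requested bucket are touched.
        return sorted(self._by_bucket.get(bucket, ()))


flat = FlatKeydir()
indexed = BucketIndexedKeydir()
for store in (flat, indexed):
    store.put("videos", "v1", 10)
    store.put("users", "u1", "greg")
    store.put("videos", "v2", 7)

print(flat.list_keys("videos"))     # ['v1', 'v2']
print(indexed.list_keys("videos"))  # ['v1', 'v2']
```

Both return the same answer; the difference is only in how much of the store has to be walked to get it.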
Thank you.

On 2010-11-15, Kevin Smith <[email protected]> wrote:
> We are giving some thought on how to do that. The main issues wrt to
> bitcask's key listing performance is that bitcask is not bucket aware and
> lacks the notion of secondary indices. Not being bucket aware means bitcask
> has to examine all bucket/key pairs to find the ones related to a given
> bucket. This isn't to say we won't address the problem but merely to point
> out there's some engineering work required to solve the problem correctly.
>
> innostore is moderately bucket-aware right now so I've forked it
> (http://github.com/kevsmith/innostore) and added bucket-aware key listing.
> Based on some very basic testing I'm seeing 2.5x speed up in overall key
> listing performance compared to the official version. I'm hoping the patch,
> or a modified form of it, will make the next release. If you can handle inno
> being a bit slower than bitcask and slightly more difficult to set up and
> tune then this might be an option for you.
>
> I've done some basic vetting of the code but I want to emphasize this is a
> prototype only and hasn't received anything even close to the normal amount
> of testing we put into a release. Please keep this in mind if you decide to
> use my forked repo.
>
> --Kevin
> On Nov 15, 2010, at 11:57 AM, Greg Steffensen wrote:
>
>> Along these lines, are there any ideas floating around about how to speed
>> up the listing of keys in a bucket? For the bitcask backend, it seems
>> like an index of keys-by-bucket ought to be the kind of thing that could
>> be stored in the hints files to speed this up without affecting
>> performance for live reads and writes.
>>
>> Greg
>>
>> On Mon, Nov 15, 2010 at 11:46 AM, Sean Cribbs <[email protected]> wrote:
>> This is possible with Riak's MapReduce but you will likely have increasing
>> difficulty as your dataset grows, because of the impact of needing to list
>> keys in a bucket and then eliminate data points you aren't interested in.
>> In the longer term, there will be improvements to MapReduce such that if
>> your keys are meaningful, you will be able to filter them more easily
>> (without examining the data first). You might find Kevin Smith's overview
>> enlightening: http://www.slideshare.net/hemulen/riak-mapred-preso
>>
>> Sean Cribbs <[email protected]>
>> Developer Advocate
>> Basho Technologies, Inc.
>> http://basho.com/
>>
>> On Nov 15, 2010, at 11:34 AM, Prometheus WillSurvive wrote:
>>
>>> Hi,
>>>
>>> We have a huge database (around 4 billion records, 30 TB) storing
>>> video watch information, i.e. view count, comments, favorites etc. I want
>>> to produce a daily report of view counts for all videos. That means I need
>>> to look at 2 days, today and yesterday, and subtract yesterday's view count
>>> from today's view count to find the daily impressions. Our Fat DB team
>>> does this with a few complex queries. I would like to ask whether this is
>>> possible the Riak map-reduce way. I want to make a demonstration to
>>> the team to show this.
>>>
>>> This is the scenario. We have similar data models for other things. This
>>> could be a start.
>>>
>>> We have a farm of 30 x HP DL380 with 32 GB RAM each to test this scenario.
>>>
>>> Any member experienced with Riak map-reduce could share some ideas on
>>> this, I guess.
>>>
>>> Regards
>>>
>>> Prometheus
>>> _______________________________________________
>>> riak-users mailing list
>>> [email protected]
>>> http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com
