On Tue, Dec 21, 2010 at 7:25 PM, Jeremiah Peschka < [email protected]> wrote:
> I'm interested to see how it turns out.
>
> One question: Shouldn't you be able to determine equality from the vclocks?
> Timestamps could run into issues if the clocks of different machines get out
> of sync. They also don't allow for much granularity - there's always a
> chance of collisions.
>
> How would this be any different from doing the same thing with bitcask or
> InnoDB? Is it just your use of mmap to save memory space as opposed to
> bitcask reading keys into RAM?
>

Yes. In the other project Anthony referenced I only stored the hash of the
key (64 bits) and used mmap to let the filesystem handle the caching of pages
while highly optimizing the on-disk format. Also, for that project, the data
constraints were a little different, as updates only affected a portion of the
stored value, which I edited in place by flipping a few bits on the mmap'ed
region and then msync'ed occasionally.

Our datasets tend to be large, accessed randomly, and have tight restrictions
on access time. Using riak_core to distribute the data among multiple machines
and do ring management works well, while writing the core data access/storage
in C lets us optimize the on-disk format and work close to the hardware.

(If I get time I'd like to rewrite the first project [a time-series database]
as a backend for Riak by doing some similar API overloading.)

Joel

> Jeremiah Peschka
>
>
> On Tue, Dec 21, 2010 at 8:02 PM, Anthony Molinaro <
> [email protected]> wrote:
>
>> Hi,
>>
>> Just wondering if anyone had any opinions on this idea?
>>
>> Thanks,
>>
>> -Anthony
>>
>> On Fri, Dec 17, 2010 at 03:20:58PM -0800, Anthony Molinaro wrote:
>> > Hi,
>> >
>> > I have a situation where I need to store sets of integers as efficiently
>> > as possible.
>> > In order to do so I was going to implement a custom backend
>> > and alter the semantics of get/put/delete, such that
>> >
>> >   put (K, [{i,V}]) - insert one or more V into list at K
>> >   put (K, [{d,V}]) - delete one or more V from list at K
>> >   get (K)          - returns list at K
>> >   delete (K)       - delete entire list
>> >
>> > I'll probably keep the lists in sorted order via insertion sort so that
>> > a quorum read can easily determine equality.
>> >
>> > My backend would probably be a linked-in driver which mmaps a file so
>> > I can keep this stuff as tightly packed in memory as possible.
>> >
>> > We did something similar using just riak_core for a different system,
>> > but for that system we had a custom thrift frontend and didn't adhere
>> > to the riak_kv api.
>> >
>> > So I was wondering, what sort of issues might I see with this sort of
>> > use of riak? Is there another way I'm missing which would allow me
>> > to efficiently do this?
>> >
>> > I expect in the end to have approximately 500 million lists of integers
>> > with most in the ~10 integer range with a few up to the ~200 integer
>> > range. Also, I'll probably have a timestamp associated with each
>> > integer and want to use map-reduce to expire entries which are too old.
>> >
>> > Anyway, I realize this might be an odd usage, and I can always fall back
>> > to what we did before, which was a thrift server on top of riak_core with
>> > a custom datastore, but I figured if I could use the same API as riak_kv
>> > with a custom backend it might save me a little time.
>> >
>> > -Anthony
>> >
>> > --
>> > ------------------------------------------------------------------------
>> > Anthony Molinaro <[email protected]>
>> >
>>
>> --
>> ------------------------------------------------------------------------
>> Anthony Molinaro <[email protected]>
>>
>> _______________________________________________
>> riak-users mailing list
>> [email protected]
>> http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com
>>
>
>
> _______________________________________________
> riak-users mailing list
> [email protected]
> http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com
>
>
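
For illustration only (this is not code from the thread): a minimal Python sketch of the altered get/put/delete semantics Anthony proposes in the quoted message. The class name IntSetStore and the helper apply_ops are hypothetical, the store is a plain in-memory dict rather than a Riak backend, and the sketch assumes set semantics, so inserting a value that is already present is a no-op. Keeping each list sorted means two replicas that received the same operations in a different order end up with identical lists, which is the equality property mentioned for quorum reads.

```python
from bisect import bisect_left


def apply_ops(values, ops):
    """Apply ("i", v) / ("d", v) operations to a sorted list of unique ints, in place."""
    for tag, v in ops:
        pos = bisect_left(values, v)          # binary search for the slot of v
        present = pos < len(values) and values[pos] == v
        if tag == "i":
            if not present:                   # assumption: set semantics, no duplicates
                values.insert(pos, v)         # sorted insert keeps the list ordered
        elif tag == "d":
            if present:                       # deleting a missing value is a no-op
                del values[pos]
        else:
            raise ValueError(f"unknown op tag: {tag!r}")
    return values


class IntSetStore:
    """Hypothetical in-memory stand-in for the custom backend (not riak_kv's API)."""

    def __init__(self):
        self._lists = {}

    def put(self, key, ops):
        # put(K, [("i", V), ...]) inserts; put(K, [("d", V), ...]) deletes.
        apply_ops(self._lists.setdefault(key, []), ops)

    def get(self, key):
        # Return a copy so callers cannot break the sorted invariant.
        return list(self._lists.get(key, []))

    def delete(self, key):
        # Drop the entire list stored at key, if any.
        self._lists.pop(key, None)
```

With this shape, comparing two replicas' answers for a key is a plain list comparison, independent of the order in which the inserts arrived.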
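
Also for illustration only: a small Python sketch of the in-place update pattern Joel describes, where a few bits are flipped directly on an mmap'ed region and the change is synced to disk afterwards. Joel's project was written in C and its on-disk format is not described in the thread, so the file name, the 4096-byte size, and the helper names below are all assumptions. In Python, mmap.flush() is the call that corresponds to msync on Unix.

```python
import mmap
import os
import tempfile

SIZE = 4096  # assumed region size for the demo, not taken from the thread


def flip_bit(mm, bit_index):
    """Toggle a single bit of the mapped region in place."""
    byte, bit = divmod(bit_index, 8)
    mm[byte] ^= 1 << bit


def demo(bit_indexes):
    """Create a zero-filled file, flip the given bits through mmap, return the file bytes."""
    with tempfile.TemporaryDirectory() as d:
        path = os.path.join(d, "values.bin")  # hypothetical data file
        with open(path, "wb") as f:
            f.write(bytes(SIZE))
        # Map the file read/write; updates go to the page cache, not through write().
        with open(path, "r+b") as f, mmap.mmap(f.fileno(), SIZE) as mm:
            for i in bit_indexes:
                flip_bit(mm, i)
            mm.flush()  # push dirty pages to the file (msync on Unix)
        # Re-read through the normal file API to confirm the change was persisted.
        with open(path, "rb") as f:
            return f.read()
```

Only the touched pages are dirtied, which is why this pattern suits updates that change a small part of a stored value.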
