On Wed, Jan 11, 2012 at 9:14 PM, Joseph Blomstedt <j...@basho.com> wrote:
>
> Some thoughts to ponder:
>
> 1. Do you allow multiple clients to write to Riak at the same time?
> With concurrent writers, "atomic" can mean multiple things. Do you
> want linearizability? Do you want one writer to fail?

In our particular case we have redundant feeds with upstream processing that will provide a unique key per item. So we expect two concurrent writers and want one to fail, the other to succeed and be indexed. There's a bit of messiness involved because some (few) of the items will be updates to existing data.

> Is optimistic
> concurrency control or MVCC your solution? You could route everything
> through a single writer, but then you introduce a
> single-point-of-failure (SPOF). Adding an SPOF in-front of a highly
> available system is non-ideal. Or, perhaps a single writer with leader
> election / failover?

The point of the design is to avoid any SPOF. The upstream data comes from multiple sources, so the redundant feeds are expected to cover individual source failures to the extent possible, letting the DB drop the duplicates but add its own internal redundancy for storage.

> 2. What about write failure? In Riak, a write failure does not mean
> the value won't later show-up in a read. If you issue a PW=3 write, it
> may fail because it succeeded to write to 1 replica, but not the other

We expect one to fail. If it fails for any reason other than the key already existing, the writer client can queue and retry within limits (it's a feed, not interactive).

> In general, distributed consistency is non-trivial. Even master/slave
> systems have choices to make. Synchronous vs asynchronous replication.
> ElasticSearch is synchronous (at least to secondary RAM), while
> MongoDB is asynchronous. If you have multiple slaves, what consistency
> guarantees are there between all of them? If the master crashes during
> a write that was replicated to some but not all slaves, is it possible
> to get different values on a read if different subset of slaves crash
> as well? For the strongest replication guarantees, you end up with
> protocols with higher latency and lower availability guarantees. It's
> always a tradeoff game.

I don't expect to trust any of these schemes 100%.
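
As a rough illustration of the writer behaviour described above (two redundant writers per key, the first create wins, the duplicate is dropped, and any other failure is queued and retried within limits), here is a minimal Python sketch. It assumes the store offers an atomic create-if-absent operation, which is exactly the guarantee under discussion in this thread; the sketch does not show how Riak would provide it. `InMemoryStore`, `KeyExists`, and `FeedWriter` are hypothetical names for illustration only, not part of any Riak client, and updates to existing items are not covered.

```python
import threading
from collections import deque


class KeyExists(Exception):
    """Raised when a create is attempted on a key that is already stored."""


class InMemoryStore:
    """Hypothetical stand-in for the database; NOT a Riak client API."""

    def __init__(self):
        self._data = {}
        self._lock = threading.Lock()

    def create(self, key, value):
        # Assumed atomic put-if-absent: only the first writer of a key succeeds.
        with self._lock:
            if key in self._data:
                raise KeyExists(key)
            self._data[key] = value

    def get(self, key):
        with self._lock:
            return self._data.get(key)


class FeedWriter:
    """One of the redundant feed writers (hypothetical)."""

    def __init__(self, store, max_attempts=3):
        self.store = store
        self.max_attempts = max_attempts
        self.retry_queue = deque()  # entries are (key, value, attempts so far)
        self.dropped = []           # keys given up on after max_attempts

    def write(self, key, value, attempts=0):
        """Return 'stored', 'duplicate', 'queued', or 'dropped'."""
        try:
            self.store.create(key, value)
            return "stored"
        except KeyExists:
            # The other feed got there first; this is the expected failure.
            return "duplicate"
        except Exception:
            # Any other failure: retry later, but only within limits.
            attempts += 1
            if attempts >= self.max_attempts:
                self.dropped.append(key)
                return "dropped"
            self.retry_queue.append((key, value, attempts))
            return "queued"

    def drain_retries(self):
        # Retry each currently queued item once; failures are re-queued.
        for _ in range(len(self.retry_queue)):
            key, value, attempts = self.retry_queue.popleft()
            self.write(key, value, attempts)
```

Because the feed is not interactive, `drain_retries()` could simply be called on a timer; a retried item that turns out to have been stored after all comes back as a harmless "duplicate".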
We will have another instance of the cluster in another location and a tool that compares recently created keys between them (one of the reasons for wanting efficient range queries). We are still kicking around the relative reliability and other tradeoffs of trying to inject all 4 feeds into each DB vs. the two local copies vs. crisscrossed. Crisscrossed will probably win.

> As Jon mentioned, stronger consistency is a research area for 2012.
> While CAP dictates that you can't have C/A/P at once, there's no
> reason you can't have a product that provides both AP requests and CP
> requests. Perhaps there will be more to discuss on that point later on
> this year.

Yes, even understanding that you can't guarantee both at once, you could let the client decide which it wants. Some things can wait.

-- 
  Les Mikesell
  lesmikes...@gmail.com