I seem to recall reading somewhere, but can't find it now, that when n = 3, three physical nodes is the minimum and 5 nodes is the recommended configuration.
Jeremiah Peschka
Founder, Brent Ozar PLF, LLC

On Thu, Jan 5, 2012 at 12:53 PM, Tim Robinson <t...@blackstag.com> wrote:
> So with the original thread, with N=3 on 3 nodes: the developer believed
> each node was getting a copy, when in fact 2 copies went to a single node.
> So yes, there's redundancy and the "shock" value can go away :) My apologies.
>
> That said, I have no ability to assess how much data space that is
> wasting, but it seems like potentially 1/3 - correct?
>
> Another way to look at it, using the above noted case, is that I need to
> double[1] the amount of hardware needed to achieve a single amount of
> redundancy.
>
> [1] not specifically, but effectively.
>
>
> -----Original Message-----
> From: "Aphyr" <ap...@aphyr.com>
> Sent: Thursday, January 5, 2012 1:29pm
> To: "Tim Robinson" <t...@blackstag.com>
> Cc: "Runar Jordahl" <runar.jord...@gmail.com>, riak-users@lists.basho.com
> Subject: Re: Absolute consistency
>
> On 01/05/2012 12:12 PM, Tim Robinson wrote:
> > Thank you for this info. I'm still somewhat confused.
> >
> > Why would anyone ever want 2 copies on one physical PC? Correct me if
> > I am wrong, but part of the sales pitch for Riak is that the cost of
> > hardware is lessened by distributing your data across a cluster of
> > less expensive machines, as opposed to having it all reside on an
> > enormous server with very little redundancy.
> >
> > The 2 copies of data on one physical PC provide no redundancy, but
> > increase hardware costs quite a bit.
> >
> > Right?
>
> Because in the case you expressed shock over, the pigeonhole principle
> makes it *impossible* to store three copies of information in two places
> without overlap. The alternative is lying to you about the replica
> semantics. That would be bad.
>
> In the second case I described, it's an artifact of a simplistic but
> correct vnode sharding algorithm, which uses the partition ID modulo node
> count to assign the node for each partition.
> When the number of partitions is not an even multiple of the node count,
> the last and the first (or second, etc., you do the math) partitions can
> wind up on the same node. Unless you choose sizes that divide evenly, the
> proportion of data that overlaps on one node is on the order of 1/64 to
> 1/1024 of the keyspace. This is not a significant operational cost.
>
> This *does* reduce fault tolerance: losing those two "special" nodes
> (but not two arbitrary nodes) can destroy those special keys even though
> they were stored with N=3. As the probability of losing two *particular*
> nodes simultaneously compares favorably with the probability of losing
> *any three* nodes simultaneously, I haven't been that concerned over it.
> It takes roughly six hours for me to allocate a new machine and restore
> the destroyed node's backup to it. Anecdotally, I think you're more
> likely to see *cluster* failure than *dual node* failure in a small
> distributed system, but that's a long story.
>
> The Riak team has been aware of this since at least June 2010
> (https://issues.basho.com/show_bug.cgi?id=228), and there are
> operational workarounds involving target_n_val. As I understand it,
> solving the key distribution problem is... nontrivial.
>
> --Kyle
>
> > Tim Robinson
>
> _______________________________________________
> riak-users mailing list
> riak-users@lists.basho.com
> http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com
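To see the wrap-around overlap Kyle describes, here is a minimal sketch. It assumes a deliberately simplified model (partition assigned by `partition_id % node_count`, replicas on N consecutive partitions); this is an illustration of the modulo artifact, not Riak's actual ring-claim code:

```python
# Simplified model of a consistent-hash ring (assumption: NOT Riak's real
# claim algorithm). Partitions go to nodes round-robin; a key's N replicas
# land on N consecutive partitions, wrapping around the end of the ring.

def preference_list(start_partition, ring_size, node_count, n_val):
    """Physical nodes holding the n_val replicas for a key that hashes
    to start_partition."""
    return [((start_partition + i) % ring_size) % node_count
            for i in range(n_val)]

ring_size, node_count, n_val = 64, 3, 3

# Partitions whose preference list repeats a physical node: these are the
# "special" keys with fewer than n_val distinct replicas.
overlapping = [p for p in range(ring_size)
               if len(set(preference_list(p, ring_size, node_count, n_val)))
               < n_val]

print(overlapping)                                # → [62, 63]
print(f"{len(overlapping)}/{ring_size} partitions overlap on one node")
```

With 64 partitions on 3 nodes, only the two partitions at the wrap-around point (62 and 63) get two replicas on the same physical node, which matches the "on the order of 1/64 to 1/1024 of the keyspace" estimate above.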
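The fault-tolerance trade-off can be put in rough numbers. This is a back-of-envelope sketch assuming independent node failures with an illustrative per-node failure probability `p` over a repair window; the values are invented for illustration, not measurements:

```python
# Back-of-envelope comparison (assumption: independent failures, p and the
# 5-node cluster size are illustrative values, not from the thread).
from math import comb

p = 0.01      # chance a given node is down during one repair window (assumed)
nodes = 5     # small cluster size (assumed)

# Losing the two *particular* nodes that share the overlapping replicas:
two_particular = p ** 2

# Losing *any three or more* nodes, which destroys an N=3 key regardless
# of where its replicas were placed:
any_three = sum(comb(nodes, k) * p**k * (1 - p)**(nodes - k)
                for k in range(3, nodes + 1))

print(f"two particular nodes down: {two_particular:.2e}")
print(f"any three nodes down:      {any_three:.2e}")
```

Both probabilities come out tiny under these assumptions, which is the point of the argument: the extra exposure from the two "special" nodes is comparable in magnitude to the baseline risk any N=3 cluster already carries.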