If you lost two nodes in a four node cluster you would lose _some_ set of data: exactly the data whose three vnodes all landed on those two machines. Someone at Basho probably knows exactly what percentage of data with a replication factor of three has all its replicas on only two physical machines in a four node cluster. That is precisely why the minimum production cluster size is five.
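That percentage can be estimated with a quick Monte Carlo sketch, under the simplifying assumption that vnodes are assigned to machines at random (Riak's actual claim algorithm deliberately spreads adjacent vnodes across machines, which is part of why five nodes is the recommended minimum; the function names here are illustrative, not a Riak API):

```python
import random
from itertools import combinations

def loss_fraction(ring_size=64, nodes=4, n_val=3, trials=1000):
    """Estimate the fraction of preflists (n_val consecutive vnodes) whose
    replicas all sit on two failed machines, averaged over random vnode
    placements and all choices of two failed nodes."""
    lost = total = 0
    for _ in range(trials):
        # one random assignment of ring partitions to physical nodes
        owner = [random.randrange(nodes) for _ in range(ring_size)]
        for failed in combinations(range(nodes), 2):
            for i in range(ring_size):
                # the n_val consecutive partitions holding one key's replicas
                preflist = {owner[(i + j) % ring_size] for j in range(n_val)}
                total += 1
                if preflist <= set(failed):   # every replica on a failed node
                    lost += 1
    return lost / total

# expectation is (2/4)**3 = 12.5% under this random-placement assumption
print(f"~{loss_fraction():.1%} of preflists unreadable after a two-node failure")
```

Real placement does much better than random, since the claim algorithm avoids putting adjacent partitions on the same machine whenever the node count allows it.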
If you lost two nodes in a five node cluster you should be ok.

-Alexander Sicular

@siculars

On Mar 14, 2013, at 5:39 PM, "Kevin Burton" <[email protected]> wrote:

> One more question. Then how many physical nodes can go down before I lose
> data? Again assuming that I have 4 physical nodes. In a crude sense it seems
> that it would depend on how the replicas are spread out. But if you lost the
> two physical nodes, that could possibly take out enough of the replicas to
> cause data loss. Right? Also, if you had more physical nodes, although the
> probability is smaller that you could lose the critical nodes, you could
> still lose two critical nodes and lose data. Am I understanding the
> trade-offs correctly?
>
> -----Original Message-----
> From: Alexander Sicular [mailto:[email protected]]
> Sent: Thursday, March 14, 2013 4:17 PM
> To: Kevin Burton
> Cc: 'Mark Phillips'; [email protected]
> Subject: Re: Bigger data than disk space?
>
> You have to think at the cluster level. If you have 10 GB of data and
> your replication factor is three, then your total data across the cluster
> will be
>
> 10 GB x 3 replicas = 30 GB across the cluster
>
> Now, if you have four physical machines in your cluster, each will be
> responsible for 1/4 of that data.
>
> 0.25 x 30 GB = 7.5 GB
>
> That is because the vnodes are evenly divided amongst physical machines in
> the cluster.
>
> -Alexander Sicular
>
> @siculars
>
> On Mar 14, 2013, at 5:08 PM, "Kevin Burton" <[email protected]> wrote:
>
>> Then that is not quite as bad, but still: if I have 10 GB of data and
>> supporting replication requires 30 GB of disk space, what if I only have
>> 20 GB of disk space per physical node?
>>
>> -----Original Message-----
>> From: Mark Phillips [mailto:[email protected]]
>> Sent: Thursday, March 14, 2013 4:05 PM
>> To: Kevin Burton
>> Cc: Alexander Sicular; [email protected]
>> Subject: Re: Bigger data than disk space?
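The capacity arithmetic quoted above (10 GB x 3 = 30 GB, spread over 4 nodes = 7.5 GB each) generalizes to a one-line formula; a minimal sketch in plain Python, with names of my own choosing:

```python
def per_node_gb(raw_gb, n_val=3, nodes=4):
    """Replicated data volume, spread evenly across the cluster's
    physical nodes (vnodes are divided evenly, so each node holds
    an equal share of the cluster total)."""
    return raw_gb * n_val / nodes

print(per_node_gb(10))            # 7.5 -> GB on each of 4 nodes (30 GB cluster-wide)
print(per_node_gb(10, nodes=5))   # 6.0 -> GB on each of 5 nodes
```

This also answers the 20 GB question: a node needs at least raw_gb * n_val / nodes of free space, plus headroom for handoff when a neighbor fails.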
>>
>> Kevin,
>>
>> On Thu, Mar 14, 2013 at 1:56 PM, Kevin Burton <[email protected]> wrote:
>>> So that is what I am missing. If each vnode keeps an entire copy of
>>> my data and I have 4 physical nodes, then there are 16 vnodes per
>>> physical node. That would mean I have the data replicated 16 times
>>> per physical node. 10 GB turns into 160 GB, etc. Right? So won't I run
>>> out of disk space?
>>
>> Your raw data set is replicated 3 times by default. Three different
>> vnodes of your total (by default 64) will be responsible for each
>> replica. So, 10 GB raw = 30 GB replicated.
>>
>> Mark
>>
>>> From: Alexander Sicular [mailto:[email protected]]
>>> Sent: Thursday, March 14, 2013 3:51 PM
>>> To: Kevin Burton
>>> Cc: [email protected]
>>> Subject: Re: Bigger data than disk space?
>>>
>>> Each vnode keeps _an entire copy_ of your data. There is no striping,
>>> which I think you are conflating with RAID. Default replication (also
>>> configured in etc/app.config) is set to three. In that case, three
>>> entire copies of your data are kept on three different vnodes, and if
>>> you indeed have five physical nodes in your cluster you are
>>> guaranteed to have each of those three vnodes on different physical
>>> machines.
>>>
>>> -Alexander Sicular
>>>
>>> @siculars
>>>
>>> On Mar 14, 2013, at 4:42 PM, "Kevin Burton" <[email protected]> wrote:
>>>
>>> Thank you. Let me get it straight. I have a 4 node cluster (4
>>> physical machines). If I have not made any changes to the ring size
>>> then I have 16 (64/4) vnodes per physical node. Each physical node
>>> stores the actual data (the value) of about ¼ of the data size. So
>>> when querying the data with a key, given the number of vnodes, it can
>>> be determined which physical machine the data is on.
>>> There must be enough redundancy built in so that if one or more of
>>> the physical machines go down, the remaining physical machines can
>>> reconstruct the values lost by the lost vnodes. Correct so far? Now
>>> where does replication come in? The documentation indicates that
>>> there are 3 copies of the data (default) made. How is this changed,
>>> and how can this replication of the data be taken advantage of?
>>>
>>> From: Alexander Sicular [mailto:[email protected]]
>>> Sent: Thursday, March 14, 2013 3:28 PM
>>> To: Kevin Burton
>>> Cc: [email protected]
>>> Subject: Re: Bigger data than disk space?
>>>
>>> Hi Kevin,
>>>
>>> The Riak distribution model is not based on "buckets" but rather the
>>> hash of the bucket/key combination. That hash (and associated data)
>>> is then allocated to a "vnode". A vnode, in turn, is one of n,
>>> where n is the ring_creation_size (default is 64, modified in
>>> etc/app.config). Each physical machine in a Riak cluster claims an
>>> equal share of the ring. For example, a cluster with five machines
>>> (the recommended minimum for a production cluster) and the default
>>> ring_creation_size will have 64/5 vnodes per physical machine (not
>>> sure if they round down or up, but all machines will have about the
>>> same number of vnodes). What you would do to make more data available
>>> is either add a machine to the cluster whose available disk space is
>>> equal to or greater than that of the cluster member with the least
>>> total space, or increase the space on all machines already in the
>>> cluster.
>>>
>>> tl;dr add a machine to your cluster.
>>>
>>> -Alexander Sicular
>>>
>>> @siculars
>>>
>>> On Mar 14, 2013, at 3:41 PM, Kevin Burton <[email protected]> wrote:
>>>
>>> I am relatively new to Riak, so forgive me if this has been asked
>>> before. I have a very thin understanding of a Riak cluster and
>>> understand somewhat about replication.
>>> In planning I foresee a time
>>> when the amount of data exceeds the disk space that is available to a
>>> single node. What facilities are there to essentially “split” a
>>> bucket across several servers? How is this handled?
>>>
>>> _______________________________________________
>>> riak-users mailing list
>>> [email protected]
>>> http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com
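The ring mechanics described in this thread (hash of the bucket/key combination, ring_creation_size partitions, n_val consecutive replica vnodes) can be sketched in a few lines. This is a simplification, not riak_core's actual code: real Riak hashes an Erlang-term encoding of the bucket/key pair, so the partition indices below will not match a live cluster, but the 160-bit SHA-1 ring and the "next n_val partitions" preference list are the same idea.

```python
import hashlib

RING_SIZE = 64   # ring_creation_size in etc/app.config
N_VAL = 3        # default replication factor

def partition(bucket, key):
    """Map a bucket/key pair onto one of RING_SIZE equal slices of a
    160-bit SHA-1 ring (the encoding of the pair is simplified here)."""
    digest = hashlib.sha1(f"{bucket}/{key}".encode()).digest()
    return int.from_bytes(digest, "big") * RING_SIZE >> 160

def preflist(bucket, key):
    """The N_VAL consecutive partitions (vnodes) holding the replicas."""
    first = partition(bucket, key)
    return [(first + i) % RING_SIZE for i in range(N_VAL)]

print(preflist("users", "kevin"))   # three consecutive partition indices
```

Because replicas occupy consecutive partitions, a single machine failure costs any given preflist at most one replica, provided adjacent partitions live on different machines — which is exactly what a five-node cluster guarantees with n_val=3.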
