---
Jeremiah Peschka - Founder, Brent Ozar Unlimited
MCITP: SQL Server 2008, MVP
Cloudera Certified Developer for Apache Hadoop
On Thu, Mar 14, 2013 at 2:39 PM, Kevin Burton <[email protected]> wrote:

> One more question. Then how many physical nodes can go down before I lose
> data? Again assuming that I have 4 physical nodes.

Assuming 4 nodes and n = 3, you could have two nodes fail (maybe) before
there's a problem. Ideally you'll have a minimum of 5 nodes in production.

> In a crude sense it seems that it would depend on how the replicas are
> spread out. But if you lost the two physical nodes, that could possibly
> take out enough of the replicas to cause data loss. Right? Also, if you
> had more physical nodes, although the probability is smaller that you
> could lose the critical nodes, you could still lose two critical nodes
> and lose data. Am I understanding the trade-offs correctly?

Yes. I believe there are changes present in Riak 1.3 that make it less
likely that multiple vnodes storing the same bucket/key combination are on
the same physical node. If you need more paranoia, add more nodes.

> -----Original Message-----
> From: Alexander Sicular [mailto:[email protected]]
> Sent: Thursday, March 14, 2013 4:17 PM
> To: Kevin Burton
> Cc: 'Mark Phillips'; [email protected]
> Subject: Re: Bigger data than disk space?
>
> You have to think at the cluster level. If you have 10 GB of data and
> your replication factor is three, then your total data across the
> cluster will be:
>
> 10 GB x 3 replicas = 30 GB across the cluster
>
> Now, if you have four physical machines in your cluster, each will be
> responsible for 1/4 of that data:
>
> 0.25 x 30 GB = 7.5 GB
>
> That is because the vnodes are evenly divided amongst the physical
> machines in the cluster.
>
> -Alexander Sicular
> @siculars
>
> On Mar 14, 2013, at 5:08 PM, "Kevin Burton" <[email protected]> wrote:
>
> > Then that is not quite as bad, but still: if I have 10 GB of data and
> > supporting replication requires 30 GB of disk space, what if I only
> > have 20 GB of disk space per physical node?
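[Editor's note: Alexander's arithmetic generalizes to any raw size, replication factor, and node count. A minimal sketch, assuming the even vnode spread he describes; the function and parameter names are my own illustration, not from the thread:]

```python
def cluster_storage(raw_gb, n_val=3, nodes=4):
    # Every object is written n_val times, and vnodes (hence data)
    # are spread roughly evenly across the physical machines.
    total = raw_gb * n_val
    per_node = total / nodes
    return total, per_node

print(cluster_storage(10))           # (30, 7.5): 30 GB cluster-wide, 7.5 GB per node
print(cluster_storage(10, nodes=5))  # (30, 6.0): adding a node shrinks the per-node share
```

With n_val fixed at 3, the only lever on the per-node share is the node count, which is the same point as the "add a machine to your cluster" advice later in the thread.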
> > -----Original Message-----
> > From: Mark Phillips [mailto:[email protected]]
> > Sent: Thursday, March 14, 2013 4:05 PM
> > To: Kevin Burton
> > Cc: Alexander Sicular; [email protected]
> > Subject: Re: Bigger data than disk space?
> >
> > Kevin,
> >
> > On Thu, Mar 14, 2013 at 1:56 PM, Kevin Burton <[email protected]> wrote:
> >> So that is what I am missing. If each vnode keeps an entire copy of
> >> my data and I have 4 physical nodes, then there are 16 vnodes per
> >> physical node. That would mean I have the data replicated 16 times
> >> per physical node. 10 GB turns into 160 GB, etc. Right? So won't I
> >> run out of disk space?
> >
> > Your raw data set is replicated 3 times by default. Three different
> > vnodes of your total (by default 64) will be responsible for each
> > replica. So, 10 GB raw = 30 GB replicated.
> >
> > Mark
> >
> >> From: Alexander Sicular [mailto:[email protected]]
> >> Sent: Thursday, March 14, 2013 3:51 PM
> >> To: Kevin Burton
> >> Cc: [email protected]
> >> Subject: Re: Bigger data than disk space?
> >>
> >> Each vnode keeps _an entire copy_ of your data. There is no striping,
> >> which I think you are conflating with RAID. Default replication (also
> >> configured in etc/app.config) is set to three. In which case, three
> >> entire copies of your data are kept on three different vnodes, and if
> >> you indeed have five physical nodes in your cluster you are
> >> guaranteed to have each of those three vnodes on different physical
> >> machines.
> >>
> >> -Alexander Sicular
> >> @siculars
> >>
> >> On Mar 14, 2013, at 4:42 PM, "Kevin Burton" <[email protected]> wrote:
> >>
> >> Thank you. Let me get it straight. I have a 4 node cluster (4
> >> physical machines). If I have not made any changes to the ring size
> >> then I have 16 (64/4) vnodes. Each physical node stores the actual
> >> data (the value) of about ¼ of the data size.
> >> So when querying the data with a key, given the number of vnodes, it
> >> can be determined which physical machine the data is on. There must
> >> be enough redundancy built in so that if one or more of the physical
> >> machines go down, the remaining physical machines can reconstruct the
> >> values lost by the lost vnodes. Correct so far? Now where does
> >> replication come in? The documentation indicates that there are 3
> >> copies of the data (default) made. How is this changed, and how can
> >> this replication of the data be taken advantage of?
> >>
> >> From: Alexander Sicular [mailto:[email protected]]
> >> Sent: Thursday, March 14, 2013 3:28 PM
> >> To: Kevin Burton
> >> Cc: [email protected]
> >> Subject: Re: Bigger data than disk space?
> >>
> >> Hi Kevin,
> >>
> >> The Riak distribution model is not based on "buckets" but rather the
> >> hash of the bucket/key combination. That hash (and its associated
> >> data) is then allocated to a "vnode". A vnode, in turn, is one of n,
> >> where n is the ring_creation_size (default is 64; modify in
> >> etc/app.config). Each physical machine in a Riak cluster claims an
> >> equal share of the ring. For example, a cluster with five machines
> >> (the recommended minimum for a production cluster) and the default
> >> ring_creation_size will have 64/5 vnodes per physical machine (not
> >> sure if they round down or up, but all machines will have about the
> >> same number of vnodes). What you would do to make more data available
> >> is either add a machine to the cluster whose available disk space is
> >> equal to or greater than that of the cluster member with the least
> >> amount of total space, or increase the space on all machines already
> >> in the cluster.
> >>
> >> tl;dr add a machine to your cluster.
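[Editor's note: the "hash of the bucket/key combination onto the ring" model can be sketched in a few lines. This is an illustration only: Riak actually hashes with SHA-1 onto a 2^160 keyspace and gives each partition an equal slice of it; taking the hash modulo the partition count, and the name `partition_for`, are my simplifications.]

```python
import hashlib

RING_SIZE = 64  # default ring_creation_size

def partition_for(bucket: bytes, key: bytes) -> int:
    # Hash the bucket/key pair; the hash, not the bucket, decides
    # which partition (vnode) owns the data.
    digest = hashlib.sha1(bucket + b"/" + key).digest()
    return int.from_bytes(digest, "big") % RING_SIZE

p = partition_for(b"users", b"kevin")
assert 0 <= p < RING_SIZE
# The same bucket/key always lands on the same partition, which is how
# a query can be routed to the right physical machine:
assert p == partition_for(b"users", b"kevin")
```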
> >> -Alexander Sicular
> >> @siculars
> >>
> >> On Mar 14, 2013, at 3:41 PM, Kevin Burton <[email protected]> wrote:
> >>
> >> I am relatively new to Riak, so forgive me if this has been asked
> >> before. I have a very thin understanding of a Riak cluster and
> >> understand somewhat about replication. In planning I foresee a time
> >> when the amount of data exceeds the disk space that is available to a
> >> single node. What facilities are there to essentially “split” a
> >> bucket across several servers? How is this handled?

_______________________________________________
riak-users mailing list
[email protected]
http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com
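[Editor's note: Kevin's "16 vnodes means 16 copies" worry and Mark's correction reduce to two lines of arithmetic. An illustrative sketch using the defaults cited in the thread; the variable names are mine:]

```python
RING_SIZE = 64  # default ring_creation_size
NODES = 4
N_VAL = 3       # default replication factor
RAW_GB = 10

vnodes_per_node = RING_SIZE // NODES  # 16 vnodes hosted on each machine
# Replication happens per key, not per vnode: each key is stored on
# N_VAL vnodes in total, so disk usage scales with N_VAL, not with
# how many vnodes a machine happens to host.
replicated_gb = RAW_GB * N_VAL        # 30 GB, not 10 GB x 16 = 160 GB
print(vnodes_per_node, replicated_gb)
```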
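[Editor's note: Jeremiah's "two nodes fail (maybe)" answer can be illustrated with an idealized claim function. This is a sketch under strong assumptions: `owner` and `preflist` are my names, and the round-robin assignment below is a simplification of Riak's real claim algorithm, which is exactly why two failures are only "maybe" safe in practice.]

```python
RING_SIZE = 64
NODES = 4
N_VAL = 3

# Idealized claim: partition i is owned by node i mod NODES.
owner = [i % NODES for i in range(RING_SIZE)]

def preflist(partition):
    # The N_VAL replicas of a key live on N_VAL consecutive partitions
    # starting at the one its hash falls into.
    return [owner[(partition + i) % RING_SIZE] for i in range(N_VAL)]

# Under this idealized claim, every preference list spans 3 distinct
# physical nodes, so no two-node failure can destroy all replicas of a
# key. A real claim that places two replicas on one machine would not
# pass this check, hence the advice to run 5 or more nodes.
assert all(len(set(preflist(p))) == N_VAL for p in range(RING_SIZE))
```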
