---
Jeremiah Peschka - Founder, Brent Ozar Unlimited
MCITP: SQL Server 2008, MVP
Cloudera Certified Developer for Apache Hadoop


On Thu, Mar 14, 2013 at 2:39 PM, Kevin Burton <[email protected]> wrote:

> One more question. Then how many physical nodes can go down before I lose
> data? Again assuming that I have 4 physical nodes.


Assuming 4 nodes and n = 3, you could have two nodes fail (maybe) before
there's a problem. Ideally you'll have a minimum of 5 nodes in production.
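To make that "maybe" concrete, here is a small sketch. This is not Riak's actual claim algorithm: the node names and the round-robin placement of replicas on consecutive partitions are assumptions made for illustration. It checks every possible two-node failure against an idealized placement:

```python
from itertools import combinations

# Hypothetical placement: with ring size 64 and 4 nodes, assume partitions
# are claimed round-robin, so the n_val replicas of a key land on 3
# consecutive partitions, i.e. 3 distinct physical nodes.
NODES = ["node1", "node2", "node3", "node4"]
N_VAL = 3  # replication factor

def replica_nodes(partition_index):
    """Nodes holding the N_VAL replicas for a key hashed to this partition."""
    return {NODES[(partition_index + i) % len(NODES)] for i in range(N_VAL)}

# Data is lost only if ALL replica-holding nodes are in the failed set.
for failed in combinations(NODES, 2):
    lost = any(replica_nodes(p) <= set(failed) for p in range(64))
    print(failed, "data loss" if lost else "safe")
```

Under this idealized round-robin claim, every key's three replicas sit on three distinct machines, so no two-node failure can cover them all. The "maybe" above is because Riak's real claim algorithm does not guarantee such perfect spacing, so two replicas can end up on the same physical node.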


> In a crude sense it seems that it would depend on how the replicas are
> spread out. But if you lost two physical nodes, that could take out
> enough of the replicas to cause data loss. Right? Also, with more
> physical nodes the probability of losing the critical nodes is smaller,
> but you could still lose two critical nodes and lose data. Am I
> understanding the trade-offs correctly?
>

Yes. I believe Riak 1.3 includes changes that make it less likely that
multiple vnodes storing the same bucket/key combination end up on the
same physical node. If you need more paranoia, add more nodes.


>
> -----Original Message-----
> From: Alexander Sicular [mailto:[email protected]]
> Sent: Thursday, March 14, 2013 4:17 PM
> To: Kevin Burton
> Cc: 'Mark Phillips'; [email protected]
> Subject: Re: Bigger data than disk space?
>
> You have to think at the cluster level. If you have 10 GB of data and if
> your replication factor is three then your total data across the cluster
> will be
>
> 10 GB x 3 replicas = 30 GB across the cluster
>
> Now, if you have four physical machines in your cluster, each will be
> responsible for 1/4 of that data.
>
> 0.25 x 30 GB = 7.5 GB
>
> That is because the vnodes are evenly divided amongst physical machines in
> the cluster.
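Alexander's arithmetic, as a quick sanity check in Python (the numbers are the ones from this thread):

```python
raw_gb = 10         # logical data set
n_val = 3           # Riak's default replication factor
num_nodes = 4       # physical machines in the cluster

cluster_gb = raw_gb * n_val           # total stored cluster-wide
per_node_gb = cluster_gb / num_nodes  # even share per physical machine

print(cluster_gb, per_node_gb)  # 30 7.5
```

So each of the four machines needs roughly 7.5 GB of disk for this data set, plus headroom for handoff when a node is down.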
>
> -Alexander Sicular
>
> @siculars
>
> On Mar 14, 2013, at 5:08 PM, "Kevin Burton" <[email protected]>
> wrote:
>
> > Then that is not quite as bad. But still, if I have 10 GB of data and
> > supporting replication requires 30 GB of disk space, what if I only
> > have 20 GB of disk space per physical node?
> >
> > -----Original Message-----
> > From: Mark Phillips [mailto:[email protected]]
> > Sent: Thursday, March 14, 2013 4:05 PM
> > To: Kevin Burton
> > Cc: Alexander Sicular; [email protected]
> > Subject: Re: Bigger data than disk space?
> >
> > Kevin,
> >
> > On Thu, Mar 14, 2013 at 1:56 PM, Kevin Burton
> > <[email protected]>
> > wrote:
> >> So that is what I am missing. If each vnode keeps an entire copy of
> >> my data and I have 4 physical nodes, then there are 16 vnodes per
> >> physical node. That would mean I have the data replicated 16 times
> >> per physical node: 10 GB turns into 160 GB, etc. Right? So won’t I
> >> run out of disk space?
> >>
> >
> > Your raw data set is replicated 3 times by default. Three different
> > vnodes out of your total (64 by default) will each be responsible for
> > one replica. So, 10 GB raw = 30 GB replicated.
> >
> > Mark
> >
> >>
> >>
> >> From: Alexander Sicular [mailto:[email protected]]
> >> Sent: Thursday, March 14, 2013 3:51 PM
> >>
> >>
> >> To: Kevin Burton
> >> Cc: [email protected]
> >> Subject: Re: Bigger data than disk space?
> >>
> >>
> >>
> >> Each vnode keeps _an entire copy_ of your data. There is no striping,
> >> which I think you are conflating with RAID. Default replication (also
> >> configured in etc/app.config) is set to three. In which case, three
> >> entire copies of your data are kept on three different vnodes, and if
> >> you indeed have five physical nodes in your cluster you are
> >> guaranteed to have each of those three vnodes on different physical
> >> machines.
> >>
> >>
> >> -Alexander Sicular
> >>
> >>
> >>
> >> @siculars
> >>
> >>
> >>
> >> On Mar 14, 2013, at 4:42 PM, "Kevin Burton"
> >> <[email protected]>
> >> wrote:
> >>
> >>
> >>
> >> Thank you. Let me get it straight. I have a 4 node cluster (4
> >> physical machines). If I have not made any changes to the ring size,
> >> then I have 16 (64/4) vnodes per physical node. Each physical node
> >> stores the actual data (the value) of about ¼ of the total data size.
> >> So when querying the data with a key, given the number of vnodes, it
> >> can be determined which physical machine the data is on. There must
> >> be enough redundancy built in so that if one or more of the physical
> >> machines go down, the remaining physical machines can reconstruct the
> >> values lost by the lost vnodes. Correct so far? Now where does
> >> replication come in? The documentation indicates that there are 3
> >> copies of the data (default) made. How is this changed, and how can
> >> this replication of the data be taken advantage of?
> >>
> >>
> >>
> >> From: Alexander Sicular [mailto:[email protected]]
> >> Sent: Thursday, March 14, 2013 3:28 PM
> >> To: Kevin Burton
> >> Cc: [email protected]
> >> Subject: Re: Bigger data than disk space?
> >>
> >>
> >>
> >> Hi Kevin,
> >>
> >>
> >>
> >> The Riak distribution model is not based on "buckets" but rather the
> >> hash of the bucket/key combination. That hash (and associated data)
> >> is then allocated to a "vnode". A vnode, in turn, is one of n, where
> >> n is the ring_creation_size (default is 64, modify in etc/app.config).
> >> Each physical machine in a Riak cluster claims an equal share of the
> >> ring. For example, a cluster with five machines (the recommended
> >> minimum for a production cluster) and the default ring_creation_size
> >> will have 64/5 ≈ 13 vnodes per physical machine (not sure if they
> >> round down or up, but all machines will have about the same number
> >> of vnodes). To make more data available, either add a machine to the
> >> cluster whose available disk space is equal to or greater than that
> >> of the cluster member with the least total space, or increase the
> >> space on all machines already in the cluster.
> >>
> >>
> >>
> >> tl;dr add a machine to your cluster.
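The routing described above can be sketched as follows. This is a simplification: real Riak does hash the bucket/key pair with SHA-1 onto a 2^160 ring, but the exact encoding of the pair here is an assumption made for illustration; only the partition-mapping idea matters.

```python
import hashlib

RING_SIZE = 64  # ring_creation_size default

def partition_for(bucket, key):
    """Map a bucket/key pair onto one of RING_SIZE partitions.

    Sketch only: the "bucket/key" separator is illustrative, not
    Riak's wire format. Each partition covers an equal, contiguous
    slice of the 2**160 hash space and is owned by one vnode.
    """
    digest = hashlib.sha1(f"{bucket}/{key}".encode()).digest()
    ring_position = int.from_bytes(digest, "big")   # 0 .. 2**160 - 1
    return ring_position // (2**160 // RING_SIZE)   # partition index

p = partition_for("users", "kevin")
print(0 <= p < RING_SIZE)  # True
```

The same pair always hashes to the same partition, which is how any node in the cluster can route a request for a key without a central directory.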
> >>
> >>
> >>
> >>
> >> -Alexander Sicular
> >>
> >>
> >>
> >> @siculars
> >>
> >>
> >>
> >> On Mar 14, 2013, at 3:41 PM, Kevin Burton <[email protected]>
> >> wrote:
> >>
> >>
> >>
> >>
> >> I am relatively new to Riak so forgive me if this has been asked
> >> before. I have a very thin understanding of a Riak cluster and
> >> understand somewhat about replication. In planning I foresee a time
> >> when the amount of data exceeds the disk space that is available to a
> >> single node. What facilities are there to essentially “split” a
> >> bucket across several servers? How is this handled?
> >>
> >> _______________________________________________
> >> riak-users mailing list
> >> [email protected]
> >> http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com
> >>
> >
>
>

