Kevin,

On Thu, Mar 14, 2013 at 1:56 PM, Kevin Burton <[email protected]> wrote:
> So that is what I am missing. If each vnode keeps an entire copy of my data
> and I have 4 physical node then there are 16 vnodes per physical node. That
> would mean I have the data replicated 16 times per physical node. 10 GB
> turns into 160GB etc. Right? So won’t I run out of disk space?
>

Your raw data set is replicated 3 times by default. Three different
vnodes out of your total (64 by default) will each be responsible for
one replica. So, 10GB raw = 30GB replicated.
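To make that arithmetic concrete, here is a quick back-of-the-envelope sketch (these helper names are illustrative, not part of any Riak API):

```python
# Back-of-the-envelope Riak storage math; these helpers are illustrative,
# not part of any Riak API.

def replicated_size_gb(raw_gb, n_val=3):
    """Total disk consumed across the whole cluster."""
    return raw_gb * n_val

def per_node_gb(raw_gb, n_val=3, nodes=4):
    """Approximate per-node share, assuming vnodes spread evenly."""
    return raw_gb * n_val / nodes

print(replicated_size_gb(10))     # 30 (GB): 10GB raw -> 30GB replicated
print(per_node_gb(10, nodes=4))   # 7.5 (GB) on each of 4 machines
```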

Mark

>
>
> From: Alexander Sicular [mailto:[email protected]]
> Sent: Thursday, March 14, 2013 3:51 PM
>
>
> To: Kevin Burton
> Cc: [email protected]
> Subject: Re: Bigger data than disk space?
>
> Each replica is _an entire copy_ of your data. There is no striping, which
> I think you are conflating with RAID. Default replication (also configured
> in etc/app.config) is set to three. In which case, three entire copies of
> your data are kept on three different vnodes, and if you indeed have five
> physical nodes in your cluster you are guaranteed to have each of those
> three vnodes on a different physical machine.
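As a sketch of the defaults mentioned above, the relevant settings live in etc/app.config (Riak 1.x-era layout; exact keys may differ between versions):

```erlang
%% etc/app.config (fragment) -- illustrative, Riak 1.x-era layout
{riak_core, [
    %% number of partitions (vnodes) in the ring; fixed at cluster creation
    {ring_creation_size, 64},
    %% default bucket properties; n_val = replicas kept per object
    {default_bucket_props, [{n_val, 3}]}
]}.
```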
>
>
> -Alexander Sicular
>
> @siculars
>
> On Mar 14, 2013, at 4:42 PM, "Kevin Burton" <[email protected]>
> wrote:
>
> Thank you. Let me get it straight. I have a 4 node cluster (4 physical
> machines). If I have not made any changes to the ring size then I have 16
> (64/4) vnodes. Each physical node stores the actual data (the value) of
> about ¼ of the data size. So, when querying the data with a key, given the
> number of vnodes, it can be determined which physical machine the data is on.
> There must be enough redundancy built in so that if one or more of the
> physical machines go down the remaining physical machines can reconstruct
> the values lost by the lost vnodes. Correct so far? Now where does
> replication come in? The documentation indicates that there are 3 copies of
> the data (default) made. How is this changed and how can this replication of
> the data be taken advantage of?
>
> From: Alexander Sicular [mailto:[email protected]]
> Sent: Thursday, March 14, 2013 3:28 PM
> To: Kevin Burton
> Cc: [email protected]
> Subject: Re: Bigger data than disk space?
>
> Hi Kevin,
>
> The Riak distribution model is not based on "buckets" but rather the hash of
> the bucket/key combination. That hash (and associated data) is then
> allocated against a "vnode". A vnode, in turn, is one of n where n is the
> ring_creation_size (default is 64, modify in etc/app.config). Each physical
> machine in a Riak cluster claims an equal share of the ring. For example, a
> cluster with five machines (the recommended minimum for a production
> cluster) and the default ring_creation_size will have roughly 64/5 (12 or
> 13) vnodes per physical machine; all machines end up with about the same
> number of vnodes. To make room for more data, either add a machine to the
> cluster whose available disk space is equal to or greater than that of the
> cluster member with the least total space, or increase the disk space on
> all machines already in the cluster.
>
> tl;dr add a machine to your cluster.
>
> -Alexander Sicular
>
> @siculars
>
> On Mar 14, 2013, at 3:41 PM, Kevin Burton <[email protected]> wrote:
>
> I am relatively new to Riak so forgive me if this has been asked before. I
> have a very thin understanding of a Riak cluster and understand somewhat
> about replication. In planning I foresee a time when the amount of data
> exceeds the disk space that is available to a single node. What facilities
> are there to essentially “split” a bucket across several servers? How is
> this handled?
>
> _______________________________________________
> riak-users mailing list
> [email protected]
> http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com
>
>
