Hi Chris,

The setting is "handoff_concurrency", not "max_concurrency". Some
details about it can be found in the documentation [1][2], though it
looks like we could add a bit more detail in places. The default value
is 2.
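The runtime version of this knob is the transfer limit documented in
[1]. A rough sketch (the node name is just an example):

```shell
# Set the handoff concurrency ("transfer limit") to 4 cluster-wide:
riak-admin transfer-limit 4

# Or raise it on a single node only; this takes effect immediately
# but is not persisted in app.config:
riak-admin transfer-limit [email protected] 4
```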

Regarding adding new nodes: if you have the ability to test this in
your environment, I would suggest adding them all at once or in a few
groups (using staged clustering), because this will result in fewer
transfers overall. You can then adjust handoff_concurrency on
different nodes to control the load on the cluster.
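For reference, staging several joins and committing them as a single
transition looks roughly like this (node names are illustrative):

```shell
# On each new node, stage a join to an existing cluster member; no
# transfers begin at this point:
riak-admin cluster join [email protected]

# On any node, review the planned ownership changes, then apply them
# all in one transition:
riak-admin cluster plan
riak-admin cluster commit
```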

Adjusting the ring size of a live cluster is something we are working on. A
branch can be found
on GitHub [3]. Unfortunately it is unclear when this will land in Riak.

Cheers,
Jordan


[1] http://docs.basho.com/riak/latest/references/Command-Line-Tools---riak-admin/#transfer-limit
[2] http://docs.basho.com/riak/latest/references/Configuration-Files/#app-config
[3] https://github.com/basho/riak_core/commits/jrw-dynamic-ring

On Mon, Mar 11, 2013 at 4:21 AM, Chris Read <[email protected]> wrote:

> Thanks Mark...
>
>
> On Sat, Mar 9, 2013 at 1:29 AM, Mark Phillips <[email protected]> wrote:
>
>> Hi Chris,
>>
>> Thanks for the detailed write up. These are some great data points.
>>
>> We're doing some work right now to make large rings (where "large" =
>> more than 512 partitions) more efficient in terms of start and
>> convergence time, and handoff.
>>
>>
> Good to hear.
>
>
>> First things first: since your test cluster has no data in it, adding
>> "forced_ownership_handoff" to your riak_core section of your
>> app.config and up'ing it to something higher than your ring size
>> should help hasten convergence. *This is only useful for the purposes
>> of testing and should not be done in production.* That would look like
>> this:
>>
>> {forced_ownership_handoff, 512}
>>
>>
> I'm doing this to understand current production problems. If it should not
> be done in production then I'm not interested :D
>
>
>> You could also increase the "max_concurrency" setting (which also has
>> to be added to the riak_core section in your app.config). This
>> defaults to "2". You could also look at lowering the
>> "vnode_management_timer" from "10000" (10 seconds by default).
>>
>>
> Can you point me at more documentation for "max_concurrency"? I've looked
> at the code, and it's clear what "vnode_management_timer" does, but my
> Erlang-foo is not good enough to be certain about the impact of
> "max_concurrency". How does it interact with "handoff_concurrency" (if at
> all) which we currently have at 8?
>
>
>> Back to the current limitations of Riak..
>>
>> A few members of the Basho eng team - primarily Joe Blomstedt - have
>> been hacking on the ring-related code for the last week or so and are
>> making some great progress. The improvements will be in the 1.4
>> release (though it is a few months out from being official). To quote
>> Joe from an internal email: "in my current work-in-progress branch, I
>> successfully joined 4-nodes together using a 16384 ring yesterday.
>> Still took about 20 min, but working on bringing that down even
>> further today. Also, impact to cluster performance is worlds
>> different."
>>
>> So, we're well aware of the improvements that need to be made in the
>> arena and are working quickly to improve. I think Joe has plans to
>> share his working code with the list in the near future (via a GitHub
>> PR/Issue I suspect), so look out for that.
>>
>> In the interim, I would stick with a ring size of 512 or less for
>> production clusters if you're not already live, and lean on some
>> beefier hardware to mitigate the current inefficiencies with large
>> rings until the code is purified.
>>
>>
> We're already live with a ring size of 512, and we're pretty close to
> 100TB of data in there, and we're seeing handoff problems already, which is
> why I'm investigating this further. As for beefiness of hardware, we're
> currently running dual 6 core Intel X5675 machines with 96G RAM with 10G
> Ethernet links between nodes. We're not going to get beefier any time soon.
>
> We're currently growing the cluster from the 10 nodes we started with to
> 20 to cope with the load, but it takes almost 3 days for handoff to a new
> node to happen. We're doing it one by one instead of as a bulk add
> operation because with almost 200G per partition the window of "not found"
> errors between a new node taking the partition and completing transfer to
> serve the data is already uncomfortably high.
>
>
>
>> Let us know if you have any other questions. Thanks for your testing
>> and patience.
>>
>
> One more question I have is around changing the ring size on a running
> cluster. Is it something that you're working on? While we're at less than
> about 150TB of data in our cluster we can probably find a way to build a
> second cluster and transfer the data over, but we expect to reach the stage
> where we'll have over 500TB of data in riak, and at that stage we won't be
> able to build a second cluster and don't want to be stuck with almost 1TB
> of data per partition...
>
> Thanks,
>
> Chris
>
>
>>
>> Mark
>>
>>
>> On Fri, Mar 8, 2013 at 6:37 AM, Chris Read <[email protected]> wrote:
>> > Greetings all...
>> >
>> > While I can find lots of documentation about what a ring is and how it's
>> > used in Riak, I've found very little that's actually useful for
>> > determining the right size for your system. The most useful formula I've
>> > found so far has been the simple:
>> >
>> > ring size = 2 ^ (ceiling(log(max nodes * min partitions per node, 2)))
>> >
>> > Where the minimum recommended number of partitions per node is 10 (as
>> per
>> >
>> http://docs.basho.com/riak/latest/cookbooks/faqs/operations-faq/#is-it-possible-to-change-the-number-of-partitions
>> ).
>> >
>> > Nothing tells me, though, what a sane upper bound is for the amount of data
>> in a
>> > partition, or the overhead inside the cluster of managing larger ring
>> sizes.
>> > My gut feel though is that more than a couple of hundred gigabytes per
>> > partition is getting a bit much.
>> >
>> > I've done some initial testing of ring sizes across a cluster of 9
>> physical
>> > machines and have seen some concerning results. All the numbers below
>> are
>> > done on the same hardware running Ubuntu 12.04 with Riak 1.3.0 (official
>> > .deb release):
>> >
>> > Ring Size      |   512 |  1024 |    2048 |
>> > Create Cluster | 01:53 | 05:41 | 0:12:58 |
>> > Remove Node    | 04:01 | 10:31 | 0:31:13 |
>> > Add Node       | 01:05 | 05:22 | 1:04:49 |
>> >
>> > All this is done with NO DATA in the cluster at all - so why does it
>> take
>> > over an hour to add a new node when ring=2048?
>> >
>> > Does it have anything to do with the concerns raised on this thread:
>> >
>> https://groups.google.com/forum/?fromgroups=#!topic/nosql-databases/DZkgkgd9YnA
>> >
>> > Thanks,
>> >
>> > Chris
>> >
>> >
>> >
>> >
>> > _______________________________________________
>> > riak-users mailing list
>> > [email protected]
>> > http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com
>> >
>>
>
>
>
>
