Hi Chris,

The setting is "handoff_concurrency", not "max_concurrency". Some details
about the setting can be found in the documentation [1][2]; however, it
looks like we could add a bit more detail in places. The default value is 2.
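For reference, here is roughly what that looks like in the riak_core
section of app.config (a sketch only: the surrounding sections are
elided, and the value of 4 is just an illustration, not a
recommendation):

```erlang
%% app.config (excerpt) -- sketch; other riak_core settings elided
[
 {riak_core, [
     %% Maximum number of concurrent handoff transfers this node
     %% will take part in at once (default: 2).
     {handoff_concurrency, 4}
 ]}
].
```

Changes to app.config generally take effect on node restart; the
riak-admin documentation linked in [1] covers adjusting the transfer
limit on a running cluster.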
Regarding adding new nodes: if you have the ability to test this in your
environment, I would suggest trying to add them all at once or in a few
groups (using staged clustering), because this will result in fewer
overall transfers. In the meantime, you can adjust handoff_concurrency on
different nodes to control the load on the cluster. Adjusting the ring
size of a live cluster is something we are working on; a branch can be
found on GitHub [3]. Unfortunately, it is unclear when this will land in
Riak.

Cheers,

Jordan

[1] http://docs.basho.com/riak/latest/references/Command-Line-Tools---riak-admin/#transfer-limit
[2] http://docs.basho.com/riak/latest/references/Configuration-Files/#app-config
[3] https://github.com/basho/riak_core/commits/jrw-dynamic-ring

On Mon, Mar 11, 2013 at 4:21 AM, Chris Read <[email protected]> wrote:
> Thanks Mark...
>
> On Sat, Mar 9, 2013 at 1:29 AM, Mark Phillips <[email protected]> wrote:
>> Hi Chris,
>>
>> Thanks for the detailed write-up. These are some great data points.
>>
>> We're doing some work right now to make large rings (where "large" =
>> more than 512 partitions) more efficient in terms of start and
>> convergence time, and handoff.
>>
> Good to hear.
>
>> First things first: since your test cluster has no data in it, adding
>> "forced_ownership_handoff" to the riak_core section of your
>> app.config and upping it to something higher than your ring size
>> should help hasten convergence. *This is only useful for the purposes
>> of testing and should not be done in production.* That would look
>> like this:
>>
>> {forced_ownership_handoff, 512}
>>
> I'm doing this to understand current production problems. If it should
> not be done in production then I'm not interested :D
>
>> You could also increase the "max_concurrency" setting (which also has
>> to be added to the riak_core section in your app.config). This
>> defaults to "2".
>> You could also look at lowering the "vnode_management_timer" from its
>> default of "10000" (10 seconds).
>>
> Can you point me at more documentation for "max_concurrency"? I've
> looked at the code, and it's clear what "vnode_management_timer" does,
> but my Erlang-foo is not good enough to be certain of the impacts of
> "max_concurrency". How does it interact with "handoff_concurrency" (if
> at all), which we currently have at 8?
>
>> Back to the current limitations of Riak...
>>
>> A few members of the Basho eng team - primarily Joe Blomstedt - have
>> been hacking on the ring-related code for the last week or so and are
>> making some great progress. The improvements will be in the 1.4
>> release (though it is a few months out from being official). To quote
>> Joe from an internal email: "in my current work-in-progress branch, I
>> successfully joined 4 nodes together using a 16384 ring yesterday.
>> Still took about 20 min, but working on bringing that down even
>> further today. Also, impact to cluster performance is worlds
>> different."
>>
>> So, we're well aware of the improvements that need to be made in this
>> arena and are working quickly to make them. I think Joe has plans to
>> share his working code with the list in the near future (via a GitHub
>> PR/issue, I suspect), so look out for that.
>>
>> In the interim, I would stick with a ring size of 512 or less for
>> production clusters if you're not already live, and lean on some
>> beefier hardware to mitigate the current inefficiencies with large
>> rings until the code is purified.
>>
> We're already live with a ring size of 512, we're pretty close to
> 100TB of data in there, and we're seeing handoff problems already,
> which is why I'm investigating this further. As for beefiness of
> hardware, we're currently running dual 6-core Intel X5675 machines
> with 96G RAM and 10G Ethernet links between nodes. We're not going to
> get beefier any time soon.
>
> We're currently growing the cluster from the 10 nodes we started with
> to 20 to cope with the load, but it takes almost 3 days for handoff to
> a new node to complete. We're doing it one by one instead of as a bulk
> add operation because, with almost 200G per partition, the window of
> "not found" errors between a new node taking a partition and
> completing the transfer needed to serve the data is already
> uncomfortably high.
>
>> Let us know if you have any other questions. Thanks for your testing
>> and patience.
>
> One more question I have is around changing the ring size on a running
> cluster. Is that something you're working on? While we're at less than
> about 150TB of data in our cluster we can probably find a way to build
> a second cluster and transfer the data over, but we expect to reach
> the stage where we'll have over 500TB of data in Riak, and at that
> point we won't be able to build a second cluster and don't want to be
> stuck with almost 1TB of data per partition...
>
> Thanks,
>
> Chris
>
>> Mark
>>
>> On Fri, Mar 8, 2013 at 6:37 AM, Chris Read <[email protected]> wrote:
>> > Greetings all...
>> >
>> > While I can find lots of documentation about what a ring is and how
>> > it's used in Riak, I've found very little that's actually useful
>> > for determining the right size for your system. The most useful
>> > formula I've found so far has been the simple:
>> >
>> > ring size = 2 ^ (ceiling(log(max nodes * min partitions per node, 2)))
>> >
>> > Where the minimum recommended number of partitions per node is 10 (as per
>> > http://docs.basho.com/riak/latest/cookbooks/faqs/operations-faq/#is-it-possible-to-change-the-number-of-partitions).
>> >
>> > Nothing tells me, though, what a sane upper bound is for the amount
>> > of data in a partition, or the overhead inside the cluster of
>> > managing larger ring sizes.
>> > My gut feel, though, is that more than a couple of hundred
>> > gigabytes per partition is getting a bit much.
>> >
>> > I've done some initial testing of ring sizes across a cluster of 9
>> > physical machines and have seen some concerning results. All the
>> > numbers below were gathered on the same hardware running Ubuntu
>> > 12.04 with Riak 1.3.0 (official .deb release):
>> >
>> > Ring Size      |   512 |  1024 |    2048 |
>> > Create Cluster | 01:53 | 05:41 | 0:12:58 |
>> > Remove Node    | 04:01 | 10:31 | 0:31:13 |
>> > Add Node       | 01:05 | 05:22 | 1:04:49 |
>> >
>> > All this is done with NO DATA in the cluster at all - so why does
>> > it take over an hour to add a new node when ring=2048?
>> >
>> > Does it have anything to do with the concerns raised in this thread:
>> > https://groups.google.com/forum/?fromgroups=#!topic/nosql-databases/DZkgkgd9YnA
>> >
>> > Thanks,
>> >
>> > Chris
_______________________________________________
riak-users mailing list
[email protected]
http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com
