Looks like Dmitry's + your suggestion did the trick. I upgraded the rest of the nodes in-place using the method you suggested and the hanging "leave" finally handed off its data.
Thank you!

Dave

On Sep 24, 2013, at 5:14 PM, Brian Sparrow <[email protected]> wrote:

> Ahh, gotcha.
>
> Let us know how things go and if we can offer any more assistance.
>
> Thanks!
>
> --
> Brian Sparrow
> Developer Advocate
> Basho Technologies
>
> Sent with Sparrow
>
> On Tuesday, September 24, 2013 at 5:10 PM, David Greenstein wrote:
>
>> I'm actually joining new nodes that have the latest version to the cluster.
>> Once I join the nodes I have the old nodes leave. This has worked great in
>> the past. I'll use the recommended method next time :)
>>
>> Dave
>>
>> On Sep 24, 2013, at 5:05 PM, Brian Sparrow <[email protected]> wrote:
>>
>>> David,
>>>
>>> The standard way to kick transfers is setting transfer_limit to 0 and then
>>> back up to 2 (default) or higher (up to 8 without turning other knobs).
>>> This can be done with `riak-admin transfer_limit 0` then
>>> `riak-admin transfer_limit 4`.
>>>
>>> With that said, may I ask why you are upgrading nodes by leaving them and
>>> then re-joining them back to the cluster? Unless you are changing backend
>>> properties this should not be necessary and simply taking nodes down,
>>> upgrading them, and restarting them is the standard way to do a rolling
>>> upgrade[1].
>>>
>>> Let us know how things go after kicking the transfers.
>>>
>>> Thanks!
>>>
>>> [1] http://docs.basho.com/riak/latest/ops/running/rolling-upgrades/
>>>
>>> --
>>> Brian Sparrow
>>> Developer Advocate
>>> Basho Technologies
>>>
>>> Sent with Sparrow
>>>
>>> On Tuesday, September 24, 2013 at 5:00 PM, David Greenstein wrote:
>>>
>>>> I'm not receiving any errors actually. I'm far from an expert in riak
>>>> logs, but the console.log messages seem to indicate the handoff is
>>>> working… just extremely slowly. Here's a snippet from the console.log of
>>>> the node that is attempting to leave…
>>>>
>>>> 2013-09-24 20:59:43.991 [info]
>>>> <0.7140.0>@riak_core_handoff_receiver:process_message:99 Receiving handoff
>>>> data for partition
>>>> riak_kv_vnode:685078892498860742907977265335757665463718379520
>>>> 2013-09-24 20:59:43.994 [info]
>>>> <0.7140.0>@riak_core_handoff_receiver:handle_info:69 Handoff receiver for
>>>> partition 685078892498860742907977265335757665463718379520 exited after
>>>> processing 0 objects
>>>> 2013-09-24 20:59:44.035 [info]
>>>> <0.7142.0>@riak_core_handoff_receiver:process_message:99 Receiving handoff
>>>> data for partition
>>>> riak_kv_vnode:45671926166590716193865151022383844364247891968
>>>> 2013-09-24 20:59:44.038 [info]
>>>> <0.7142.0>@riak_core_handoff_receiver:handle_info:69 Handoff receiver for
>>>> partition 45671926166590716193865151022383844364247891968 exited after
>>>> processing 0 objects
>>>> 2013-09-24 20:59:44.068 [info]
>>>> <0.7144.0>@riak_core_handoff_receiver:process_message:99 Receiving handoff
>>>> data for partition
>>>> riak_kv_vnode:1415829711164312202009819681693899175291684651008
>>>> 2013-09-24 20:59:44.073 [info]
>>>> <0.7144.0>@riak_core_handoff_receiver:handle_info:69 Handoff receiver for
>>>> partition 1415829711164312202009819681693899175291684651008 exited after
>>>> processing 0 objects
>>>>
>>>> The percentages in member_status still have not changed though.
>>>>
>>>> Thank you again for any help!!!
>>>>
>>>> Dave
>>>>
>>>> On Sep 24, 2013, at 3:42 PM, Dmitry Demeshchuk <[email protected]>
>>>> wrote:
>>>>
>>>>> Seems like a potential problem with handoff. We had similar problems
>>>>> upgrading from 1.2.1 to 1.4.0. Check the logs for handoff errors
>>>>> (something like <<"unknown_msg">> or similar).
>>>>>
>>>>> If that's the case, leave that node be, and do in-place upgrade for the
>>>>> rest of the nodes, without making them leave the cluster. The third node
>>>>> will probably leave after that, so you'll be able to re-join it.
>>>>>
>>>>> On Tue, Sep 24, 2013 at 12:33 PM, David Greenstein <[email protected]>
>>>>> wrote:
>>>>>>
>>>>>> I'm performing a rolling upgrade to 1.4.2 (from 1.3.1). The first two
>>>>>> nodes that I replaced left the cluster without an issue and the new
>>>>>> nodes joined without an issue. Now, the next node seems to be in a state
>>>>>> where it won't leave the cluster. The status is leaving but it has been
>>>>>> pending for several hours. Perhaps it is due to the pending ownership
>>>>>> handoff from ring_status that also doesn't seem to be completing. Any
>>>>>> insight or help on how to "kickstart" the leave would be greatly
>>>>>> appreciated!
>>>>>>
>>>>>> Dave
>>>>>>
>>>>>> [user@ip-10-0-1-12 user]# /db/riak/bin/riak-admin ring_status
>>>>>> ================================== Claimant ===================================
>>>>>> Claimant: '[email protected]'
>>>>>> Status: up
>>>>>> Ring Ready: true
>>>>>>
>>>>>> ============================== Ownership Handoff ==============================
>>>>>> Owner: [email protected]
>>>>>> Next Owner: [email protected]
>>>>>>
>>>>>> Index: 456719261665907161938651510223838443642478919680
>>>>>> Waiting on: [riak_kv_vnode]
>>>>>> Complete: [riak_pipe_vnode]
>>>>>>
>>>>>> -------------------------------------------------------------------------------
>>>>>>
>>>>>> ============================== Unreachable Nodes ==============================
>>>>>> All nodes are up and reachable
>>>>>>
>>>>>> [user@ip-10-0-1-12 user]# /db/riak/bin/riak-admin member_status
>>>>>> ================================= Membership ==================================
>>>>>> Status     Ring     Pending    Node
>>>>>> -------------------------------------------------------------------------------
>>>>>> leaving    14.1%    14.1%      '[email protected]'
>>>>>> valid      14.1%    14.1%      '[email protected]'
>>>>>> valid      14.1%    14.1%      '[email protected]'
>>>>>> valid      17.2%    15.6%      '[email protected]'
>>>>>> valid      12.5%    14.1%      '[email protected]'
>>>>>> valid      14.1%    14.1%      '[email protected]'
>>>>>> valid      14.1%    14.1%      '[email protected]'
>>>>>> -------------------------------------------------------------------------------
>>>>>> Valid:6 / Leaving:1 / Exiting:0 / Joining:0 / Down:0
>>>>>>
>>>>>> _______________________________________________
>>>>>> riak-users mailing list
>>>>>> [email protected]
>>>>>> http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com
>>>>>>
>>>>>
>>>>> --
>>>>> Best regards,
>>>>> Dmitry Demeshchuk
>>>>
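
For anyone finding this thread later, the transfer "kick" Brian describes could be wrapped roughly like this. It is only a sketch: `kick_transfers`, `RIAK_ADMIN`, and `PAUSE` are made-up names for illustration, and the pause between the two calls is a guess rather than anything recommended in the thread. The only Riak commands it runs are the two `riak-admin transfer_limit` calls quoted above.

```shell
# Hypothetical helper: set transfer_limit to 0, wait briefly, then raise it again.
# RIAK_ADMIN (path to riak-admin) and PAUSE (seconds) are illustrative shell
# variables, not Riak settings.
kick_transfers() {
    limit="${1:-2}"   # 2 is the default limit mentioned in the thread

    # Stop all handoff transfers; give up if the command fails.
    "${RIAK_ADMIN:-riak-admin}" transfer_limit 0 || return 1

    # Short pause before re-enabling transfers (length is an assumption).
    sleep "${PAUSE:-5}"

    # Allow transfers again at the requested limit.
    "${RIAK_ADMIN:-riak-admin}" transfer_limit "$limit"
}
```

On the cluster in this thread that would be something like `RIAK_ADMIN=/db/riak/bin/riak-admin kick_transfers 4`, followed by re-running `member_status` and `ring_status` to see whether the pending handoff moves.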
