Looks like Dmitry's + your suggestion did the trick. I upgraded the rest of the 
nodes in-place using the method you suggested and the hanging "leave" finally 
handed off its data.
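
For anyone checking the same thing later, here is a minimal sketch of confirming from `riak-admin member_status` output that no node is still stuck in "leaving". The function names and the parsing approach are assumptions based only on the output format quoted further down in this thread; this is not an official Riak tool.

```python
import re

# One table row of `riak-admin member_status`, e.g.
#   leaving    14.1%     14.1%    'riak@host'
# The Pending column may be "--" when nothing is pending.
ROW = re.compile(
    r"^\s*(?P<status>[a-z]+)\s+"
    r"(?P<ring>\d+(?:\.\d+)?)%\s+"
    r"(?P<pending>\d+(?:\.\d+)?%|--)\s+"
    r"(?P<node>.+?)\s*$"
)


def parse_member_status(text):
    """Return one dict per node row; headers and rulers are skipped."""
    rows = []
    for line in text.splitlines():
        m = ROW.match(line)
        if not m:
            continue
        pending = m.group("pending")
        rows.append({
            "status": m.group("status"),
            "ring": float(m.group("ring")),
            "pending": None if pending == "--" else float(pending.rstrip("%")),
            "node": m.group("node").strip("'"),
        })
    return rows


def still_leaving(rows):
    """Names of nodes whose status is still 'leaving'."""
    return [r["node"] for r in rows if r["status"] == "leaving"]
```

Feed it the captured command output; an empty list from `still_leaving` suggests the leave has finished.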

Thank you!

Dave


On Sep 24, 2013, at 5:14 PM, Brian Sparrow <[email protected]> wrote:

> Ahh, gotcha. 
> 
> Let us know how things go and if we can offer any more assistance.
> 
> Thanks!
> 
> -- 
> Brian Sparrow
> Developer Advocate
> Basho Technologies
> 
> Sent with Sparrow
> 
> On Tuesday, September 24, 2013 at 5:10 PM, David Greenstein wrote:
> 
>> 
>> I'm actually joining new nodes that have the latest version to the cluster. 
>> Once I join the nodes I have the old nodes leave. This has worked great in 
>> the past. I'll use the recommended method next time :)
>> 
>> Dave
>> 
>> On Sep 24, 2013, at 5:05 PM, Brian Sparrow <[email protected]> wrote:
>> 
>>> David,
>>> 
>>> The standard way to kick transfers is setting transfer_limit to 0 and then 
>>> back up to 2 (the default) or higher (up to 8 without turning other knobs). 
>>> This can be done with `riak-admin transfer_limit 0` followed by 
>>> `riak-admin transfer_limit 4`.
>>> 
>>> With that said, may I ask why you are upgrading nodes by having them leave 
>>> and then re-join the cluster? Unless you are changing backend properties, 
>>> this should not be necessary; simply taking nodes down, upgrading them, 
>>> and restarting them is the standard way to do a rolling upgrade [1].
>>> 
>>> Let us know how things go after kicking the transfers.
>>> 
>>> Thanks!
>>> 
>>> [1] http://docs.basho.com/riak/latest/ops/running/rolling-upgrades/
>>> 
>>> -- 
>>> Brian Sparrow
>>> Developer Advocate
>>> Basho Technologies
>>> 
>>> Sent with Sparrow
>>> 
>>> On Tuesday, September 24, 2013 at 5:00 PM, David Greenstein wrote:
>>> 
>>>> I'm not receiving any errors actually. I'm far from an expert in riak 
>>>> logs, but the console.log messages seem to indicate the handoff is 
>>>> working… just extremely slowly. Here's a snippet from the console.log of 
>>>> the node that is attempting to leave…
>>>> 
>>>> 
>>>> 2013-09-24 20:59:43.991 [info] <0.7140.0>@riak_core_handoff_receiver:process_message:99 Receiving handoff data for partition riak_kv_vnode:685078892498860742907977265335757665463718379520
>>>> 2013-09-24 20:59:43.994 [info] <0.7140.0>@riak_core_handoff_receiver:handle_info:69 Handoff receiver for partition 685078892498860742907977265335757665463718379520 exited after processing 0 objects
>>>> 2013-09-24 20:59:44.035 [info] <0.7142.0>@riak_core_handoff_receiver:process_message:99 Receiving handoff data for partition riak_kv_vnode:45671926166590716193865151022383844364247891968
>>>> 2013-09-24 20:59:44.038 [info] <0.7142.0>@riak_core_handoff_receiver:handle_info:69 Handoff receiver for partition 45671926166590716193865151022383844364247891968 exited after processing 0 objects
>>>> 2013-09-24 20:59:44.068 [info] <0.7144.0>@riak_core_handoff_receiver:process_message:99 Receiving handoff data for partition riak_kv_vnode:1415829711164312202009819681693899175291684651008
>>>> 2013-09-24 20:59:44.073 [info] <0.7144.0>@riak_core_handoff_receiver:handle_info:69 Handoff receiver for partition 1415829711164312202009819681693899175291684651008 exited after processing 0 objects
>>>> 
>>>> The percentages in member_status still have not changed though.
>>>> 
>>>> Thank you again for any help!!!
>>>> 
>>>> Dave
>>>> 
>>>> On Sep 24, 2013, at 3:42 PM, Dmitry Demeshchuk <[email protected]> 
>>>> wrote:
>>>> 
>>>>> Seems like a potential problem with handoff. We had similar problems 
>>>>> upgrading from 1.2.1 to 1.4.0. Check the logs for handoff errors 
>>>>> (something like <<"unknown_msg">> or similar).
>>>>> 
>>>>> If that's the case, leave that node be, and do an in-place upgrade for the 
>>>>> rest of the nodes, without making them leave the cluster. The third node 
>>>>> will probably leave after that, so you'll be able to re-join it.
>>>>> 
>>>>> 
>>>>> On Tue, Sep 24, 2013 at 12:33 PM, David Greenstein <[email protected]> 
>>>>> wrote:
>>>>>> 
>>>>>> I'm performing a rolling upgrade to 1.4.2 (from 1.3.1). The first two 
>>>>>> nodes that I replaced left the cluster without an issue and the new 
>>>>>> nodes joined without an issue. Now, the next node seems to be in a state 
>>>>>> where it won't leave the cluster. The status is leaving but it has been 
>>>>>> pending for several hours. Perhaps it is due to the pending ownership 
>>>>>> handoff from ring_status that also doesn't seem to be completing. Any 
>>>>>> insight or help on how to "kickstart" the leave would be greatly 
>>>>>> appreciated!
>>>>>> 
>>>>>> Dave
>>>>>> 
>>>>>> [user@ip-10-0-1-12 user]# /db/riak/bin/riak-admin ring_status
>>>>>> ================================== Claimant ===================================
>>>>>> Claimant:  '[email protected]'
>>>>>> Status:     up
>>>>>> Ring Ready: true
>>>>>> 
>>>>>> ============================== Ownership Handoff ==============================
>>>>>> Owner:      [email protected]
>>>>>> Next Owner: [email protected]
>>>>>> 
>>>>>> Index: 456719261665907161938651510223838443642478919680
>>>>>>  Waiting on: [riak_kv_vnode]
>>>>>>  Complete:   [riak_pipe_vnode]
>>>>>> 
>>>>>> -------------------------------------------------------------------------------
>>>>>> 
>>>>>> ============================== Unreachable Nodes ==============================
>>>>>> All nodes are up and reachable
>>>>>> 
>>>>>> 
>>>>>> 
>>>>>> 
>>>>>> 
>>>>>> [user@ip-10-0-1-12 user]# /db/riak/bin/riak-admin member_status
>>>>>> ================================= Membership ==================================
>>>>>> Status     Ring    Pending    Node
>>>>>> -------------------------------------------------------------------------------
>>>>>> leaving    14.1%     14.1%    '[email protected]'
>>>>>> valid      14.1%     14.1%    '[email protected]'
>>>>>> valid      14.1%     14.1%    '[email protected]'
>>>>>> valid      17.2%     15.6%    '[email protected]'
>>>>>> valid      12.5%     14.1%    '[email protected]'
>>>>>> valid      14.1%     14.1%    '[email protected]'
>>>>>> valid      14.1%     14.1%    '[email protected]'
>>>>>> -------------------------------------------------------------------------------
>>>>>> Valid:6 / Leaving:1 / Exiting:0 / Joining:0 / Down:0
>>>>>> 
>>>>>> 
>>>>>> 
>>>>>> 
>>>>>> 
>>>>>> _______________________________________________
>>>>>> riak-users mailing list
>>>>>> [email protected]
>>>>>> http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com
>>>>>> 
>>>>> 
>>>>> 
>>>>> 
>>>>> -- 
>>>>> Best regards,
>>>>> Dmitry Demeshchuk
>>>> 
>>> 
>> 
> 
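
The transfer "kick" Brian describes above can be sketched as a small helper. This is only an illustration: `kick_transfers` and its parameters are hypothetical names, and it simply shells out to the two `riak-admin transfer_limit` commands quoted in his message.

```python
import subprocess


def kick_transfers(limit=4, riak_admin="riak-admin", run=subprocess.check_call):
    """Drop the transfer limit to 0, then raise it again to restart handoffs.

    `run` is injectable so the helper can be dry-run without a Riak node.
    """
    # The thread mentions 2 as the default and 8 as the ceiling
    # "without turning other knobs"; anything else is rejected here.
    if not 1 <= limit <= 8:
        raise ValueError("limit should be between 1 and 8")
    run([riak_admin, "transfer_limit", "0"])
    run([riak_admin, "transfer_limit", str(limit)])
```

On the cluster in this thread it would be called as `kick_transfers(4, riak_admin="/db/riak/bin/riak-admin")`.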

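Dmitry's suggestion to check the logs for handoff errors can be roughly automated as well. The sketch below assumes console.log lines shaped like the snippet quoted above, with one log entry per line; `summarize_handoff` is a hypothetical helper, and the error check is a plain substring match rather than anything Riak provides.

```python
import re

# Matches the "exited after processing N objects" lines from console.log.
EXITED = re.compile(
    r"Handoff receiver for partition (\d+) exited after processing (\d+) objects"
)


def summarize_handoff(lines):
    """Return (objects received per partition, suspicious handoff lines)."""
    received = {}
    suspicious = []
    for line in lines:
        m = EXITED.search(line)
        if m:
            partition, count = m.group(1), int(m.group(2))
            received[partition] = received.get(partition, 0) + count
        elif "handoff" in line.lower() and (
            "[error]" in line or "unknown_msg" in line
        ):
            suspicious.append(line.rstrip("\n"))
    return received, suspicious
```

Partitions that only ever report 0 objects, as in the snippet above, would show up in the first result with a count of 0.
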