Hi

Thank you for the guide. I stopped two of the nodes (the source and
the destination of the partition transfers), renamed the folders
inside the merge_index folder and started them again. However, the
ownership handoff does not seem to be retried.
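
The rename step can be sketched as below. This is an illustration, not
the exact commands that were run: the set_aside helper and the .corrupt
suffix are made up for the example, and the default path is the one
suggested in the guide, which may differ on other installs. Moving a
folder aside instead of deleting it keeps the segment files around for
later inspection.

```shell
#!/bin/sh
# Sketch only: move a partition's merge_index folder aside while the node is stopped.
# MI_DIR defaults to the path from the guide; override it if your data lives elsewhere.
MI_DIR="${MI_DIR:-/var/lib/riak/merge_index}"

# set_aside <partition-index>  (hypothetical helper, not part of Riak)
set_aside() {
    src="$MI_DIR/$1"
    dst="$src.corrupt"    # arbitrary suffix chosen for this example
    if [ -d "$src" ]; then
        mv "$src" "$dst" && echo "moved $src -> $dst"
    else
        echo "skip $1: $src not found"
    fi
}

# Example, run on each affected node after stopping it:
#   set_aside 239777612374601260017792042867515182912301432832
#   set_aside 696496874040508421956443553091353626554780352512
```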

Looking at the logs, it seems the last attempt was 48 hours ago.
Is there any logic inside Riak which causes it to give up after a
certain number of attempts?
Is there a way I can retrigger the handoffs?
I have tried setting the transfer-limit on the cluster to 0 and then
back to 2, but it doesn't seem to do anything.
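
The transfer-limit toggle described above can be sketched as follows.
The toggle_transfer_limit helper and the RIAK_ADMIN variable are
assumptions made for this example (the variable only exists so the
command can be swapped out for a dry run); the riak-admin subcommands
themselves are the standard ones. Whether toggling the limit actually
causes a failed handoff to be retried is exactly the open question here.

```shell
#!/bin/sh
# Sketch only: toggle the transfer limit, then look at the handoff state.
# RIAK_ADMIN is an assumption for this example; it defaults to the real CLI.
RIAK_ADMIN="${RIAK_ADMIN:-riak-admin}"

# toggle_transfer_limit [limit]  (hypothetical helper, not part of Riak)
toggle_transfer_limit() {
    limit="${1:-2}"
    $RIAK_ADMIN transfer-limit 0          # pause transfers cluster-wide
    $RIAK_ADMIN transfer-limit "$limit"   # allow them again
    $RIAK_ADMIN transfers                 # list active and pending transfers
    $RIAK_ADMIN ring-status               # show pending ownership handoffs
}

# Example:
#   toggle_transfer_limit 2
```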

I wonder if we need the merge_index folder at all, as we have disabled
Riak search since the initial configuration of the cluster. We found a
better way to query our data so that we don't need Riak search
anymore. We disabled it by resetting the properties on the buckets
where search was enabled, and then disabled search in app.config
followed by a restart of each of the nodes. This was done after the
ownership handoff issue first occurred.

-- 
Jeppe Fihl Toustrup
Operations Engineer
Falcon Social


On 19 November 2013 23:17, Mark Phillips <[email protected]> wrote:
> Hi Jeppe,
>
> As you suspected, this looks like index corruption in Search that's
> preventing handoff from finishing. Specifically, you'll need to delete the
> segment files for the two partitions' indexes and rebuild those indexes
> post-transfer.
>
> Here's the full process:
>
> - Stop each node that owns the partitions in question.
> - Delete the data directory for each partition (which contains the segment
> files). It should be something like:
>
> "rm -rf /var/lib/riak/merge_index/<p>"
>
> - Restart each node
> - Wait for the transfers to complete
> - Rebuild the indexes in question [1]
>
> Let us know if you run into any further issues.
>
> Mark
>
> [1]
> http://docs.basho.com/riak/latest/ops/running/recovery/repairing-indexes/
>
> On Tue, Nov 19, 2013 at 4:26 AM, Jeppe Toustrup <[email protected]>
> wrote:
>>
>> Hi
>>
>> I have recently added two extra nodes to what is now a seven-node Riak
>> cluster. The rebalancing following the expansion worked fine, except
>> for two partitions which do not seem able to go through. Running
>> "riak-admin ring-status" shows the following:
>>
>> ============================== Ownership Handoff ==============================
>> Owner:      [email protected]
>> Next Owner: [email protected]
>>
>> Index: 239777612374601260017792042867515182912301432832
>>   Waiting on: []
>>   Complete:   [riak_kv_vnode,riak_pipe_vnode]
>>
>> Index: 696496874040508421956443553091353626554780352512
>>   Waiting on: []
>>   Complete:   [riak_kv_vnode,riak_pipe_vnode]
>>
>>
>> -------------------------------------------------------------------------------
>>
>> I can see from the log file on the source node (10.0.0.96) that it has
>> made numerous attempts to transfer the partitions, but they fail
>> every time. Here's an excerpt of the log file showing the
>> lines from when a transfer attempt fails:
>>
>> 2013-11-18 12:29:03.694 [error] emulator Error in process <0.5745.8>
>> on node '[email protected]' with exit value:
>> {badarg,[{erlang,binary_to_term,[<<29942
>>
>> bytes>>],[]},{mi_segment,iterate_all_bytes,2,[{file,"src/mi_segment.erl"},{line,167}]},{mi_server,'-group_iterator/2-fun-1-',2,[{file,"src/mi_server.erl"},{line,725}]},{mi_server,'-group_iterator/2-fun-0-'...
>> 2013-11-18 12:29:03.885 [error] <0.3269.0>@mi_server:handle_info:524
>> lookup/range failure:
>>
>> {badarg,[{erlang,binary_to_term,[<<131,109,0,0,244,240,108,109,102,97,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,1
 
11,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,11
 
1,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111
 
,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,
 111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,...>>],...},...]}
>> 2013-11-18 12:29:03.889 [error]
>> <0.30353.0>@merge_index_backend:async_fold_fun:116 failed to iterate
>> the index with reason
>>
>> {badarg,[{erlang,binary_to_term,[<<131,109,0,0,244,240,108,109,102,97,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,1
 
11,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,...>>]
 ,...},...]}
>> and partial acc
>>
>> {{ho_acc,1,{error,closed},#Fun<riak_core_handoff_sender.15.9980475>,riak_search_vnode,<0.3268.0>,#Port<0.133923>,{696496874040508421956443553091353626554780352512,696496874040508421956443553091353626554780352512},{ho_stats,{1384,775537,708230},undefined,249,1050473},gen_tcp,246,1053542,[<<1,131,104,4,109,0,0,0,8,109,111,114,101,111,118,101,114,109,0,0,0,13,99,97,110,111,110,105,99,97,108,68,97,116,101,109,0,0,0,14,50,48,49,51,49,48,48,55,49,50,49,49,53,52,108,0,0,0,57,104,3,109,0,0,0,47,50,48,49,51,49,48,48,55,49,50,49,49,53,52,35,102,98,56,55,98,102,50,51,55,51,98,51,98,56,56,100,52,48,51,52,51,50,100,97,97,55,51,99,55,101,98,99,108,0,0,0,1,104,2,100,0,1,112,107,0,1,0,106,110,7,0,154,61,194,52,243,234,4,104,3,109,0,0,0,47,50,48,49,51,49,48,48,55,49,50,49,49,53,52,35,102,51,101,54,55,54,57,56,99,55,52,49,102,57,56,50,101,50,102,48,55,54,101,57,57,101,97,56,100,100,102,57,108,0,0,0,1,104,2,100,0,1,112,107,0,1,0,106,110,7,0,137,181,162,60,243,234,4,104,3,109,0,0,0,47,50,48,49,51,49,
 
48,48,55,49,50,49,49,53,52,35,101,98,57,51,102,52,55,48,53,52,55,52,50,56,98,49,49,50,101,57,98,50,98,49,52,51,52,57,52,52,98,50,108,0,0,0,1,104,2,100,0,1,112,107,0,1,0,106,110,7,0,31,143,101,69,243,234,4,104,3,109,0,0,0,47,50,48,49,51,49,48,48,55,49,50,49,49,53,52,35,101,55,49,52,57,49,53,52,49,102,48,55,102,102,97,99,51,48,98,51,52,51,56,52,49,98,56,50,54,55,56,48,108,0,0,0,1,104,2,100,0,1,112,107,0,1,0,106,110,7,0,214,30,30,54,243,234,4,104,3,109,0,0,0,47,50,48,49,51,49,48,48,55,49,50,49,49,53,52,35,101,53,56,53,97,52,48,49,98,55,97,52,56,57,56,48,53,55,54,53,101,99,98,98,98,98,101,52,97,101,48,100,108,0,0,0,1,104,2,100,0,1,112,107,0,1,0,106,110,7,0,41,65,120,56,243,234,4,104,3,109,0,0,0,47,50,48,49,51,49,48,48,55,49,50,49,49,53,52,35,100,101,56,99,99,99,51,56,53,56,52,48,53,97,51,49,57,49,48,97,51,55,57,57,98,49,54,52,53,100,55,50,108,0,0,0,1,104,2,100,0,1,112,107,0,1,0,106,110,7,0,222,186,251,66,243,234,4,104,3,109,0,0,0,47,50,48,49,51,49,48,48,55,49,50,49,49,...>>,...],...},..
 .}
>> 2013-11-18 12:29:03.889 [error]
>> <0.15384.7>@riak_core_handoff_sender:start_fold:269 ownership_transfer
>> transfer of riak_search_vnode from '[email protected]'
>> 696496874040508421956443553091353626554780352512 to '[email protected]'
>> 696496874040508421956443553091353626554780352512 failed because of
>> error:{badrecord,ho_acc}
>>
>> [{riak_core_handoff_sender,start_fold,5,[{file,"src/riak_core_handoff_sender.erl"},{line,193}]}]
>>
>>
>> I've got an idea that it may be caused by corrupt data, but the
>> question is then how I can get this corrected, enabling the transfer
>> to complete. Any suggestions?
>>
>> We are using Riak 1.4.2 with LevelDB as the backend on Ubuntu 12.04.
>> Riak was installed through the apt.basho.com repository. Let me know
>> if you need any more information.
>>
>> --
>> Jeppe Fihl Toustrup
>> Operations Engineer
>> Falcon Social
>>
>> _______________________________________________
>> riak-users mailing list
>> [email protected]
>> http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com
>
>

