Hey Rich,

This looks like a known bug in 1.2.0 that has since been fixed. Riak
Search could get into a state where transfers would never finish, and
other failures would follow. See the following:

http://docs.basho.com/riak/latest/references/Riak-Search---Settings/
https://github.com/basho/merge_index/pull/24

Sorry for the late reply. If you're still running Riak and haven't
upgraded, this is fixed in 1.2.1.

Mark

On Mon, Sep 24, 2012 at 12:47 PM, Rich Sutton <[email protected]> wrote:
> More information ...
>
> These errors come about a minute before the errors below:
>
> 2012-09-24 12:45:15.607 [info]
> <0.6099.0>@riak_core_handoff_sender:start_fold:126 Starting
> ownership_handoff transfer of riak_search_vnode from
> '[email protected]' 1096126227998177188652763624537212264741949407232
> to '[email protected]'
> 1096126227998177188652763624537212264741949407232
> 2012-09-24 12:45:15.607 [info]
> <0.6098.0>@riak_core_handoff_sender:start_fold:126 Starting
> ownership_handoff transfer of riak_search_vnode from
> '[email protected]' 1004782375664995756265033322492444576013453623296
> to '[email protected]'
> 1004782375664995756265033322492444576013453623296
> 2012-09-24 12:45:15.936 [error] <0.1261.0>@mi_server:handle_info:549
> Unexpected info {#Port<0.7254>,{data,[2,0,0,0,0,0,0,0,1|<<128>>]}}
> 2012-09-24 12:45:16.886 [error] <0.1297.0>@mi_server:handle_info:549
> Unexpected info {#Port<0.7250>,{data,[2,0,0,0,0,0,0,0,1|<<0>>]}}
>
> Incidentally, this happens about once every minute.  Some memory appears
> to leak: the beam process's reserved memory grows a little each minute
> until the kernel's OOM killer eventually kicks in.
>
> Rich
>
> On Mon, Sep 24, 2012 at 12:09 PM, Rich Sutton <[email protected]>
> wrote:
>>
>> Hi,
>>
>> I've got a two node riak cluster set up for testing.  After joining the
>> second node to the cluster, I've got some failing transfers.  Restarts on
>> both nodes don't resolve the situation.  Any ideas?
>>
>> From error.log on transferrer node (sylvester.soiq.net):
>>
>> 2012-09-24 12:06:35.598 [error] <0.3180.0> gen_server <0.3180.0>
>> terminated with reason: bad return value: lookup_timeout
>> 2012-09-24 12:06:35.599 [error]
>> <0.3276.0>@riak_core_handoff_sender:start_fold:215 ownership_handoff
>> transfer of riak_search_vnode from '[email protected]'
>> 1004782375664995756265033322492444576013453623296 to
>> '[email protected]' 1004782375664995756265033322492444576013453623296
>> failed because of
>> error:{badmatch,{error,{worker_crash,{bad_return_value,lookup_timeout},{fold,#Fun<merge_index_backend.1.120989340>,#Fun<riak_search_vnode.1.104462514>}}}}
>> [{riak_core_handoff_sender,start_fold,5,[{file,"src/riak_core_handoff_sender.erl"},{line,161}]}]
>> 2012-09-24 12:06:35.617 [error]
>> <0.3277.0>@riak_core_handoff_sender:start_fold:215 ownership_handoff
>> transfer of riak_search_vnode from '[email protected]'
>> 1096126227998177188652763624537212264741949407232 to
>> '[email protected]' 1096126227998177188652763624537212264741949407232
>> failed because of
>> error:{badmatch,{error,{worker_crash,{bad_return_value,lookup_timeout},{fold,#Fun<merge_index_backend.1.120989340>,#Fun<riak_search_vnode.1.104462514>}}}}
>> [{riak_core_handoff_sender,start_fold,5,[{file,"src/riak_core_handoff_sender.erl"},{line,161}]}]
>> 2012-09-24 12:06:35.618 [error] <0.3180.0> CRASH REPORT Process <0.3180.0>
>> with 0 neighbours exited with reason: bad return value: lookup_timeout in
>> gen_server:terminate/6 line 747
>> 2012-09-24 12:06:35.709 [error] <0.1293.0> Supervisor poolboy_sup had
>> child riak_core_vnode_worker started with
>> {riak_core_vnode_worker,start_link,undefined} at <0.3180.0> exit with reason
>> bad return value: lookup_timeout in context child_terminated
>> 2012-09-24 12:06:35.730 [error] <0.3181.0> gen_server <0.3181.0>
>> terminated with reason: bad return value: lookup_timeout
>> 2012-09-24 12:06:35.753 [error] <0.3181.0> CRASH REPORT Process <0.3181.0>
>> with 0 neighbours exited with reason: bad return value: lookup_timeout in
>> gen_server:terminate/6 line 747
>> 2012-09-24 12:06:35.773 [error] <0.1310.0> Supervisor poolboy_sup had
>> child riak_core_vnode_worker started with
>> {riak_core_vnode_worker,start_link,undefined} at <0.3181.0> exit with reason
>> bad return value: lookup_timeout in context child_terminated
>>
>>
>> rich@daffyduck:~$ sudo riak-admin member-status
>> Attempting to restart script through sudo -H -u riak
>> ================================= Membership ==================================
>> Status     Ring    Pending    Node
>> -------------------------------------------------------------------------------
>> valid      37.5%     50.0%    '[email protected]'
>> valid      62.5%     50.0%    '[email protected]'
>> -------------------------------------------------------------------------------
>> Valid:2 / Leaving:0 / Exiting:0 / Joining:0 / Down:0
>>
>> rich@daffyduck:~$ sudo riak-admin transfers
>> Attempting to restart script through sudo -H -u riak
>> '[email protected]' waiting to handoff 8 partitions
>>
>> Active Transfers:
>>
>> transfer type: ownership_handoff
>> vnode type: riak_search_vnode
>> partition: 1004782375664995756265033322492444576013453623296
>> started: 2012-09-24 18:24:41 [-81984015.00 us ago]
>> last update: no updates seen
>> objects transferred: unknown
>>
>>                          unknown
>> [email protected] =======================> [email protected]
>>                          unknown
>>
>> transfer type: ownership_handoff
>> vnode type: riak_search_vnode
>> partition: 1096126227998177188652763624537212264741949407232
>> started: 2012-09-24 18:24:51 [-91982788.00 us ago]
>> last update: no updates seen
>> objects transferred: unknown
>>
>>                          unknown
>> [email protected] =======================> [email protected]
>>                          unknown
>>
>> Rich
>>
>
>
>
> --
> Rich Sutton
> CTO / VP of Engineering
> socialiqnetworks.com
> mobile 714-318-7737
>
>
>
> _______________________________________________
> riak-users mailing list
> [email protected]
> http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com
>

