Hey Rich,

This looks like a known and since-resolved bug that was in 1.2.0. Riak Search could get into a state where transfers would never finish and bad things would happen. See the following:
http://docs.basho.com/riak/latest/references/Riak-Search---Settings/
https://github.com/basho/merge_index/pull/24

Sorry for the lateness. In the event that you're still running Riak and haven't upgraded, this is fixed in 1.2.1.

Mark

On Mon, Sep 24, 2012 at 12:47 PM, Rich Sutton <[email protected]> wrote:
> More information ...
>
> These errors come about a minute before the errors below:
>
> 2012-09-24 12:45:15.607 [info]
> <0.6099.0>@riak_core_handoff_sender:start_fold:126 Starting
> ownership_handoff transfer of riak_search_vnode from
> '[email protected]' 1096126227998177188652763624537212264741949407232
> to '[email protected]'
> 1096126227998177188652763624537212264741949407232
> 2012-09-24 12:45:15.607 [info]
> <0.6098.0>@riak_core_handoff_sender:start_fold:126 Starting
> ownership_handoff transfer of riak_search_vnode from
> '[email protected]' 1004782375664995756265033322492444576013453623296
> to '[email protected]'
> 1004782375664995756265033322492444576013453623296
> 2012-09-24 12:45:15.936 [error] <0.1261.0>@mi_server:handle_info:549
> Unexpected info {#Port<0.7254>,{data,[2,0,0,0,0,0,0,0,1|<<128>>]}}
> 2012-09-24 12:45:16.886 [error] <0.1297.0>@mi_server:handle_info:549
> Unexpected info {#Port<0.7250>,{data,[2,0,0,0,0,0,0,0,1|<<0>>]}}
>
> Incidentally, this happens about once every minute. Some amount of memory
> appears to leak, as the beam process reserved memory grows some each minute
> until the kernel's OOM killer kicks in after a while.
>
> Rich
>
> On Mon, Sep 24, 2012 at 12:09 PM, Rich Sutton <[email protected]>
> wrote:
>>
>> Hi,
>>
>> I've got a two node riak cluster set up for testing. After joining the
>> second node to the cluster, I've got some failing transfers. Restarts on
>> both nodes don't resolve the situation. Any ideas?
>>
>> From error.log on transferrer node (sylvester.soiq.net):
>>
>> 2012-09-24 12:06:35.598 [error] <0.3180.0> gen_server <0.3180.0>
>> terminated with reason: bad return value: lookup_timeout
>> 2012-09-24 12:06:35.599 [error]
>> <0.3276.0>@riak_core_handoff_sender:start_fold:215 ownership_handoff
>> transfer of riak_search_vnode from '[email protected]'
>> 1004782375664995756265033322492444576013453623296 to
>> '[email protected]' 1004782375664995756265033322492444576013453623296
>> failed because of
>> error:{badmatch,{error,{worker_crash,{bad_return_value,lookup_timeout},{fold,#Fun<merge_index_backend.1.120989340>,#Fun<riak_search_vnode.1.104462514>}}}}
>> [{riak_core_handoff_sender,start_fold,5,[{file,"src/riak_core_handoff_sender.erl"},{line,161}]}]
>> 2012-09-24 12:06:35.617 [error]
>> <0.3277.0>@riak_core_handoff_sender:start_fold:215 ownership_handoff
>> transfer of riak_search_vnode from '[email protected]'
>> 1096126227998177188652763624537212264741949407232 to
>> '[email protected]' 1096126227998177188652763624537212264741949407232
>> failed because of
>> error:{badmatch,{error,{worker_crash,{bad_return_value,lookup_timeout},{fold,#Fun<merge_index_backend.1.120989340>,#Fun<riak_search_vnode.1.104462514>}}}}
>> [{riak_core_handoff_sender,start_fold,5,[{file,"src/riak_core_handoff_sender.erl"},{line,161}]}]
>> 2012-09-24 12:06:35.618 [error] <0.3180.0> CRASH REPORT Process <0.3180.0>
>> with 0 neighbours exited with reason: bad return value: lookup_timeout in
>> gen_server:terminate/6 line 747
>> 2012-09-24 12:06:35.709 [error] <0.1293.0> Supervisor poolboy_sup had
>> child riak_core_vnode_worker started with
>> {riak_core_vnode_worker,start_link,undefined} at <0.3180.0> exit with reason
>> bad return value: lookup_timeout in context child_terminated
>> 2012-09-24 12:06:35.730 [error] <0.3181.0> gen_server <0.3181.0>
>> terminated with reason: bad return value: lookup_timeout
>> 2012-09-24 12:06:35.753 [error] <0.3181.0> CRASH REPORT Process <0.3181.0>
>> with 0 neighbours exited with reason: bad return value: lookup_timeout in
>> gen_server:terminate/6 line 747
>> 2012-09-24 12:06:35.773 [error] <0.1310.0> Supervisor poolboy_sup had
>> child riak_core_vnode_worker started with
>> {riak_core_vnode_worker,start_link,undefined} at <0.3181.0> exit with reason
>> bad return value: lookup_timeout in context child_terminated
>>
>>
>> rich@daffyduck:~$ sudo riak-admin member-status
>> Attempting to restart script through sudo -H -u riak
>> ================================= Membership
>> ==================================
>> Status Ring Pending Node
>>
>> -------------------------------------------------------------------------------
>> valid 37.5% 50.0% '[email protected]'
>> valid 62.5% 50.0% '[email protected]'
>>
>> -------------------------------------------------------------------------------
>> Valid:2 / Leaving:0 / Exiting:0 / Joining:0 / Down:0
>>
>> rich@daffyduck:~$ sudo riak-admin transfers
>> Attempting to restart script through sudo -H -u riak
>> '[email protected]' waiting to handoff 8 partitions
>>
>> Active Transfers:
>>
>> transfer type: ownership_handoff
>> vnode type: riak_search_vnode
>> partition: 1004782375664995756265033322492444576013453623296
>> started: 2012-09-24 18:24:41 [-81984015.00 us ago]
>> last update: no updates seen
>> objects transferred: unknown
>>
>> unknown
>> [email protected] =======================> [email protected]
>> unknown
>>
>> transfer type: ownership_handoff
>> vnode type: riak_search_vnode
>> partition: 1096126227998177188652763624537212264741949407232
>> started: 2012-09-24 18:24:51 [-91982788.00 us ago]
>> last update: no updates seen
>> objects transferred: unknown
>>
>> unknown
>> [email protected] =======================> [email protected]
>> unknown
>>
>> Rich
>>
>
>
>
> --
> Rich Sutton
> CTO / VP of Engineering
> socialiqnetworks.com
> mobile 714-318-7737
>
>
>
> _______________________________________________
> riak-users mailing list
> [email protected]
> http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com
