Hi, We have been using Riak to gather our test data and analyze the results after each test completes. Recently we have observed Riak crashes in the Riak console logs. These crashes cause our tests to fail to record data to Riak and bail out :-(
The crash logs are as follows:
2016-02-19 16:25:26.255 [error] <0.2160.0> gen_fsm <0.2160.0> in state active
terminated with reason: no function clause matching
riak_kv_vnode:handle_info({#Ref<0.0.482.161540>,{ok,<0.11042.842>}},
{state,268322566228720457638957762256505085639956365312,riak_kv_eleveldb_backend,true,{state,<<>>,...},...})
line 1195
2016-02-19 16:25:26.260 [error] <0.2160.0> CRASH REPORT Process <0.2160.0> with
2 neighbours exited with reason: no function clause matching
riak_kv_vnode:handle_info({#Ref<0.0.482.161540>,{ok,<0.11042.842>}},
{state,268322566228720457638957762256505085639956365312,riak_kv_eleveldb_backend,true,{state,<<>>,...},...})
line 1195 in gen_fsm:terminate/7 line 622
2016-02-19 16:25:26.260 [error] <0.172.0> Supervisor riak_core_vnode_sup had
child undefined started with {riak_core_vnode,start_link,undefined} at
<0.2160.0> exit with reason no function clause matching
riak_kv_vnode:handle_info({#Ref<0.0.482.161540>,{ok,<0.11042.842>}},
{state,268322566228720457638957762256505085639956365312,riak_kv_eleveldb_backend,true,{state,<<>>,...},...})
line 1195 in context child_terminated
2016-02-19 16:25:26.261 [error] <0.4319.0> gen_fsm <0.4319.0> in state ready
terminated with reason: no function clause matching
riak_kv_vnode:handle_info({#Ref<0.0.482.161540>,{ok,<0.11042.842>}},
{state,268322566228720457638957762256505085639956365312,riak_kv_eleveldb_backend,true,{state,<<>>,...},...})
line 1195
2016-02-19 16:25:26.275 [error] <0.4319.0> CRASH REPORT Process <0.4319.0> with
10 neighbours exited with reason: no function clause matching
riak_kv_vnode:handle_info({#Ref<0.0.482.161540>,{ok,<0.11042.842>}},
{state,268322566228720457638957762256505085639956365312,riak_kv_eleveldb_backend,true,{state,<<>>,...},...})
line 1195 in gen_fsm:terminate/7 line 622
2016-02-19 16:25:26.278 [error] <0.4320.0> Supervisor {<0.4320.0>,poolboy_sup}
had child riak_core_vnode_worker started with
riak_core_vnode_worker:start_link([{worker_module,riak_core_vnode_worker},{worker_args,[268322566228720457638957762256505085639956365312,...]},...])
at undefined exit with reason no function clause matching
riak_kv_vnode:handle_info({#Ref<0.0.482.161540>,{ok,<0.11042.842>}},
{state,268322566228720457638957762256505085639956365312,riak_kv_eleveldb_backend,true,{state,<<>>,...},...})
line 1195 in context shutdown_error
2016-02-19 16:25:26.278 [error] <0.4320.0> gen_server <0.4320.0> terminated
with reason: no function clause matching
riak_kv_vnode:handle_info({#Ref<0.0.482.161540>,{ok,<0.11042.842>}},
{state,268322566228720457638957762256505085639956365312,riak_kv_eleveldb_backend,true,{state,<<>>,...},...})
line 1195
2016-02-19 16:25:26.278 [error] <0.4320.0> CRASH REPORT Process <0.4320.0> with
0 neighbours exited with reason: no function clause matching
riak_kv_vnode:handle_info({#Ref<0.0.482.161540>,{ok,<0.11042.842>}},
{state,268322566228720457638957762256505085639956365312,riak_kv_eleveldb_backend,true,{state,<<>>,...},...})
line 1195 in gen_server:terminate/6 line 744
2016-02-19 16:25:26.806 [error] <0.2157.0> gen_fsm <0.2157.0> in state active
terminated with reason: {timeout,{gen_server,call,[<0.5141.0>,stop]}}
2016-02-19 16:25:26.808 [error] <0.2157.0> CRASH REPORT Process <0.2157.0> with
2 neighbours exited with reason: {timeout,{gen_server,call,[<0.5141.0>,stop]}}
in gen_fsm:terminate/7 line 600
2016-02-19 16:25:26.809 [error] <0.5450.0> gen_fsm <0.5450.0> in state ready
terminated with reason: {timeout,{gen_server,call,[<0.5141.0>,stop]}}
2016-02-19 16:25:26.809 [error] <0.172.0> Supervisor riak_core_vnode_sup had
child undefined started with {riak_core_vnode,start_link,undefined} at
<0.2157.0> exit with reason {timeout,{gen_server,call,[<0.5141.0>,stop]}} in
context child_terminated
2016-02-19 16:25:26.809 [error] <0.5450.0> CRASH REPORT Process <0.5450.0> with
10 neighbours exited with reason: {timeout,{gen_server,call,[<0.5141.0>,stop]}}
in gen_fsm:terminate/7 line 622
2016-02-19 16:25:26.809 [error] <0.5451.0> Supervisor {<0.5451.0>,poolboy_sup}
had child riak_core_vnode_worker started with
riak_core_vnode_worker:start_link([{worker_module,riak_core_vnode_worker},{worker_args,[211232658520482062396626323478525280184646500352,...]},...])
at undefined exit with reason {timeout,{gen_server,call,[<0.5141.0>,stop]}} in
context shutdown_error
2016-02-19 16:25:26.809 [error] <0.5451.0> gen_server <0.5451.0> terminated
with reason: {timeout,{gen_server,call,[<0.5141.0>,stop]}}
2016-02-19 16:25:26.809 [error] <0.5451.0> CRASH REPORT Process <0.5451.0> with
0 neighbours exited with reason: {timeout,{gen_server,call,[<0.5141.0>,stop]}}
in gen_server:terminate/6 line 744
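For triage, the excerpt above can be grouped by failure reason. The following is a minimal Python sketch, not a Riak tool: it assumes every entry starts with the timestamp, level and pid layout seen in the log above, it treats any other line as a wrapped continuation of the previous entry, and the labels `function_clause` and `timeout` are hypothetical names chosen only for this example.

```python
import re
from collections import Counter

# Assumed entry start, matching the log above: timestamp, [level], <pid>, message.
ENTRY_START = re.compile(
    r"^(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}\.\d{3}) \[(\w+)\] (<[\d.]+>) (.*)$"
)


def parse_entries(text):
    """Join wrapped lines back into one (timestamp, level, pid, message) per entry."""
    entries = []
    for line in text.splitlines():
        match = ENTRY_START.match(line)
        if match:
            entries.append(list(match.groups()))
        elif entries and line.strip():
            # No timestamp: treat the line as a continuation of the previous entry.
            entries[-1][3] += " " + line.strip()
    return [tuple(entry) for entry in entries]


def classify(message):
    """Map a message to a coarse, hypothetical label for the reasons seen above."""
    if "no function clause matching" in message:
        return "function_clause"
    if "{timeout," in message:
        return "timeout"
    return "other"


def summarize(text):
    """Count error-level entries per failure label."""
    return Counter(
        classify(message)
        for _timestamp, level, _pid, message in parse_entries(text)
        if level == "error"
    )
```

Calling `summarize(open("console.log").read())` (the file name is only an example) returns a `Counter` keyed by those labels. It only counts entries and does not explain why they occur.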
Our setup is as follows:
We have a Riak cluster with 10 nodes. The configuration of each node is as follows:
RAM: 48GB
Disk:
80GB (/)
504GB (separate riak partition)
Riak Version: 2.1.3-1 (2.1.3)
Data in Riak: after observing the crash, total data in the Riak partition was ~50GB
The Riak config is as follows:
riak.conf
[Attached to this email]
advanced.config:
[
{riak_kv, [{add_paths, ["/usr/local/lib/scale_riak/ebin"]}]},
{webmachine, [{backlog, 511}, {nodelay, true}]},
{yokozuna, [{solr_request_timeout, 120000}]}
].
We have observed this a few times now. After each crash, latency increases
and our application starts timing out.
We would really like to understand what might be causing this crash. If it
is due to missing config on our nodes, we would like to fix it.
Thanks for your help in advance :-)
Regards,
Raviraj
_______________________________________________
riak-users mailing list
[email protected]
http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com
