Hi --

I'm seeing an issue with timeouts for map/reduce jobs. We're running Erlang files via a curl command, as part of a Haskell job. In the curl data we specify the timeout to be one hour (3,600,000 milliseconds; see the example below). However, the job crashes (times out) after well under an hour (generally 450-1000 seconds). See the sample crash below.

Does anyone have an idea or insight on why that might occur? I've done some searching in the riak_kv code but haven't been able to trace the error through it yet.

A sample map/reduce job (similar to this one) has 3398 keys, which should map to 278012 items to be reduced.

We're using the eleveldb backend with Riak 1.1.1.

One other note: we may be running more than one map/reduce job over the same data simultaneously.

Thanks for your help!

The input to the curl command is something like this:

{
  "inputs" : {
    "bucket" : "data_bucket",
    "index" : "minute_int",
    "start" : 0,
    "end" : 900
  },
  "query" : [
    { "map" : {"language" : "erlang", "module" : "maps", "function" : "emitStatsFromList", "keep" : false } },
    { "reduce" : {"language" : "erlang", "module" : "reduces", "function" : "reduceStatsList", "keep" : true } }
  ],
  "timeout" : 3600000
}
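
For anyone who wants to reproduce this without our Haskell wrapper, here is a minimal Python sketch of the same request. It is not our actual job code: the helper names `build_job` and `submit` are made up for illustration, and the URL assumes a local node with Riak's default HTTP port (8098) and the `/mapred` endpoint. The client-side socket timeout is set a little above the job timeout so that the HTTP client is not the first thing to give up.

```python
import json
import urllib.request


def build_job(bucket, index, start, end, timeout_ms):
    """Build the map/reduce job body. timeout_ms is in milliseconds."""
    return {
        "inputs": {"bucket": bucket, "index": index, "start": start, "end": end},
        "query": [
            {"map": {"language": "erlang", "module": "maps",
                     "function": "emitStatsFromList", "keep": False}},
            {"reduce": {"language": "erlang", "module": "reduces",
                        "function": "reduceStatsList", "keep": True}},
        ],
        "timeout": timeout_ms,
    }


def submit(job, url="http://127.0.0.1:8098/mapred"):
    """POST the job to Riak. The host and port here are assumptions."""
    data = json.dumps(job).encode("utf-8")
    req = urllib.request.Request(
        url, data=data, headers={"Content-Type": "application/json"}
    )
    # urlopen takes its timeout in seconds; allow 60 s of slack over the job timeout.
    client_timeout = job["timeout"] / 1000 + 60
    with urllib.request.urlopen(req, timeout=client_timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))


# One hour expressed in milliseconds, as in the request above.
job = build_job("data_bucket", "minute_int", 0, 900, 60 * 60 * 1000)
print(json.dumps(job, indent=2))
```

Calling `submit(job)` against a running node sends the request; the snippet as written only builds and prints the body.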
Here's a (cleansed) crash log:
2012-07-24 08:01:44 =CRASH REPORT====
crasher:
initial call: riak_pipe_vnode_worker:init/1
pid: <0.28552.254>
registered_name: []
exception exit:
{timeout,{gen_server,call,[{riak_pipe_vnode_master,'[email protected]'},{return_vnode,{riak_vnode_req_v1,593735040165679310520246963290989976735222595584,{raw,#Ref<11882.0.6211.103965>,<0.28552.254>},{cmd_enqueue,{fitting,<11882.27639.38>,#Ref<11882.0.6211.103965>,<<103,28,147,16,123,67,248,114,104,204,9,54,33,62,81,41,129,84,203,83>>,1},{"dummykey",1},infinity,[{593735040165679310520246963290989976735222595584,'[email protected]'}]}}}]}}
in function gen_fsm:terminate/7
in call from proc_lib:init_p_do_apply/3
ancestors: [<0.348.0>,<0.347.0>,riak_core_vnode_sup,riak_core_sup,<0.89.0>]
messages: []
links: [<0.348.0>,<0.347.0>]
dictionary:
[{eunit,[{module,riak_pipe_vnode_worker},{partition,662242929415565384811044689824565743281594433536},{<0.347.0>,<0.347.0>},{details,{fitting_details,{fitting,<11882.27640.38>,#Ref<11882.0.6211.103965>,follow,1},{prereduce,0},riak_kv_w_reduce,{rct,#Fun<reduce_inputs.reduceStatsList.2>,none},{fitting,<11882.27639.38>,#Ref<11882.0.6211.103965>,<<103,28,147,16,123,67,248,114,104,204,9,54,33,62,81,41,129,84,203,83>>,1},[{sink,{fitting,<11882.24969.38>,#Ref<11882.0.6211.103965>,sink,undefined}},{log,sink},{trace,{set,1,16,16,8,80,48,{[],[],[],[],[],[],[],[],[],[],[],[],[],[],[],[]},{{[],[],[error],[],[],[],[],[],[],[],[],[],[],[],[],[]}}}}],64}}]}]
trap_exit: false
status: running
heap_size: 317811
stack_size: 24
reductions: 14770819
neighbours:
2012-07-24 08:01:44 =SUPERVISOR REPORT====
Supervisor: {<0.348.0>,riak_pipe_vnode_worker_sup}
Context: child_terminated
Reason:
{timeout,{gen_server,call,[{riak_pipe_vnode_master,'[email protected]'},{return_vnode,{riak_vnode_req_v1,593735040165679310520246963290989976735222595584,{raw,#Ref<11882.0.6211.103965>,<0.28552.254>},{cmd_enqueue,{fitting,<11882.27639.38>,#Ref<11882.0.6211.103965>,<<103,28,147,16,123,67,248,114,104,204,9,54,33,62,81,41,129,84,203,83>>,1},{"dummykey",1},infinity,[{593735040165679310520246963290989976735222595584,'[email protected]'}]}}}]}}
Offender:
[{pid,<0.28552.254>},{name,undefined},{mfargs,{riak_pipe_vnode_worker,start_link,undefined}},{restart_type,temporary},{shutdown,2000},{child_type,worker}]
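
When going through many of these reports, a small script can help pull out the head of the exit term. The following is only a sketch: the helper name `exit_reason` and the regular expression are illustrative and not part of Riak, and the node name in the sample is a placeholder. It extracts only what the log already states, namely the exit class, the module and function that were called, and the server the call was addressed to.

```python
import re

# Matches the head of an Erlang exit term such as
# {timeout,{gen_server,call,[{riak_pipe_vnode_master,'node@host'},...
EXIT_RE = re.compile(
    r"\{(?P<reason>\w+),"
    r"\{(?P<module>\w+),(?P<function>\w+),"
    r"\[\{(?P<server>\w+),"
)


def exit_reason(term):
    """Return the exit class, called module/function and target server, or None."""
    m = EXIT_RE.search(term)
    if m is None:
        return None
    return m.groupdict()


# Shortened version of the exit term above; 'node@host' is a placeholder.
sample = (
    "{timeout,{gen_server,call,"
    "[{riak_pipe_vnode_master,'node@host'},{return_vnode,...}]}}"
)
print(exit_reason(sample))
```

For the report above this gives the reason `timeout`, the call `gen_server:call`, and the server `riak_pipe_vnode_master`.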