Hi --

I'm seeing an issue with timeouts for map/reduce jobs.  We run Erlang
map/reduce code via a curl command, as part of a Haskell job.  In the
curl data we set the timeout to one hour (3,600,000 milliseconds; see
the example below).  However, the job crashes (times out) well short of
an hour, generally after 450-1000 seconds.  See the sample crash below.

Does anyone have an idea or insight into why that might occur?  I've
searched through the riak_kv code but haven't been able to trace the
error through it yet.

A sample MR job (similar to this one) has 3398 keys, whose map phase
should produce 278012 items to be reduced.

We're using the eleveldb backend with Riak 1.1.1.

One other note: we may be running more than one MR job over the same
data simultaneously.

Thanks for your help!



The input to curl is something like this:
{
    "inputs" : {
        "bucket" : "data_bucket",
        "index" : "minute_int",
        "start" : 0,
        "end" : 900
    },
    "query" : [
        { "map" : {"language" : "erlang", "module" : "maps", "function" : "emitStatsFromList", "keep" : false } },
        { "reduce" : {"language" : "erlang", "module" : "reduces", "function" : "reduceStatsList", "keep" : true } }
    ],
    "timeout" : 3600000
}
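
For anyone who wants to reproduce the request without the Haskell/curl wrapper, here is a minimal Python sketch of the same submission. It assumes Riak's HTTP MapReduce endpoint at /mapred on the default port 8098 of a local node; the host, the helper names (build_job, build_request, run_job) and the 60-second client-side margin are illustrative assumptions, not part of the actual job described above.

```python
import json
import urllib.request

# Assumption: a local node with the default HTTP listener; adjust as needed.
RIAK_MAPRED_URL = "http://127.0.0.1:8098/mapred"


def build_job(timeout_ms):
    """Build the MapReduce job spec shown above, with the given timeout in ms."""
    return {
        "inputs": {
            "bucket": "data_bucket",
            "index": "minute_int",
            "start": 0,
            "end": 900,
        },
        "query": [
            {"map": {"language": "erlang", "module": "maps",
                     "function": "emitStatsFromList", "keep": False}},
            {"reduce": {"language": "erlang", "module": "reduces",
                        "function": "reduceStatsList", "keep": True}},
        ],
        "timeout": timeout_ms,
    }


def build_request(job, url=RIAK_MAPRED_URL):
    """Wrap the job spec in an HTTP POST request with a JSON body."""
    body = json.dumps(job).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def run_job(job, url=RIAK_MAPRED_URL, client_timeout_s=None):
    """Send the job and decode the JSON response.

    The client-side socket timeout defaults to the job timeout plus a
    60-second margin (an arbitrary choice), so the HTTP client does not
    give up before the server-side timeout is reached.
    """
    if client_timeout_s is None:
        client_timeout_s = job["timeout"] / 1000 + 60
    request = build_request(job, url)
    with urllib.request.urlopen(request, timeout=client_timeout_s) as resp:
        return json.loads(resp.read().decode("utf-8"))


if __name__ == "__main__":
    one_hour_ms = 60 * 60 * 1000  # 3,600,000 ms, as in the job spec above
    print(run_job(build_job(one_hour_ms)))
```

Keeping the client-side timeout longer than the job's "timeout" value is only there to rule out the HTTP client as the source of an early abort; it does not address the gen_server call timeout in the crash log below.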


Here's a (cleansed) crash log:

2012-07-24 08:01:44 =CRASH REPORT====
  crasher:
initial call: riak_pipe_vnode_worker:init/1
pid: <0.28552.254>
registered_name: []
exception exit: 
{timeout,{gen_server,call,[{riak_pipe_vnode_master,'[email protected]'},{return_vnode,{riak_vnode_req_v1,593735040165679310520246963290989976735222595584,{raw,#Ref<11882.0.6211.103965>,<0.28552.254>},{cmd_enqueue,{fitting,<11882.27639.38>,#Ref<11882.0.6211.103965>,<<103,28,147,16,123,67,248,114,104,204,9,54,33,62,81,41,129,84,203,83>>,1},{"dummykey",1},infinity,[{593735040165679310520246963290989976735222595584,'[email protected]'}]}}}]}}
in function  gen_fsm:terminate/7
in call from proc_lib:init_p_do_apply/3
ancestors: [<0.348.0>,<0.347.0>,riak_core_vnode_sup,riak_core_sup,<0.89.0>]
messages: []
links: [<0.348.0>,<0.347.0>]
dictionary: 
[{eunit,[{module,riak_pipe_vnode_worker},{partition,662242929415565384811044689824565743281594433536},{<0.347.0>,<0.347.0>},{details,{fitting_details,{fitting,<11882.27640.38>,#Ref<11882.0.6211.103965>,follow,1},{prereduce,0},riak_kv_w_reduce,{rct,#Fun<reduce_inputs.reduceStatsList.2>,none},{fitting,<11882.27639.38>,#Ref<11882.0.6211.103965>,<<103,28,147,16,123,67,248,114,104,204,9,54,33,62,81,41,129,84,203,83>>,1},[{sink,{fitting,<11882.24969.38>,#Ref<11882.0.6211.103965>,sink,undefined}},{log,sink},{trace,{set,1,16,16,8,80,48,{[],[],[],[],[],[],[],[],[],[],[],[],[],[],[],[]},{{[],[],[error],[],[],[],[],[],[],[],[],[],[],[],[],[]}}}}],64}}]}]
trap_exit: false
status: running
heap_size: 317811
stack_size: 24
reductions: 14770819
neighbours:
2012-07-24 08:01:44 =SUPERVISOR REPORT====
Supervisor: {<0.348.0>,riak_pipe_vnode_worker_sup}
Context:    child_terminated
Reason:     
{timeout,{gen_server,call,[{riak_pipe_vnode_master,'[email protected]'},{return_vnode,{riak_vnode_req_v1,593735040165679310520246963290989976735222595584,{raw,#Ref<11882.0.6211.103965>,<0.28552.254>},{cmd_enqueue,{fitting,<11882.27639.38>,#Ref<11882.0.6211.103965>,<<103,28,147,16,123,67,248,114,104,204,9,54,33,62,81,41,129,84,203,83>>,1},{"dummykey",1},infinity,[{593735040165679310520246963290989976735222595584,'[email protected]'}]}}}]}}
Offender:   
[{pid,<0.28552.254>},{name,undefined},{mfargs,{riak_pipe_vnode_worker,start_link,undefined}},{restart_type,temporary},{shutdown,2000},{child_type,worker}]
_______________________________________________
riak-users mailing list
[email protected]
http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com