Hello John,

I'm not really an expert, but looking at your crash log, my first guess is that 
an error occurs in the reduce phase of the map/reduce job itself.
More specifically, I think you need to examine what this bit of your 
crash report means:

> {details,{fitting_details,{fitting,<11882.27640.38>,#Ref<11882.0.6211.103965>,follow,1},{prereduce,0},riak_kv_w_reduce,{rct,#Fun<reduce_inputs.reduceStatsList.2>,none},{fitting,<11882.27639.38>,#Ref<11882.0.6211.103965>,<<103,28,147,16,123,67,248,114,104,204,9,54,33,62,81,41,129,84,203,83>>,1},[{sink,{fitting,<11882.24969.38>,#Ref<11882.0.6211.103965>,sink,undefined}},{log,sink},{trace,{set,1,16,16,8,80,48,{[],[],[],[],[],[],[],[],[],[],[],[],[],[],[],[]},{{[],[],[error],[],[],[],[],[],[],[],[],[],[],[],[],[]}}}}],64}}]}]


But like I said, it's just a hunch.
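
For what it's worth, a rough way to pull the relevant pieces out of a crash report is something like the sketch below. It is only an illustration: the function name `summarize_crash` and the regular expressions are my own assumptions based on the text of this one report, not anything provided by Riak, and it expects the raw log text without the "> " quote prefixes.

```python
import re


def summarize_crash(report: str) -> dict:
    """Pull a few identifying fields out of a crash report string.

    Hypothetical helper: the patterns are guessed from a single sample
    report and may not match other Riak versions or log layouts.
    """
    # The exit reason is the first atom inside the tuple that follows
    # "exception exit:", e.g. {timeout,{gen_server,call,...}}.
    reason = re.search(r"exception exit:\s*\{(\w+),", report)
    # The fitting details name the worker module, e.g. riak_kv_w_reduce.
    module = re.search(r"\b(riak_kv_w_\w+)", report)
    # The trace set in the sample report contains the list [error].
    has_error_trace = "[error]" in report
    return {
        "exit_reason": reason.group(1) if reason else None,
        "fitting_module": module.group(1) if module else None,
        "error_trace": has_error_trace,
    }
```

Calling `summarize_crash(open("crash.log").read())` (the file name is a placeholder) would then show at a glance whether the crashing worker was running the reduce fitting.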

Cheers,
Erik Hoogeveen

On 25 jul 2012, at 20:11, John Roy wrote:

> Hi --
> 
> I'm seeing an issue with timeouts for map/reduces.  We're running erlang 
> files via a curl command, as
> part of a haskell job.  In the curl data we specify the timeout to be one 
> hour (3,600,000 milliseconds --
> see the example below).  However, the job crashes (times out) after well less 
> than an hour 
> (generally 450-1000 seconds).  See the sample crash below.
> 
> Does anyone have an idea or insight on why that might occur?  I've done some 
> searching on the riak_kv
> code but haven't been able to trace the error through it yet.
> 
> A sample mr (similar to this one) has 3398 keys which should map 278012 items 
> to be reduced.
> 
> We're using eleveldb back end with riak 1.1.1  
> 
> One other note, we can be running more than one mr over the same data 
> simultaneously.
> 
> Thanks for your help!
> 
> 
> 
> The input to the curl command is something like this:
> {
>     "inputs" : {
>         "bucket" : "data_bucket",
>         "index" : "minute_int",
>         "start" : 0,
>         "end" : 900
>     },
>     "query" : [
>         { "map" : {"language" : "erlang", "module" : "maps", "function" : "emitStatsFromList", "keep" : false } },
>         { "reduce" : {"language" : "erlang", "module" : "reduces", "function" : "reduceStatsList", "keep" : true } }
>     ],
>     "timeout" : 3600000
> }
> 
> 
> Here's a (cleansed) crash log:
> 
> 2012-07-24 08:01:44 =CRASH REPORT====
>   crasher:
> initial call: riak_pipe_vnode_worker:init/1
> pid: <0.28552.254>
> registered_name: []
> exception exit: 
> {timeout,{gen_server,call,[{riak_pipe_vnode_master,'[email protected]'},{return_vnode,{riak_vnode_req_v1,593735040165679310520246963290989976735222595584,{raw,#Ref<11882.0.6211.103965>,<0.28552.254>},{cmd_enqueue,{fitting,<11882.27639.38>,#Ref<11882.0.6211.103965>,<<103,28,147,16,123,67,248,114,104,204,9,54,33,62,81,41,129,84,203,83>>,1},{"dummykey",1},infinity,[{593735040165679310520246963290989976735222595584,'[email protected]'}]}}}]}}
> in function  gen_fsm:terminate/7
> in call from proc_lib:init_p_do_apply/3
> ancestors: [<0.348.0>,<0.347.0>,riak_core_vnode_sup,riak_core_sup,<0.89.0>]
> messages: []
> links: [<0.348.0>,<0.347.0>]
> dictionary: 
> [{eunit,[{module,riak_pipe_vnode_worker},{partition,662242929415565384811044689824565743281594433536},{<0.347.0>,<0.347.0>},{details,{fitting_details,{fitting,<11882.27640.38>,#Ref<11882.0.6211.103965>,follow,1},{prereduce,0},riak_kv_w_reduce,{rct,#Fun<reduce_inputs.reduceStatsList.2>,none},{fitting,<11882.27639.38>,#Ref<11882.0.6211.103965>,<<103,28,147,16,123,67,248,114,104,204,9,54,33,62,81,41,129,84,203,83>>,1},[{sink,{fitting,<11882.24969.38>,#Ref<11882.0.6211.103965>,sink,undefined}},{log,sink},{trace,{set,1,16,16,8,80,48,{[],[],[],[],[],[],[],[],[],[],[],[],[],[],[],[]},{{[],[],[error],[],[],[],[],[],[],[],[],[],[],[],[],[]}}}}],64}}]}]
> trap_exit: false
> status: running
> heap_size: 317811
> stack_size: 24
> reductions: 14770819
> neighbours:
> 2012-07-24 08:01:44 =SUPERVISOR REPORT====
> Supervisor: {<0.348.0>,riak_pipe_vnode_worker_sup}
> Context:    child_terminated
> Reason:     
> {timeout,{gen_server,call,[{riak_pipe_vnode_master,'[email protected]'},{return_vnode,{riak_vnode_req_v1,593735040165679310520246963290989976735222595584,{raw,#Ref<11882.0.6211.103965>,<0.28552.254>},{cmd_enqueue,{fitting,<11882.27639.38>,#Ref<11882.0.6211.103965>,<<103,28,147,16,123,67,248,114,104,204,9,54,33,62,81,41,129,84,203,83>>,1},{"dummykey",1},infinity,[{593735040165679310520246963290989976735222595584,'[email protected]'}]}}}]}}
> Offender:   
> [{pid,<0.28552.254>},{name,undefined},{mfargs,{riak_pipe_vnode_worker,start_link,undefined}},{restart_type,temporary},{shutdown,2000},{child_type,worker}]
> _______________________________________________
> riak-users mailing list
> [email protected]
> http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com
