Ok thanks Evan.
I didn't post them earlier in this thread because I thought they were just
"info" messages, but I have *a lot* of entries like these:
2013-03-21 20:13:23.945 [info]
<0.12276.176>@riak_core_sysmon_handler:handle_event:85 monitor
busy_dist_port <0.29558.176>
[{initial_call,{riak_core_vnode,init,1}},{almost_current_function,{gen_fsm,loop,7}},{message_queue_len,0}]
{#Port<0.4656154>,'[email protected]'}
2013-03-21 20:13:26.295 [info]
<0.12276.176>@riak_core_sysmon_handler:handle_event:85 monitor
busy_dist_port <0.29046.176>
[{initial_call,{riak_core_vnode,init,1}},{almost_current_function,{gen_fsm,loop,7}},{message_queue_len,0}]
{#Port<0.21529382>,'[email protected]'}
2013-03-21 20:13:27.807 [info]
<0.12276.176>@riak_core_sysmon_handler:handle_event:85 monitor
busy_dist_port <0.29046.176>
[{initial_call,{riak_core_vnode,init,1}},{almost_current_function,{erlang,crc32,2}},{message_queue_len,0}]
{#Port<0.21529382>,'[email protected]'}
2013-03-21 20:13:27.843 [info]
<0.12276.176>@riak_core_sysmon_handler:handle_event:85 monitor
busy_dist_port <0.21168.176>
[{initial_call,{riak_core_vnode,init,1}},{almost_current_function,{gen_fsm,loop,7}},{message_queue_len,0}]
{#Port<0.6629407>,'[email protected]'}
2013-03-21 20:13:30.626 [info]
<0.12276.176>@riak_core_sysmon_handler:handle_event:85 monitor
busy_dist_port <0.29558.176>
[{initial_call,{riak_core_vnode,init,1}},{almost_current_function,{gen_fsm,loop,7}},{message_queue_len,0}]
{#Port<0.6629407>,'[email protected]'}
2013-03-21 20:13:30.771 [info]
<0.12276.176>@riak_core_sysmon_handler:handle_event:85 monitor
busy_dist_port <0.24361.176>
[{initial_call,{riak_core_vnode,init,1}},{almost_current_function,{gen_fsm,loop,7}},{message_queue_len,0}]
{#Port<0.4656154>,'[email protected]'}
2013-03-21 20:13:34.447 [info]
<0.12276.176>@riak_core_sysmon_handler:handle_event:85 monitor
busy_dist_port <0.29558.176>
[{initial_call,{riak_core_vnode,init,1}},{almost_current_function,{gen_fsm,loop,7}},{message_queue_len,0}]
{#Port<0.4656154>,'[email protected]'}
2013-03-21 20:13:36.210 [info]
<0.12276.176>@riak_core_sysmon_handler:handle_event:85 monitor
busy_dist_port <0.6726.946>
[{initial_call,{riak_core_vnode,init,1}},{almost_current_function,{gen_fsm,loop,7}},{message_queue_len,0}]
{#Port<0.4656154>,'[email protected]'}
2013-03-21 20:13:36.501 [info]
<0.12276.176>@riak_core_sysmon_handler:handle_event:85 monitor
busy_dist_port <0.32186.176>
[{initial_call,{riak_core_vnode,init,1}},{almost_current_function,{gen_fsm,loop,7}},{message_queue_len,0}]
{#Port<0.6629407>,'[email protected]'}
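To see whether a few processes or distribution ports account for most of these events, the entries can be tallied. This is only a rough sketch: it assumes the log text has the shape shown above (each entry may be wrapped over several lines or sit on a single line), and the function name tally_busy_dist_port is made up for illustration. In a busy_dist_port event the pid after the keyword is the process that was suspended, and the #Port is the busy inter-node connection.

```python
import re
from collections import Counter

# Matches one busy_dist_port event and tolerates line wraps between fields.
# Group 1: the suspended pid, group 2: the busy distribution port.
EVENT_RE = re.compile(
    r"busy_dist_port\s+(<[\d.]+>)\s+\[.*?\]\s+\{(#Port<[\d.]+>),",
    re.DOTALL,
)


def tally_busy_dist_port(log_text):
    """Return two Counters: events per suspended pid and events per port."""
    pids = Counter()
    ports = Counter()
    for pid, port in EVENT_RE.findall(log_text):
        pids[pid] += 1
        ports[port] += 1
    return pids, ports
```

Calling it on the contents of the node's console log and printing `ports.most_common()` shows which connections are busy most often.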
I guess I have a problem with my network config...
I should point out that the servers hosting my Riak cluster are also running
Couchbase, Nginx and Elasticsearch, so there is a lot of traffic and many connections.
/proc/sys/net/netfilter/nf_conntrack_count = 30-100K
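The count only means something relative to the table limit, which the kernel exposes as nf_conntrack_max in the same directory. When the table is full, netfilter drops packets for new connections. Whether that is what interrupts the transfers here is not established; this is just a minimal sketch for checking how close the count gets to the limit. The function name conntrack_usage and its proc_dir parameter are made up for illustration.

```python
from pathlib import Path

# Directory holding the conntrack counters on a Linux host with netfilter loaded.
PROC_DIR = Path("/proc/sys/net/netfilter")


def conntrack_usage(proc_dir=PROC_DIR):
    """Return (count, limit, ratio) for the connection tracking table."""
    proc_dir = Path(proc_dir)
    count = int((proc_dir / "nf_conntrack_count").read_text())
    limit = int((proc_dir / "nf_conntrack_max").read_text())
    return count, limit, count / limit
```

Sampling this every few seconds while a transfer is running would show whether the ratio approaches 1.0 around the time a transfer fails.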
--
Godefroy de Compreignac
Eklaweb CEO - www.eklaweb.com
EklaBlog CEO - www.eklablog.com
+33(0)6 11 89 13 84
http://www.linkedin.com/in/godefroy
http://twitter.com/Godefroy
2013/3/21 Evan Vigil-McClanahan <[email protected]>
> It could be a large number of things, unfortunately. To go through
> them all is somewhat outside of my skill set. Maybe someone more
> network savvy can provide some pointers?
>
> Perhaps checking with your network admin, or turning any software
> firewalls on your nodes completely off, as a test?
>
> On Thu, Mar 21, 2013 at 11:44 AM, Godefroy de Compreignac
> <[email protected]> wrote:
> > But I don't understand what could stop transfers. Maybe a kernel setting?
> > How could I find out?
> >
> >
> > 2013/3/21 Evan Vigil-McClanahan <[email protected]>
> >>
> >> Handoff is done by default on port 8099.
> >>
> >> I guess what I am getting at here is that this doesn't look like an
> >> obvious riak problem, it's more likely that something on your network
> >> or on your nodes is closing or interrupting those sockets; you'd most
> >> likely get a different error if something internal to riak was causing
> >> the transfers to fail.
> >>
> >> On Thu, Mar 21, 2013 at 10:09 AM, Godefroy de Compreignac
> >> <[email protected]> wrote:
> >> > The only limitation I can see is HAProxy, which has timeouts:
> >> > contimeout 5000
> >> > clitimeout 50000
> >> > srvtimeout 3600000
> >> >
> >> > But Haproxy serves Riak on port 8098 and I configured Riak to use port
> >> > 8097:
> >> > {pb_port, 8087 }
> >> > {http, [ {"5.39.68.152", 8097 } ]}
> >> >
> >> > So I guess Riak uses only port 8097 internally, without any limitation.
> >> >
> >> > And by checking logs, I see that a vnode transfer fails after a random
> >> > duration, sometimes a few minutes.
> >
> >
>
_______________________________________________
riak-users mailing list
[email protected]
http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com