Hi Ingo,

Sorry for the delay in getting back to you.
This looks symptomatic of some of the scheduler issues we fixed in 1.3. A few of the eleveldb entries in the release notes [1] provide precise details. Is upgrading a possibility?

Raising +zdbbl in vm.args should alleviate some of the issues with busy buffers, but upgrading is probably your best path here.

Hope that helps. Keep us posted.

Mark

[1] https://github.com/basho/riak/blob/master/RELEASE-NOTES.md

On Friday, March 15, 2013, Ingo Rockel wrote:

> Hi,
>
> we have a 12-node cluster running riak 1.2.1 which went live a week ago.
> Yesterday, suddenly, from one minute to the next, the put_fsm_time_95 and the
> get_fsm_time_95 rose from something below 100ms up to several seconds.
> This went on for about 25 min and then went away.
>
> Checking the riak logs of the nodes, I find a lot of these:
>
> 2013-03-14 17:48:06.388 [info] <0.62.0>@riak_core_sysmon_handler:handle_event:89 Monitor got {suppressed,port_events,1}
> 2013-03-14 17:48:06.889 [info] <0.62.0>@riak_core_sysmon_handler:handle_event:85 monitor busy_dist_port <0.7156.1> [{initial_call,{riak_core_vnode,init,1}},{almost_current_function,{erlang,bif_return_trap,1}},{message_queue_len,1}] {#Port<0.9083226>,'[email protected]'}
>
> These messages are logged all day, but only once every few minutes. In
> the problematic time frame between 17:45 and 18:17, however, they get logged
> several times every second. The node IP differs, but it seems only three
> nodes were involved.
>
> Except for these three nodes, the CPU utilisation drops by half during this
> period on all other nodes. On the three nodes there's only a slight drop.
>
> We are using leveldb as storage backend. I also checked some of the LOG
> files of leveldb and there are compactions logged, but these are logged all
> day, every few hours.
>
> During this time our software was quite unresponsive too, so I would like to
> know what was causing this and what I might do to stop it. Any ideas, hints?
>
> I found this:
>
> https://groups.google.com/forum/?fromgroups=#!topic/nosql-databases/GqbaeiKCSYE
>
> where Jon Meredith suggests raising the buffer size to get rid of the
> busy buffers by adding +zdbbl 16384 to the vm.args file. Might this help?
>
> Regards,
>
> Ingo
> --
> Software Architect
>
> Blue Lion mobile GmbH
> Tel. +49 (0) 221 788 797 14
> Fax. +49 (0) 221 788 797 19
> Mob. +49 (0) 176 24 87 30 89
>
> [email protected]
> >>> qeep: Hefferwolf
>
> www.bluelionmobile.com
> www.qeep.net
>
> _______________________________________________
> riak-users mailing list
> [email protected]
> http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com
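
For reference, the vm.args change suggested in the linked thread would look roughly like the fragment below. This is a minimal sketch: the value 16384 is the one from the quoted message, the unit (kilobytes, so about 16 MB) is how the Erlang VM interprets +zdbbl, and the file path in the comment is an assumption that depends on how Riak was installed. vm.args is only read at startup, so a node has to be restarted before the new limit applies.

```
## vm.args (path varies by install; /etc/riak/vm.args is an assumption)
## Raise the distribution buffer busy limit, in kilobytes.
## 16384 is the value suggested in the quoted thread, not a tuned recommendation.
+zdbbl 16384
```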
