riak (1.3.1), increasing response times over time

Stefan Mees Fri, 31 May 2013 02:02:51 -0700

Hello,

We are seeing suspicious behavior in our Riak cluster (1.3.1): response
times increase every day a Riak node keeps running (especially the put
times). After a node restart everything is fine for ~3-5 days, then we
see response times increasing again.


Dataset size and throughput are the same for the restarted node and the
long-running node, and the number of compactions per node (we are using
bitcask) is also roughly the same. The dataset fits completely in RAM (so
we have no read I/O at all, and write I/O does not saturate the disk either).

This is how it looks: http://puu.sh/3514h.png (riak06 was restarted at
~12 o'clock, riak03 has been running for 8 days).

This is the mean response time graph of both nodes:
http://puu.sh/350Qt.png (the spikes on riak03 look suspicious and
follow a pattern).

riak03 (local node cluster_info dump)
https://gist.github.com/stefan-mees/81ef36c83358ff7d9754/raw/1b33bde80bd0185666c923c1ac35729b18b0cef7/riak03+

riak06 (local node cluster_info dump)
https://gist.github.com/stefan-mees/81ef36c83358ff7d9754/raw/90676383b562b9d9298f59f3d9741bdc8f926271/gistfile1.txt

On riak03 (the node that has been running for x days), one process consumes
a lot of memory compared to riak06 (could this be a GC issue? The process
is bitcask_merge_delete).

Also, CPU and network I/O are not saturated on either node, so we believe
this is somehow an Erlang/Riak issue. We would be really thankful for any help.

/stefan

_______________________________________________
riak-users mailing list
[email protected]
http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com
