Hello,

We are seeing suspicious behavior in our Riak cluster (1.3.1): response times increase with every day a Riak node keeps running (especially the put times). After a node restart everything is fine for ~3-5 days, then response times start increasing again.
Dataset size and throughput are the same for the restarted node and the long-running node, and the number of compactions per node (we are using Bitcask) is roughly the same. The dataset fits completely in RAM, so we have no read I/O at all, and write I/O does not saturate the disk either.

This is what it looks like: http://puu.sh/3514h.png (riak06 was restarted at ~12 o'clock, riak03 has been running for 8 days).

This is the mean response time graph of both nodes: http://puu.sh/350Qt.png (the spikes on riak03 look suspicious and follow a pattern).

riak03 (local node cluster_info dump): https://gist.github.com/stefan-mees/81ef36c83358ff7d9754/raw/1b33bde80bd0185666c923c1ac35729b18b0cef7/riak03+
riak06 (local node cluster_info dump): https://gist.github.com/stefan-mees/81ef36c83358ff7d9754/raw/90676383b562b9d9298f59f3d9741bdc8f926271/gistfile1.txt

On riak03 (the node that has been running for x days), one process consumes a lot of memory compared to riak06. The process is bitcask_merge_delete; could this be a GC issue?

CPU and network I/O are not saturated on either node, so we believe this is somehow an Erlang/Riak issue.

We would be really thankful for any help.

/stefan
