Ingo,
I have two guesses that might explain the symptoms:
- there is a bad drive in one of the nodes, or
- one or more nodes begin to use swap space during a compaction or 2i
iteration.
I might be able to describe / isolate the problem by examining the "LOG" files
produced by leveldb. Would you consider gathering these files from each of the
12 nodes and emailing them directly to my address?
Command to gather:
sort /var/lib/riak/leveldb/*/LOG* >~/nodename.LOG
gzip ~/nodename.LOG
Change "/var/lib/riak" to the data_root path set in the eleveldb section
of app.config, and give each machine a distinct name instead of "nodename".
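If it is easier to run the same thing on all 12 nodes, the gather command
above could be wrapped in a small shell function. This is only a sketch: the
function name and its three arguments are my own placeholders, not anything
Riak or leveldb provides.

```shell
# Hypothetical helper; the name and arguments are illustrative only.
#   $1 = path that stands in for "/var/lib/riak" in the original command
#   $2 = a name that is distinct for this machine
#   $3 = output directory (optional, defaults to $HOME)
gather_leveldb_logs() {
    riak_dir=$1
    node_name=$2
    out_dir=${3:-$HOME}
    out="$out_dir/$node_name.LOG"

    # Merge every file matching LOG* under each leveldb directory into one
    # sorted file, exactly as the one-line command does.
    sort "$riak_dir"/leveldb/*/LOG* > "$out" || return 1

    # Compress the file where it was written; -f replaces an earlier .gz.
    gzip -f "$out" || return 1

    # Print the path of the archive to attach to the email.
    echo "$out.gz"
}

# Example call (the node name "node1" is a placeholder):
#   gather_leveldb_logs /var/lib/riak node1
```

Passing the node name as an argument keeps the 12 archives from overwriting
each other once they are collected in one place.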
Also, please paste the output of one node's /proc/meminfo into the email.
("cat /proc/meminfo")
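For a quick look at the swap guess before sending anything, the SwapTotal and
SwapFree fields of /proc/meminfo can be subtracted. A minimal sketch follows;
the function name is a placeholder of mine, and I would still like the full
meminfo output in the email.

```shell
# Hypothetical helper: prints the kilobytes of swap currently in use,
# computed as SwapTotal - SwapFree from a meminfo-format file.
#   $1 = file to read (optional, defaults to /proc/meminfo)
swap_used_kb() {
    awk '/^SwapTotal:/ {total = $2}
         /^SwapFree:/  {free = $2}
         END {print total - free}' "${1:-/proc/meminfo}"
}

# Example call:
#   swap_used_kb
```

A result that is non-zero, or that grows while a compaction is running, would
point toward the second guess.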
Matthew
_______________________________________________
riak-users mailing list
[email protected]
http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com