I have a 7 node Riak cluster. Most nodes are on 1.4.10 and I am currently
migrating the nodes from RedHat to Ubuntu (and Ext4 -> ZFS).
One of the nodes is running Ubuntu 14 with ZFS for the data and has Riak 1.4.12
installed. On this node Riak is continuously reading and writing a lot of data.
Even hours after it is started. There are no running transfers (riak-admin
transfers). The other nodes are not very busy, almost no CPU, no I/O. For this
one node see the output of iotop, uname -a, riak-admin status and some recent
logging below.
There is nothing obvious in the Riak logging, except that there is an
occasional reading/writing large object (once a minute or so, objects of around
5Mb-10Mb).
Is there a way to see what Riak is doing, what it is reading/writing?
Thanks,
Timo
Total DISK READ : 81.99 M/s | Total DISK WRITE : 33.33 M/s
Actual DISK READ: 39.59 M/s | Actual DISK WRITE: 241.72 M/s
TID PRIO USER DISK READ DISK WRITE SWAPIN IO> COMMAND
8723 be/4 riak 23.82 M/s 12.05 M/s 0.00 % 80.54 % beam.smp -K true -A
64~ -smp enable -- console
8725 be/4 riak 4.80 M/s 814.60 B/s 0.00 % 72.67 % beam.smp -K true -A
64~ -smp enable -- console
8657 be/4 riak 3.83 M/s 814.60 B/s 0.00 % 68.98 % beam.smp -K true -A
64~ -smp enable -- console
698 be/0 root 0.00 B/s 0.00 B/s 0.00 % 47.67 % [txg_sync]
8724 be/4 riak 13.07 M/s 7.68 M/s 0.00 % 39.04 % beam.smp -K true -A
64~ -smp enable -- console
8726 be/4 riak 50.91 K/s 13.59 M/s 0.00 % 36.77 % beam.smp -K true -A
64~ -smp enable -- console
7917 be/4 root 51.11 K/s 0.00 B/s 0.00 % 4.26 % [kworker/u16:5]
234 be/3 root 0.00 B/s 1629.19 B/s 0.00 % 0.01 % [jbd2/sda6-8]
8302 be/4 riak 0.00 B/s 814.60 B/s 0.00 % 0.00 % run_erl -daemon
/tmp/r~ /usr/sbin/riak console
9942 be/4 root 0.00 B/s 0.00 B/s 0.00 % 0.00 % [kworker/u16:6]
10319 be/4 root 0.00 B/s 0.00 B/s 0.00 % 0.00 % [kworker/u16:0]
10320 be/4 root 0.00 B/s 0.00 B/s 0.00 % 0.00 % [kworker/u16:1]
8401 be/4 riak 0.00 B/s 814.60 B/s 0.00 % 0.00 % beam.smp -K true -A
#uname -a
Linux tx4 3.13.0-49-generic #83-Ubuntu SMP Fri Apr 10 20:11:33 UTC 2015 x86_64
x86_64 x86_64 GNU/Linux
sys_system_version : <<"Erlang R15B01 (erts-5.9.1) [source] [64-bit] [smp:8:8]
[async-threads:64] [kernel-poll:true]">>
sys_system_architecture : <<"x86_64-unknown-linux-gnu">>
sys_threads_enabled : true
sys_thread_pool_size : 64
sys_wordsize : 8
ring_num_partitions : 64
ring_creation_size : 64
storage_backend : riak_kv_multi_backend
erlydtl_version : <<"0.7.0">>
riak_control_version : <<"1.4.12-0-g964c5db">>
cluster_info_version : <<"1.2.4">>
riak_search_version : <<"1.4.12-0-g7fe0e00">>
merge_index_version : <<"1.3.2-0-gcb38ee7">>
riak_kv_version : <<"1.4.12-0-gc6bbd66">>
1-minute stats for ...
-------------------------------------------
riak_kv_stat_ts : 1432301431
vnode_gets : 748
vnode_gets_total : 10051
vnode_puts : 182
vnode_puts_total : 3690
vnode_index_refreshes : 0
vnode_index_refreshes_total : 0
vnode_index_reads : 10
vnode_index_reads_total : 100
vnode_index_writes : 182
vnode_index_writes_total : 3686
vnode_index_writes_postings : 2
vnode_index_writes_postings_total : 139
vnode_index_deletes : 0
vnode_index_deletes_total : 0
vnode_index_deletes_postings : 0
vnode_index_deletes_postings_total : 13
node_gets : 0
node_gets_total : 0
node_get_fsm_siblings_mean : 0
node_get_fsm_siblings_median : 0
node_get_fsm_siblings_95 : 0
node_get_fsm_siblings_99 : 0
node_get_fsm_siblings_100 : 0
node_get_fsm_objsize_mean : 0
node_get_fsm_objsize_median : 0
node_get_fsm_objsize_95 : 0
node_get_fsm_objsize_99 : 0
node_get_fsm_objsize_100 : 0
node_get_fsm_time_mean : 0
node_get_fsm_time_median : 0
node_get_fsm_time_95 : 0
node_get_fsm_time_99 : 0
node_get_fsm_time_100 : 0
node_puts : 64
node_puts_total : 1196
node_put_fsm_time_mean : 691827
node_put_fsm_time_median : 2168
node_put_fsm_time_95 : 4415096
node_put_fsm_time_99 : 8226455
node_put_fsm_time_100 : 12045311
read_repairs : 0
read_repairs_total : 0
coord_redirs_total : 0
==> /var/log/riak/erlang.log.1 <==
08:41:19.235 [info] monitor long_gc <0.7.0>
[{name,application_controller},{initial_call,{application_controller,start,1}},{almost_current_function,{gen_server,loop,6}},{message_queue_len,0}]
[{timeout,293},{old_heap_block_size,0},{heap_block_size,6765},{mbuf_size,0},{stack_size,23},{old_heap_size,0},{heap_size,2142}]
08:41:19.437 [warning] Writing very large object (19730102 bytes) to <<...>>
==> /var/log/riak/console.log <==
2015-05-22 08:41:19.235 [info]
<0.96.0>@riak_core_sysmon_handler:handle_event:92 monitor long_gc <0.7.0>
[{name,application_controller},{initial_call,{application_controller,start,1}},{almost_current_function,{gen_server,loop,6}},{message_queue_len,0}]
[{timeout,293},{old_heap_block_size,0},{heap_block_size,6765},{mbuf_size,0},{stack_size,23},{old_heap_size,0},{heap_size,2142}]
2015-05-22 08:41:19.437 [warning]
<0.710.0>@riak_kv_vnode:encode_and_put_no_sib_check:1900 Writing very large
object (19730102 bytes) to <<...>>
==> /var/log/riak/erlang.log.1 <==
08:41:23.417 [info] monitor long_gc <0.436.0>
[{name,proxy_riak_kv_vnode_68507889249886074290797726533575766546371837952},{initial_call,{riak_core_vnode_proxy,init,1}},{almost_current_function,{riak_core_vnode_proxy,loop,2}},{message_queue_len,0}]
[{timeout,2093},{old_heap_block_size,0},{heap_block_size,987},{mbuf_size,0},{stack_size,5},{old_heap_size,0},{heap_size,481}]
==> /var/log/riak/console.log <==
2015-05-22 08:41:23.417 [info]
<0.96.0>@riak_core_sysmon_handler:handle_event:92 monitor long_gc <0.436.0>
[{name,proxy_riak_kv_vnode_68507889249886074290797726533575766546371837952},{initial_call,{riak_core_vnode_proxy,init,1}},{almost_current_function,{riak_core_vnode_proxy,loop,2}},{message_queue_len,0}]
[{timeout,2093},{old_heap_block_size,0},{heap_block_size,987},{mbuf_size,0},{stack_size,5},{old_heap_size,0},{heap_size,481}]
==> /var/log/riak/erlang.log.1 <==
08:41:25.319 [warning] Reading large object of size 12178287 from <<...>>
==> /var/log/riak/console.log <==
2015-05-22 08:41:25.319 [warning] <0.707.0>@riak_kv_vnode:do_get_object:1365
Reading large object of size 12178287 from <<..">>
==> /var/log/riak/erlang.log.1 <==
08:42:00.687 [info] monitor long_gc <0.707.0>
[{initial_call,{riak_core_vnode,init,1}},{almost_current_function,{eleveldb,get,3}},{message_queue_len,7}]
[{timeout,485},{old_heap_block_size,0},{heap_block_size,2584},{mbuf_size,0},{stack_size,46},{old_heap_size,0},{heap_size,909}]
==> /var/log/riak/console.log <==
2015-05-22 08:42:00.687 [info]
<0.96.0>@riak_core_sysmon_handler:handle_event:92 monitor long_gc <0.707.0>
[{initial_call,{riak_core_vnode,init,1}},{almost_current_function,{eleveldb,get,3}},{message_queue_len,7}]
[{timeout,485},{old_heap_block_size,0},{heap_block_size,2584},{mbuf_size,0},{stack_size,46},{old_heap_size,0},{heap_size,909}]
==> /var/log/riak/erlang.log.1 <==
08:42:40.036 [info] monitor long_gc <0.712.0>
[{initial_call,{riak_core_vnode,init,1}},{almost_current_function,{eleveldb,write,3}},{message_queue_len,0}]
[{timeout,759},{old_heap_block_size,0},{heap_block_size,4181},{mbuf_size,0},{stack_size,73},{old_heap_size,0},{heap_size,1683}]
==> /var/log/riak/console.log <==
2015-05-22 08:42:40.036 [info]
<0.96.0>@riak_core_sysmon_handler:handle_event:92 monitor long_gc <0.712.0>
[{initial_call,{riak_core_vnode,init,1}},{almost_current_function,{eleveldb,write,3}},{message_queue_len,0}]
[{timeout,759},{old_heap_block_size,0},{heap_block_size,4181},{mbuf_size,0},{stack_size,73},{old_heap_size,0},{heap_size,1683}]
Some of the non-standard /etc/riak/vm.args:
## Increase number of concurrent ports/sockets
-env ERL_MAX_PORTS 64000
## Tweak GC to run more often
-env ERL_FULLSWEEP_AFTER 0
## Set the location of crash dumps
-env ERL_CRASH_DUMP /var/log/riak/erl_crash.dump
## Raise the ETS table limit
-env ERL_MAX_ETS_TABLES 256000
## Force the erlang VM to use SMP
-smp enable
## For nodes with many busy_dist_port events, Basho recommends
## raising the sender-side network distribution buffer size.
## 32MB may not be sufficient for some workloads and is a suggested
## starting point.
## The Erlang/OTP default is 1024 (1 megabyte).
## See: http://www.erlang.org/doc/man/erl.html#%2bzdbbl
+zdbbl 32768
## Raise the default erlang process limit
+P 256000
Some of the non-standard /etc/riak/app.config:
%% -*- tab-width: 4;erlang-indent-level: 4;indent-tabs-mode: nil -*-
%% ex: ts=4 sw=4 et
[
%% For firewall config limit ports
{ kernel, [
{inet_dist_listen_min, 6000},
{inet_dist_listen_max, 7999}
]},
%% Riak Client APIs config
%% riak_handoff_port is the TCP port that Riak uses for
%% intra-cluster data handoff.
{handoff_port, 8099 },
{handoff_concurrency, 3},
%% Riak KV config
{riak_kv, [
%% Storage_backend specifies the Erlang module defining the storage
%% mechanism that will be used on this node.
{storage_backend, riak_kv_multi_backend},
{multi_backend_default, <<"eleveldb">>},
{multi_backend, [
{<<"eleveldb">>, riak_kv_eleveldb_backend, [
{data_root, "/data/riak/leveldb"}
]},
{<<"fs">>, riak_kv_eleveldb_backend, [
{data_root, "/bigdata/riak/bigleveldb"}
]},
{<<"cache">>, riak_kv_memory_backend, [
{max_memory, 16}, %% 16Mb
{ttl, 86400} %% 1 Day in seconds
]}
]},
{anti_entropy, {off, []}},
{vnode_vclocks, true},
{legacy_keylisting, false},
{listkeys_backpressure, true},
{legacy_vnode_routing, false},
{fsm_limit, 50000},
{object_format, v1}
]},
{riak_search, [
%% To enable Search functionality set this 'true'.
{enabled, false}
]},
_______________________________________________
riak-users mailing list
[email protected]
http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com