Greetings Robinhood Developers and Users,
It seems as though Robinhood changelog processing is having trouble keeping
up on our system. From reading mailing list threads, I believe we should
be expecting higher processing rates. However, trials of various
suggestions have not yielded much better performance. Does anyone have
guidance on how to identify exactly where the bottleneck is, or suggestions
on how to speed things up?
Lustre servers are running 2.5.37-ddn1 (which *does* appear to have the
changelog ordering patch in LU-5405).
Robinhood server is on CentOS-6.7, running lustre-2.5.41-ddn3, and using
robinhood-tmpfs-2.5.5-2.lustre2.5.el6.x86_64. Backend disk is just
internal RAID10 SAS, but the DB pretty much fits in memory.
Here is how we are measuring the "backlog":
[root@lfs-mds-3-1 ~]# cat /proc/fs/lustre/mdd/*MDT*/changelog_users; sleep 60; cat /proc/fs/lustre/mdd/*MDT*/changelog_users
current index: 576107893
ID index
cl1 566324338
current index: 576145483
ID index
cl1 566354338
576107893-566324338 = 9783555
576145483-566354338 = 9791145
So you can see that in this case robinhood is falling behind: the backlog grew
by 7,590 records over that minute. Occasional periods of lower user activity
do let us catch up again for a while, though.
(566354338-566324338)/60 = 500
So it looks like we are processing just 500 records per second. Strangely,
this number is very consistent when re-tested!
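For reference, a small helper along these lines can grab both rates in one
shot (just a sketch; it assumes a single MDT and a single changelog reader
"cl1"):

#!/bin/bash
# Sample the changelog producer index and the cl1 consumer index twice, 60s apart.
sample() {
    awk '/^current index:/ {c = $3} /^cl1/ {u = $2} END {print c, u}' \
        /proc/fs/lustre/mdd/*MDT*/changelog_users
}
read c1 u1 <<< "$(sample)"
sleep 60
read c2 u2 <<< "$(sample)"
echo "produced/sec: $(( (c2 - c1) / 60 ))"
echo "consumed/sec: $(( (u2 - u1) / 60 ))"
echo "backlog now:  $(( c2 - u2 ))"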
Here is how the processing rates look from the server side:
[root@sherwood ~]# tac /db/robinhood/log/robinhood_lfs3.log | grep -m 1 -B 35 -A 4 "STATS | ChangeLog reader" | tac
2016/02/04 18:37:34 [3291/1] STATS | ==================== Dumping stats at 2016/02/04 18:37:34 =====================
2016/02/04 18:37:34 [3291/1] STATS | ======== General statistics =========
2016/02/04 18:37:34 [3291/1] STATS | Daemon start time: 2016/02/01 19:37:12
2016/02/04 18:37:34 [3291/1] STATS | Started modules: log_reader
2016/02/04 18:37:34 [3291/1] STATS | ChangeLog reader #0:
2016/02/04 18:37:34 [3291/1] STATS | fs_name = lfs3
2016/02/04 18:37:34 [3291/1] STATS | mdt_name = MDT0000
2016/02/04 18:37:34 [3291/1] STATS | reader_id = cl1
2016/02/04 18:37:34 [3291/1] STATS | records read = 208328153
2016/02/04 18:37:34 [3291/1] STATS | interesting records = 129868456
2016/02/04 18:37:34 [3291/1] STATS | suppressed records = 78459697
2016/02/04 18:37:34 [3291/1] STATS | records pending = 999
2016/02/04 18:37:34 [3291/1] STATS | last received = 2016/02/04 18:37:34
2016/02/04 18:37:34 [3291/1] STATS | last read record time = 2016/02/04 17:34:23.559036
2016/02/04 18:37:34 [3291/1] STATS | last read record id = 564497302
2016/02/04 18:37:34 [3291/1] STATS | last pushed record id = 564496173
2016/02/04 18:37:34 [3291/1] STATS | last committed record id = 564485779
2016/02/04 18:37:34 [3291/1] STATS | last cleared record id = 564481847
2016/02/04 18:37:34 [3291/1] STATS | read speed = 492.57 record/sec
2016/02/04 18:37:34 [3291/1] STATS | processing speed ratio = 0.33
2016/02/04 18:37:34 [3291/1] STATS | status = busy
2016/02/04 18:37:34 [3291/1] STATS | ChangeLog stats:
2016/02/04 18:37:34 [3291/1] STATS | MARK: 0, CREAT: 17776256, MKDIR: 788677, HLINK: 3212, SLINK: 2422002, MKNOD: 0, UNLNK: 25982485
2016/02/04 18:37:34 [3291/1] STATS | RMDIR: 763162, RENME: 6482894, RNMTO: 0, OPEN: 0, CLOSE: 0, LYOUT: 300, TRUNC: 78659450
2016/02/04 18:37:34 [3291/1] STATS | SATTR: 75397226, XATTR: 26767, HSM: 0, MTIME: 25510, CTIME: 213, ATIME: 0
2016/02/04 18:37:34 [3291/1] STATS | ==== EntryProcessor Pipeline Stats ===
2016/02/04 18:37:34 [3291/1] STATS | Idle threads: 8
2016/02/04 18:37:34 [3291/1] STATS | Id constraints count: 9168 (hash min=0/max=10/avg=0.6)
2016/02/04 18:37:34 [3291/1] STATS | Name constraints count: 9056 (hash min=0/max=41/avg=0.6)
2016/02/04 18:37:34 [3291/1] STATS | Stage              |  Wait | Curr | Done |     Total | ms/op |
2016/02/04 18:37:34 [3291/1] STATS |  0: GET_FID        |     0 |    0 |    0 |         0 |  0.00 |
2016/02/04 18:37:34 [3291/1] STATS |  1: GET_INFO_DB    |    81 |    0 | 8130 | 136887093 |  1.63 |
2016/02/04 18:37:34 [3291/1] STATS |  2: GET_INFO_FS    |     0 |    0 |    0 | 103122533 |  3.70 |
2016/02/04 18:37:34 [3291/1] STATS |  3: REPORTING      |     0 |    0 |    0 |  90558068 |  0.01 |
2016/02/04 18:37:34 [3291/1] STATS |  4: PRE_APPLY      |     0 |    0 |    0 | 113372336 |  0.00 |
2016/02/04 18:37:34 [3291/1] STATS |  5: DB_APPLY       |   952 |    1 |    3 | 113371383 |  1.19 | 63.30% batched (avg batch size: 8.4)
2016/02/04 18:37:34 [3291/1] STATS |  6: CHGLOG_CLR     |     0 |    0 |    0 | 136878007 |  0.01 |
2016/02/04 18:37:34 [3291/1] STATS |  7: RM_OLD_ENTRIES |     0 |    0 |    0 |         0 |  0.00 |
2016/02/04 18:37:34 [3291/1] STATS | DB ops: get=103359170/ins=9377195/upd=81180648/rm=22813326
2016/02/04 18:37:34 [3291/1] STATS | --- Pipeline stage details ---
[root@sherwood ~]# grep " read speed " /db/robinhood/log/robinhood_lfs3.log
| tail
2016/02/04 16:22:34 [3291/1] STATS | read speed = 1438.38 record/sec
2016/02/04 16:37:34 [3291/1] STATS | read speed = 1074.81 record/sec
2016/02/04 16:52:34 [3291/1] STATS | read speed = 961.80 record/sec (960.73 incl. idle time)
2016/02/04 17:07:34 [3291/1] STATS | read speed = 768.85 record/sec
2016/02/04 17:22:34 [3291/1] STATS | read speed = 841.34 record/sec
2016/02/04 17:37:34 [3291/1] STATS | read speed = 1029.85 record/sec
2016/02/04 17:52:34 [3291/1] STATS | read speed = 948.94 record/sec (947.88 incl. idle time)
2016/02/04 18:07:34 [3291/1] STATS | read speed = 750.86 record/sec
2016/02/04 18:22:34 [3291/1] STATS | read speed = 778.77 record/sec
2016/02/04 18:37:34 [3291/1] STATS | read speed = 492.57 record/sec
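To average those samples, something like this untested one-liner should work
(the rate is the field after "=", and 96 samples covers roughly 24h at the
15-minute stats interval):

grep " read speed " /db/robinhood/log/robinhood_lfs3.log | tail -n 96 | \
    awk '{ for (i = 1; i < NF; i++) if ($i == "=") { sum += $(i+1); n++ } }
         END { if (n) printf "avg over %d samples: %.1f rec/sec\n", n, sum / n }'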
NOTE: we can *temporarily* improve the GET_INFO_FS timings by stopping
robinhood, unmounting and remounting the Lustre client, and restarting
robinhood (the cycle is sketched after the output below)...
[root@sherwood ~]# grep ": GET_INFO_FS "
/db/robinhood/log/robinhood_lfs3.log | tail -n 6
2016/02/04 19:07:34 [3291/1] STATS |  2: GET_INFO_FS | 0 | 0 | 0 | 103164823 | 3.70 |
2016/02/04 19:22:34 [3291/1] STATS |  2: GET_INFO_FS | 0 | 0 | 0 | 103185647 | 3.70 |
2016/02/04 19:30:03 [3291/2] STATS |  2: GET_INFO_FS | 0 | 0 | 0 | 103197626 | 3.70 |
2016/02/04 19:49:21 [38629/1] STATS |  2: GET_INFO_FS | 0 | 0 | 0 |     18409 | 1.87 |
2016/02/04 20:04:21 [38629/1] STATS |  2: GET_INFO_FS | 0 | 0 | 0 |     34587 | 1.96 |
2016/02/04 20:19:21 [38629/1] STATS |  2: GET_INFO_FS | 0 | 0 | 0 |     53473 | 2.19 |
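For reference, the stop/remount cycle mentioned above is roughly the
following (hedged: the init-script name is assumed from the stock robinhood
package, and the mount source is taken from the df output further down):

service robinhood stop
umount /mnt/lfs3   # drops the client-side inode/dentry and DLM lock caches
mount -t lustre 10.179.16.132@o2ib:10.179.16.131@o2ib:/lfs3 /mnt/lfs3
service robinhood start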
On the other hand, when using an interactive scan, we get more than 3x
better processing rates:
[root@sherwood ~]# rbh-report -a -f lfs3
Using config file '/etc/robinhood.d/tmpfs/lfs3.conf'.
Filesystem scan activity:
Current scan interval: 7.0h
Previous filesystem scan:
start: 2016/02/01 23:28:26
duration: 1h 12min 41s
Last filesystem scan:
status: done
start: 2016/02/02 23:28:01
end: 2016/02/03 11:58:56
duration: 12h 30min 55s
Statistics:
entries scanned: 135773284
errors: 15
timeouts: 0
# threads: 8
average speed: 3018.87 entries/sec
Changelog stats:
Last read record id: 564895829
Last read record time: 2016/02/04 17:37:14.985039
Last receive time: 2016/02/04 18:52:34
Last committed record id: 564886454
Changelog stats:
type total (diff) (rate)
MARK: 0
CREAT: 48660321 (+10803) (12.00/sec)
MKDIR: 2040529 (+354) (0.39/sec)
HLINK: 14134
SLINK: 6563209 (+1327) (1.47/sec)
MKNOD: 0
UNLNK: 64958376 (+340350) (378.17/sec)
RMDIR: 2028347 (+814) (0.90/sec)
RENME: 17569210 (+3426) (3.81/sec)
RNMTO: 0
OPEN: 0
CLOSE: 0
LYOUT: 553
TRUNC: 179191635 (+33158) (36.84/sec)
SATTR: 240069546 (+8296) (9.22/sec)
XATTR: 29642
HSM: 0
MTIME: 59454 (+1) (0.00/sec)
CTIME: 371
ATIME: 0
Storage usage has never been checked
No purge was performed on this filesystem
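(For completeness, the one-shot scans above are launched along these lines;
this is paraphrased, so check "robinhood --help" for the exact 2.5 flags:)

robinhood -f lfs3 --scan --once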
The filesystem in question is:
[root@sherwood ~]# rbh-report -i -f lfs3
Using config file '/etc/robinhood.d/tmpfs/lfs3.conf'.
type , count, volume, avg_size
symlink , 4819982, 399.90 MB, 87
dir , 5413496, 29.48 GB, 5.71 KB
file , 125761404, 1.04 PB, 8.92 MB
sock , 2, 0, 0
Total: 135994884 entries, 1175890771195111 bytes (1.04 PB)
[root@sherwood ~]# df -h /mnt/lfs3
Filesystem Size Used Avail Use% Mounted on
10.179.16.132@o2ib:10.179.16.131@o2ib:/lfs3
2.1P 1.1P 970T 53% /mnt/lfs3
Some of our current tuning attempts, based on previous recommendations on
this list, are:
* sysctl vm.vfs_cache_pressure = 150
* MySQL config (/etc/my.cnf):
innodb_thread_concurrency=32
innodb_max_dirty_pages_pct=15
max_connections=256
query_cache_size=8M
thread_cache_size=4
innodb_buffer_pool_size=240G (~85% of actual memory)
* Robinhood config:
batch_ack_count = 10000 ;
force_polling = ON ;
polling_interval = 1s ;
queue_max_size = 1000 ; # (could be 2x larger?)
queue_max_age = 5s ; # (could be 2x larger?)
queue_check_interval = 1s ;
* Lustre parameters:
llite.*.statahead_max=4
mdc.*.max_rpcs_in_flight=64
ldlm.namespaces.*osc*.lru_max_age=1200
ldlm.namespaces.*osc*.lru_size=100
* Dropping inode cache daily, with:
echo 2 > /proc/sys/vm/drop_caches # (could be hourly? see the cron sketch below)
# (would it be better to run "lru_size=clear" hourly?)
* Tuning module "ko2iblnd peer_credits" has NOT yet been attempted.
(unclear what a good value would be; a hypothetical modprobe snippet is below)
[root@lfs-mds-3-1 ~]# cat /proc/sys/lnet/peers | egrep "^(nid|10.179.17.41@o2ib)"
nid                refs state last max rtr min  tx min queue
10.179.17.41@o2ib     1    NA   -1   8   8   8   8 -91      0
[root@sherwood ~]# cat /proc/sys/lnet/peers | egrep "^(nid|10.179.16.131@o2ib)"
nid                 refs state last max rtr min  tx   min queue
10.179.16.131@o2ib     1    NA   -1   8   8   8   8 -1188     0
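Regarding the hourly drop_caches / "lru_size=clear" question in the list
above, the variants being considered would look something like this
(hypothetical cron entries, not in place yet):

# /etc/cron.d/lustre-cache-drop (hypothetical)
0 * * * * root /bin/sh -c 'echo 2 > /proc/sys/vm/drop_caches'   # dentries/inodes only
# ...or, instead, flush the client's DLM lock LRU each hour:
0 * * * * root /usr/sbin/lctl set_param 'ldlm.namespaces.*.lru_size=clear'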
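And if we do experiment with ko2iblnd peer_credits, presumably it would be
via a modprobe option like the following (the values are guesses, and LNet
has to be unloaded/reloaded for them to take effect):

# /etc/modprobe.d/ko2iblnd.conf (hypothetical values)
options ko2iblnd peer_credits=32 credits=256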
Thanks for any suggestions!!!
-Nathan