The behaviour you describe when doing full scans, where the rate starts high and then slows over time, is similar to what I found. The slow-down happens when the system cache/buffers fill; the scan rate is then limited by the timeout/aging rate of the Lustre LDLM LRU locks. If you "dump" the system cache via the sysctl param "vm.drop_caches", the rate should speed up temporarily (FWIW, this will also dump your SQL DB buffers). You may also wish to investigate the sysctl param "vfs_cache_pressure" and the Lustre ldlm params "lru_size" & "lru_max_age".
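In case it helps, here is a minimal sketch of those knobs on the robinhood node, assuming a typical Lustre 2.5 client; the specific values below are placeholders, not recommendations:

# drop page cache / dentries / inodes (note: this also flushes the OS
# buffers MySQL benefits from, so expect a temporary DB slowdown):
sync
echo 3 > /proc/sys/vm/drop_caches

# make the kernel reclaim dentry/inode caches more aggressively
# (default is 100; higher means reclaim sooner):
sysctl -w vm.vfs_cache_pressure=200

# inspect and, if needed, cap the client-side LDLM lock LRU:
lctl get_param ldlm.namespaces.*.lru_size ldlm.namespaces.*.lru_max_age
lctl set_param ldlm.namespaces.*.lru_size=2000     # example value only
lctl set_param ldlm.namespaces.*.lru_max_age=600   # check units (s vs ms) for your release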
regards,
chris hunter

> Date: Sun, 7 Feb 2016 12:00:52 -0700
> From: Nathan Dauchy - NOAA Affiliate <[email protected]>
> Subject: Re: [robinhood-support] How to improve changelog processing rates
>
> Colin,
>
> Thanks for your reply. I look forward to hearing what "best practices" you
> come up with for improving performance. We have already used the
> http://mysqltuner.pl/ script, and matched the server memory to the DB size,
> but not done much beyond that.
>
> That said, I don't think the problem with the system is the database. When
> running a full scan, we get "average speed: 3018.87 entries/sec", which I
> would have thought is more stressful on the DB than changelog processing,
> and yet it runs several times faster.
>
> I also just tried another test: since our changelog backlog was so bad and
> getting worse, I figured we could pick up the missed bits with a full scan
> later, so I went ahead and cleared the changelogs. That seemed to let
> Robinhood progress much faster, at least for a little while.
>
> [root@sherwood ~]# lfs changelog_clear lfs3-MDT0000 cl1 0
>
> Before the clear:
>
> 2016/02/04 19:49:21 [38629/1] STATS |    read speed = 484.42 record/sec
> 2016/02/04 20:04:21 [38629/1] STATS |    read speed = 448.04 record/sec
> 2016/02/04 20:19:21 [38629/1] STATS |    read speed = 458.21 record/sec
> 2016/02/04 20:34:21 [38629/1] STATS |    read speed = 464.67 record/sec
> 2016/02/04 20:49:21 [38629/1] STATS |    read speed = 460.15 record/sec
> 2016/02/04 21:04:21 [38629/1] STATS |    read speed = 473.90 record/sec
>
> After the clear:
>
> 2016/02/04 21:19:21 [38629/1] STATS |    read speed = 30047.39 record/sec (14756.61 incl. idle time)
> 2016/02/04 21:34:22 [38629/1] STATS |    read speed = 16627.84 record/sec (942.24 incl. idle time)
> 2016/02/04 21:49:22 [38629/1] STATS |    read speed = 1552.43 record/sec (1216.07 incl. idle time)
> 2016/02/04 22:04:22 [38629/1] STATS |    read speed = 2077.57 record/sec (1041.09 incl. idle time)
> 2016/02/04 22:19:22 [38629/1] STATS |    read speed = 3987.93 record/sec (908.36 incl. idle time)
> 2016/02/04 22:34:22 [38629/1] STATS |    read speed = 2482.72 record/sec (918.61 incl. idle time)
> 2016/02/04 22:49:22 [38629/1] STATS |    read speed = 1997.80 record/sec (854.61 incl. idle time)
> 2016/02/04 23:04:22 [38629/1] STATS |    read speed = 2259.46 record/sec (1031.82 incl. idle time)
> 2016/02/04 23:19:22 [38629/1] STATS |    read speed = 1521.73 record/sec (953.62 incl. idle time)
> 2016/02/04 23:34:22 [38629/1] STATS |    read speed = 1321.58 record/sec (879.58 incl. idle time)
>
> Current counters:
>
> [root@lfs-mds-3-1 ~]# cat /proc/fs/lustre/mdd/*MDT*/changelog_users
> current index: 591196749
> ID    index
> cl1   591164109
>
> So, a backlog of only 32640 after running for 2.5 hours, which is WAY
> better than it was before.
>
> That sounds a lot like https://jira.hpdd.intel.com/browse/LU-5405. So,
> just to be sure, I went back and verified our lustre server source...
>
> [root@lfs-mds-3-1 lustre-2.5.37.ddn1]# grep -n -m 1 -C 2 cfs_list_add lustre/obdclass/llog_cat.c
> 202-
> 203-        down_write(&cathandle->lgh_lock);
> 204:        cfs_list_add_tail(&loghandle->u.phd.phd_entry,
>                               &cathandle->u.chd.chd_head);
> 205-        up_write(&cathandle->lgh_lock);
> 206-
>
> ...and that looks like we DO have the fix.
>
> So, is there perhaps another bug here in changelog processing?
>
> Other tuning that I'm missing?
>
> Thanks,
> Nathan
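As a quick way to watch that read-speed trend over time (assuming the robinhood log path from the stats dump further down), the rate lines can be pulled straight out of the log:

# recent read-rate samples:
grep "read speed" /db/robinhood/log/robinhood_lfs3.log | tail -n 20

# or just the numbers, e.g. to average or plot them:
awk '/read speed/ { for (i = 1; i <= NF; i++) if ($i == "=") print $(i + 1) }' \
    /db/robinhood/log/robinhood_lfs3.log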
>
> On Thu, Feb 4, 2016 at 1:57 PM, Colin Faber <[email protected]> wrote:
>
>> Hi Nathan,
>>
>> I'm actively working on similar benchmarking activities. From what I've
>> found so far, most of my slowdowns are within the database processing
>> itself. This can be observed by utilizing db benchmarking tools (sysbench
>> with oltp testing).
>>
>> From what I've found, robinhood usually sees double the best-case
>> performance (transactions per second) on a complex, read-only workload
>> over a million-record set.
>>
>> These rates are pretty bad, even on beefy hardware. Once you've solved
>> your database performance problems, you get to move on to issues with
>> FID lookup. There are a few tickets (which I don't have off hand) to
>> improve this (multiple FID lookups per request, etc.), but it's still slow.
>>
>> In general though, changelog reading shouldn't be a bottleneck; even on my
>> single-processor E5-2609 based system I'm able to read almost 100k
>> records/second.
>>
>> -cf
>>
>>
>> On Thu, Feb 4, 2016 at 1:40 PM, Nathan Dauchy - NOAA Affiliate <
>> [email protected]> wrote:
>>
>>> Greetings Robinhood Developers and Users,
>>>
>>> It seems as though Robinhood changelog processing is having trouble
>>> keeping up on our system. From reading mailing list threads, I believe we
>>> should be expecting higher processing rates. However, trials of various
>>> suggestions have not yielded much better performance. Does anyone have
>>> guidance on how to identify exactly where the bottleneck is, or
>>> suggestions on how to speed things up?
>>>
>>> Lustre servers are running 2.5.37-ddn1 (which *does* appear to have the
>>> changelog ordering patch in LU-5405).
>>>
>>> Robinhood server is on CentOS-6.7, running lustre-2.5.41-ddn3, and using
>>> robinhood-tmpfs-2.5.5-2.lustre2.5.el6.x86_64. Backend disk is just
>>> internal RAID10 SAS, but the DB pretty much fits in memory.
>>>
>>> Here is how we are measuring the "backlog":
>>>
>>> [root@lfs-mds-3-1 ~]# cat /proc/fs/lustre/mdd/*MDT*/changelog_users; sleep 60; cat /proc/fs/lustre/mdd/*MDT*/changelog_users
>>> current index: 576107893
>>> ID    index
>>> cl1   566324338
>>> current index: 576145483
>>> ID    index
>>> cl1   566354338
>>>
>>> 576107893 - 566324338 = 9783555
>>> 576145483 - 566354338 = 9791145
>>> So you can see that in this case robinhood is falling behind, though
>>> occasional periods of lower user activity let us start to catch up again
>>> every once in a while.
>>>
>>> (566354338 - 566324338) / 60 = 500
>>> So it looks like we are processing just 500 records per second.
>>> Strangely, this number is very consistent when re-tested!
>>>
>>>
>>> Here is how the processing rates look from the server side:
>>>
>>> [root@sherwood ~]# tac /db/robinhood/log/robinhood_lfs3.log | grep -m 1 -B 35 -A 4 "STATS | ChangeLog reader" | tac
>>> 2016/02/04 18:37:34 [3291/1] STATS | ==================== Dumping stats at 2016/02/04 18:37:34 =====================
>>> 2016/02/04 18:37:34 [3291/1] STATS | ======== General statistics =========
>>> 2016/02/04 18:37:34 [3291/1] STATS | Daemon start time: 2016/02/01 19:37:12
>>> 2016/02/04 18:37:34 [3291/1] STATS | Started modules: log_reader
>>> 2016/02/04 18:37:34 [3291/1] STATS | ChangeLog reader #0:
>>> 2016/02/04 18:37:34 [3291/1] STATS |    fs_name                  = lfs3
>>> 2016/02/04 18:37:34 [3291/1] STATS |    mdt_name                 = MDT0000
>>> 2016/02/04 18:37:34 [3291/1] STATS |    reader_id                = cl1
>>> 2016/02/04 18:37:34 [3291/1] STATS |    records read             = 208328153
>>> 2016/02/04 18:37:34 [3291/1] STATS |    interesting records      = 129868456
>>> 2016/02/04 18:37:34 [3291/1] STATS |    suppressed records       = 78459697
>>> 2016/02/04 18:37:34 [3291/1] STATS |    records pending          = 999
>>> 2016/02/04 18:37:34 [3291/1] STATS |    last received            = 2016/02/04 18:37:34
>>> 2016/02/04 18:37:34 [3291/1] STATS |    last read record time    = 2016/02/04 17:34:23.559036
>>> 2016/02/04 18:37:34 [3291/1] STATS |    last read record id      = 564497302
>>> 2016/02/04 18:37:34 [3291/1] STATS |    last pushed record id    = 564496173
>>> 2016/02/04 18:37:34 [3291/1] STATS |    last committed record id = 564485779
>>> 2016/02/04 18:37:34 [3291/1] STATS |    last cleared record id   = 564481847
>>> 2016/02/04 18:37:34 [3291/1] STATS |    read speed               = 492.57 record/sec
>>> 2016/02/04 18:37:34 [3291/1] STATS |    processing speed ratio   = 0.33
>>> 2016/02/04 18:37:34 [3291/1] STATS |    status = busy
>>> 2016/02/04 18:37:34 [3291/1] STATS |    ChangeLog stats:
>>> 2016/02/04 18:37:34 [3291/1] STATS |    MARK: 0, CREAT: 17776256, MKDIR: 788677, HLINK: 3212, SLINK: 2422002, MKNOD: 0, UNLNK: 25982485
>>> 2016/02/04 18:37:34 [3291/1] STATS |    RMDIR: 763162, RENME: 6482894, RNMTO: 0, OPEN: 0, CLOSE: 0, LYOUT: 300, TRUNC: 78659450
>>> 2016/02/04 18:37:34 [3291/1] STATS |    SATTR: 75397226, XATTR: 26767, HSM: 0, MTIME: 25510, CTIME: 213, ATIME: 0
>>> 2016/02/04 18:37:34 [3291/1] STATS | ==== EntryProcessor Pipeline Stats ===
>>> 2016/02/04 18:37:34 [3291/1] STATS | Idle threads: 8
>>> 2016/02/04 18:37:34 [3291/1] STATS | Id constraints count: 9168 (hash min=0/max=10/avg=0.6)
>>> 2016/02/04 18:37:34 [3291/1] STATS | Name constraints count: 9056 (hash min=0/max=41/avg=0.6)
>>> 2016/02/04 18:37:34 [3291/1] STATS | Stage              | Wait | Curr | Done |     Total | ms/op |
>>> 2016/02/04 18:37:34 [3291/1] STATS |  0: GET_FID        |    0 |    0 |    0 |         0 |  0.00 |
>>> 2016/02/04 18:37:34 [3291/1] STATS |  1: GET_INFO_DB    |   81 |    0 | 8130 | 136887093 |  1.63 |
>>> 2016/02/04 18:37:34 [3291/1] STATS |  2: GET_INFO_FS    |    0 |    0 |    0 | 103122533 |  3.70 |
>>> 2016/02/04 18:37:34 [3291/1] STATS |  3: REPORTING      |    0 |    0 |    0 |  90558068 |  0.01 |
>>> 2016/02/04 18:37:34 [3291/1] STATS |  4: PRE_APPLY      |    0 |    0 |    0 | 113372336 |  0.00 |
>>> 2016/02/04 18:37:34 [3291/1] STATS |  5: DB_APPLY       |  952 |    1 |    3 | 113371383 |  1.19 | 63.30% batched (avg batch size: 8.4)
>>> 2016/02/04 18:37:34 [3291/1] STATS |  6: CHGLOG_CLR     |    0 |    0 |    0 | 136878007 |  0.01 |
>>> 2016/02/04 18:37:34 [3291/1] STATS |  7: RM_OLD_ENTRIES |    0 |    0 |    0 |         0 |  0.00 |
>>> 2016/02/04 18:37:34 [3291/1] STATS | DB ops: get=103359170/ins=9377195/upd=81180648/rm=22813326
>>> 2016/02/04 18:37:34 [3291/1] STATS | --- Pipeline stage details ---
>>>
>>>
>
> <snip>
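For reference, the backlog check above can be wrapped into a quick one-shot script; this is a rough sketch only, assuming a single MDT and the "cl1" reader ID shown in the thread:

#!/bin/bash
# Sample the MDT changelog indexes twice, 60 s apart, and report the
# current backlog plus the producer (MDT) and consumer (robinhood) rates.
CL_USERS=(/proc/fs/lustre/mdd/*MDT*/changelog_users)
INTERVAL=60

snapshot() {
    # prints: <current index> <cl1 index>
    awk '/current index/ { c = $3 } /^cl1/ { i = $2 } END { print c, i }' "${CL_USERS[0]}"
}

read cur1 idx1 < <(snapshot)
sleep "$INTERVAL"
read cur2 idx2 < <(snapshot)

echo "backlog now          : $((cur2 - idx2)) records"
echo "producer (MDT) rate  : $(( (cur2 - cur1) / INTERVAL )) records/sec"
echo "consumer (rbh) rate  : $(( (idx2 - idx1) / INTERVAL )) records/sec"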
