Greetings Robinhood Developers and Users,

It seems as though Robinhood changelog processing is having trouble keeping
up on our system.  From reading mailing list threads, I believe we should
be expecting higher processing rates.  However, trials of various
suggestions have not yielded much better performance.  Does anyone have
guidance on how to identify exactly where the bottleneck is, or suggestions
on how to speed things up?

Lustre servers are running 2.5.37-ddn1 (which *does* appear to have the
changelog ordering patch in LU-5405).

Robinhood server is on CentOS-6.7, running lustre-2.5.41-ddn3, and using
robinhood-tmpfs-2.5.5-2.lustre2.5.el6.x86_64.  Backend disk is just
internal RAID10 SAS, but the DB pretty much fits in memory.
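
For what it's worth, we sanity-check that claim with something like the
following (the schema name 'robinhood_lfs3' is illustrative; use whatever
your robinhood config points at):

    # Total InnoDB data + index size in GB, to compare against
    # innodb_buffer_pool_size.  Schema name is illustrative.
    mysql -e "SELECT ROUND(SUM(data_length + index_length)/1024/1024/1024, 1) AS size_gb
              FROM information_schema.TABLES
              WHERE table_schema = 'robinhood_lfs3';"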

Here is how we are measuring the "backlog":

[root@lfs-mds-3-1 ~]# cat /proc/fs/lustre/mdd/*MDT*/changelog_users; sleep 60; cat /proc/fs/lustre/mdd/*MDT*/changelog_users
current index: 576107893
ID    index
cl1   566324338
current index: 576145483
ID    index
cl1   566354338

576107893 - 566324338 = 9783555   (backlog at the first sample)
576145483 - 566354338 = 9791145   (backlog 60 seconds later)

So in this case robinhood is falling behind, though occasional periods of
lower user activity let us catch up a bit every once in a while.

(566354338 - 566324338) / 60 = 500

So it looks like we are processing only about 500 records per second.
Strangely, this number is very consistent when re-tested!
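
In case it is useful, here is a small script version of the same measurement
(just a sketch; it assumes a single MDT and the single reader "cl1"):

    #!/bin/bash
    # Sample changelog_users twice, 60s apart, and print the backlog plus
    # the producer/consumer rates.  Assumes one MDT and reader "cl1".
    f=$(ls /proc/fs/lustre/mdd/*MDT*/changelog_users)
    cur1=$(awk '/^current index:/ {print $3}' "$f")
    rb1=$(awk '$1 == "cl1" {print $2}' "$f")
    sleep 60
    cur2=$(awk '/^current index:/ {print $3}' "$f")
    rb2=$(awk '$1 == "cl1" {print $2}' "$f")
    echo "backlog:   $((cur2 - rb2)) records"
    echo "produced:  $(( (cur2 - cur1) / 60 )) records/sec"
    echo "consumed:  $(( (rb2 - rb1) / 60 )) records/sec"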


Here is how the processing rates look from the server side:

[root@sherwood ~]# tac /db/robinhood/log/robinhood_lfs3.log | grep -m 1 -B 35 -A 4 "STATS | ChangeLog reader" | tac
2016/02/04 18:37:34 [3291/1] STATS | ==================== Dumping stats at 2016/02/04 18:37:34 =====================
2016/02/04 18:37:34 [3291/1] STATS | ======== General statistics =========
2016/02/04 18:37:34 [3291/1] STATS | Daemon start time: 2016/02/01 19:37:12
2016/02/04 18:37:34 [3291/1] STATS | Started modules: log_reader
2016/02/04 18:37:34 [3291/1] STATS | ChangeLog reader #0:
2016/02/04 18:37:34 [3291/1] STATS |    fs_name    =   lfs3
2016/02/04 18:37:34 [3291/1] STATS |    mdt_name   =   MDT0000
2016/02/04 18:37:34 [3291/1] STATS |    reader_id  =   cl1
2016/02/04 18:37:34 [3291/1] STATS |    records read        = 208328153
2016/02/04 18:37:34 [3291/1] STATS |    interesting records = 129868456
2016/02/04 18:37:34 [3291/1] STATS |    suppressed records  = 78459697
2016/02/04 18:37:34 [3291/1] STATS |    records pending     = 999
2016/02/04 18:37:34 [3291/1] STATS |    last received            = 2016/02/04 18:37:34
2016/02/04 18:37:34 [3291/1] STATS |    last read record time    = 2016/02/04 17:34:23.559036
2016/02/04 18:37:34 [3291/1] STATS |    last read record id      = 564497302
2016/02/04 18:37:34 [3291/1] STATS |    last pushed record id    = 564496173
2016/02/04 18:37:34 [3291/1] STATS |    last committed record id = 564485779
2016/02/04 18:37:34 [3291/1] STATS |    last cleared record id   = 564481847
2016/02/04 18:37:34 [3291/1] STATS |    read speed               = 492.57 record/sec
2016/02/04 18:37:34 [3291/1] STATS |    processing speed ratio   = 0.33
2016/02/04 18:37:34 [3291/1] STATS |    status                   = busy
2016/02/04 18:37:34 [3291/1] STATS |    ChangeLog stats:
2016/02/04 18:37:34 [3291/1] STATS |    MARK: 0, CREAT: 17776256, MKDIR: 788677, HLINK: 3212, SLINK: 2422002, MKNOD: 0, UNLNK: 25982485
2016/02/04 18:37:34 [3291/1] STATS |    RMDIR: 763162, RENME: 6482894, RNMTO: 0, OPEN: 0, CLOSE: 0, LYOUT: 300, TRUNC: 78659450
2016/02/04 18:37:34 [3291/1] STATS |    SATTR: 75397226, XATTR: 26767, HSM: 0, MTIME: 25510, CTIME: 213, ATIME: 0
2016/02/04 18:37:34 [3291/1] STATS | ==== EntryProcessor Pipeline Stats ===
2016/02/04 18:37:34 [3291/1] STATS | Idle threads: 8
2016/02/04 18:37:34 [3291/1] STATS | Id constraints count: 9168 (hash min=0/max=10/avg=0.6)
2016/02/04 18:37:34 [3291/1] STATS | Name constraints count: 9056 (hash min=0/max=41/avg=0.6)
2016/02/04 18:37:34 [3291/1] STATS | Stage              | Wait | Curr | Done |     Total | ms/op |
2016/02/04 18:37:34 [3291/1] STATS |  0: GET_FID        |    0 |    0 |    0 |         0 |  0.00 |
2016/02/04 18:37:34 [3291/1] STATS |  1: GET_INFO_DB    |   81 |    0 | 8130 | 136887093 |  1.63 |
2016/02/04 18:37:34 [3291/1] STATS |  2: GET_INFO_FS    |    0 |    0 |    0 | 103122533 |  3.70 |
2016/02/04 18:37:34 [3291/1] STATS |  3: REPORTING      |    0 |    0 |    0 |  90558068 |  0.01 |
2016/02/04 18:37:34 [3291/1] STATS |  4: PRE_APPLY      |    0 |    0 |    0 | 113372336 |  0.00 |
2016/02/04 18:37:34 [3291/1] STATS |  5: DB_APPLY       |  952 |    1 |    3 | 113371383 |  1.19 | 63.30% batched (avg batch size: 8.4)
2016/02/04 18:37:34 [3291/1] STATS |  6: CHGLOG_CLR     |    0 |    0 |    0 | 136878007 |  0.01 |
2016/02/04 18:37:34 [3291/1] STATS |  7: RM_OLD_ENTRIES |    0 |    0 |    0 |         0 |  0.00 |
2016/02/04 18:37:34 [3291/1] STATS | DB ops: get=103359170/ins=9377195/upd=81180648/rm=22813326
2016/02/04 18:37:34 [3291/1] STATS | --- Pipeline stage details ---

[root@sherwood ~]# grep " read speed " /db/robinhood/log/robinhood_lfs3.log | tail
2016/02/04 16:22:34 [3291/1] STATS |    read speed               = 1438.38 record/sec
2016/02/04 16:37:34 [3291/1] STATS |    read speed               = 1074.81 record/sec
2016/02/04 16:52:34 [3291/1] STATS |    read speed               = 961.80 record/sec (960.73 incl. idle time)
2016/02/04 17:07:34 [3291/1] STATS |    read speed               = 768.85 record/sec
2016/02/04 17:22:34 [3291/1] STATS |    read speed               = 841.34 record/sec
2016/02/04 17:37:34 [3291/1] STATS |    read speed               = 1029.85 record/sec
2016/02/04 17:52:34 [3291/1] STATS |    read speed               = 948.94 record/sec (947.88 incl. idle time)
2016/02/04 18:07:34 [3291/1] STATS |    read speed               = 750.86 record/sec
2016/02/04 18:22:34 [3291/1] STATS |    read speed               = 778.77 record/sec
2016/02/04 18:37:34 [3291/1] STATS |    read speed               = 492.57 record/sec
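
For a number that is less noisy than the individual 15-minute samples above,
something like this averages the recent ones:

    # Average the "read speed" figures over the last ~24h of samples
    # (stats are dumped every 15 minutes, so 96 samples).
    grep " read speed " /db/robinhood/log/robinhood_lfs3.log | tail -n 96 \
        | sed -e 's/.*= *//' -e 's/ .*//' \
        | awk '{ sum += $1; n++ } END { if (n) printf "%.1f rec/s avg over %d samples\n", sum/n, n }'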

NOTE: we can *temporarily* improve the GET_INFO_FS timings by stopping
robinhood, unmounting and remounting the Lustre client, and restarting
robinhood (a sketch of that sequence follows the output below)...

[root@sherwood ~]# grep ": GET_INFO_FS " /db/robinhood/log/robinhood_lfs3.log | tail -n 6
2016/02/04 19:07:34 [3291/1] STATS |  2: GET_INFO_FS    |    0 |    0 |    0 | 103164823 |  3.70 |
2016/02/04 19:22:34 [3291/1] STATS |  2: GET_INFO_FS    |    0 |    0 |    0 | 103185647 |  3.70 |
2016/02/04 19:30:03 [3291/2] STATS |  2: GET_INFO_FS    |    0 |    0 |    0 | 103197626 |  3.70 |
2016/02/04 19:49:21 [38629/1] STATS |  2: GET_INFO_FS    |    0 |    0 |    0 |     18409 |  1.87 |
2016/02/04 20:04:21 [38629/1] STATS |  2: GET_INFO_FS    |    0 |    0 |    0 |     34587 |  1.96 |
2016/02/04 20:19:21 [38629/1] STATS |  2: GET_INFO_FS    |    0 |    0 |    0 |     53473 |  2.19 |
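
For reference, that stop / unmount / remount / restart sequence is roughly
the following (the mount point and service name are from our setup, so treat
it as a sketch):

    service robinhood stop       # stop the changelog reader
    umount /mnt/lfs3             # dropping the mount drops the client caches
    mount -t lustre 10.179.16.132@o2ib:10.179.16.131@o2ib:/lfs3 /mnt/lfs3
    service robinhood start     # resumes from the last acknowledged record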


On the other hand, when using an interactive scan, we get more than 3x
better processing rates:

[root@sherwood ~]# rbh-report -a -f lfs3
Using config file '/etc/robinhood.d/tmpfs/lfs3.conf'.

Filesystem scan activity:

    Current scan interval:   7.0h

    Previous filesystem scan:
            start:           2016/02/01 23:28:26
            duration:        1h 12min 41s

    Last filesystem scan:
            status:          done
            start:           2016/02/02 23:28:01
            end:             2016/02/03 11:58:56
            duration:        12h 30min 55s

         Statistics:
            entries scanned: 135773284
            errors:          15
            timeouts:        0
            # threads:       8
            average speed:   3018.87 entries/sec

Changelog stats:

        Last read record id:      564895829
        Last read record time:    2016/02/04 17:37:14.985039
        Last receive time:        2016/02/04 18:52:34
        Last committed record id: 564886454
        Changelog stats:
                 type            total (diff) (rate)
                 MARK:               0
                CREAT:        48660321 (+10803) (12.00/sec)
                MKDIR:         2040529 (+354) (0.39/sec)
                HLINK:           14134
                SLINK:         6563209 (+1327) (1.47/sec)
                MKNOD:               0
                UNLNK:        64958376 (+340350) (378.17/sec)
                RMDIR:         2028347 (+814) (0.90/sec)
                RENME:        17569210 (+3426) (3.81/sec)
                RNMTO:               0
                 OPEN:               0
                CLOSE:               0
                LYOUT:             553
                TRUNC:       179191635 (+33158) (36.84/sec)
                SATTR:       240069546 (+8296) (9.22/sec)
                XATTR:           29642
                  HSM:               0
                MTIME:           59454 (+1) (0.00/sec)
                CTIME:             371
                ATIME:               0

Storage usage has never been checked

No purge was performed on this filesystem


The filesystem in question is:

[root@sherwood ~]# rbh-report -i -f lfs3
Using config file '/etc/robinhood.d/tmpfs/lfs3.conf'.
type    ,      count,     volume,   avg_size
symlink ,    4819982,  399.90 MB,         87
dir     ,    5413496,   29.48 GB,    5.71 KB
file    ,  125761404,    1.04 PB,    8.92 MB
sock    ,          2,          0,          0

Total: 135994884 entries, 1175890771195111 bytes (1.04 PB)

[root@sherwood ~]# df -h /mnt/lfs3
Filesystem            Size  Used Avail Use% Mounted on
10.179.16.132@o2ib:10.179.16.131@o2ib:/lfs3
                      2.1P  1.1P  970T  53% /mnt/lfs3


Some of our current tuning attempts, based on previous recommendations on
this list, are:

* sysctl vm.vfs_cache_pressure = 150

* MySQL config (/etc/my.cnf):
    innodb_thread_concurrency=32
    innodb_max_dirty_pages_pct=15
    max_connections=256
    query_cache_size=8M
    thread_cache_size=4
    innodb_buffer_pool_size=240G (~85% of actual memory)

* Robinhood config:
    batch_ack_count = 10000 ;
    force_polling    = ON ;
    polling_interval = 1s ;
    queue_max_size   = 1000 ;   # (could be 2x larger?)
    queue_max_age    = 5s ;       # (could be 2x larger?)
    queue_check_interval = 1s ;

* Lustre parameters (see the sketch after this list for how we apply them):
    llite.*.statahead_max=4
    mdc.*.max_rpcs_in_flight=64
    ldlm.namespaces.*osc*.lru_max_age=1200
    ldlm.namespaces.*osc*.lru_size=100

* Dropping inode cache daily, with:
    echo 2 > /proc/sys/vm/drop_caches   # (could be hourly?)
    # (would it be better to run "lru_size=clear" hourly?)

* Tuning the ko2iblnd module parameter "peer_credits" has NOT yet been
attempted (we are unclear on what a good value would be):
    [root@lfs-mds-3-1 ~]# cat /proc/sys/lnet/peers | egrep "^(nid|10.179.17.41@o2ib)"
    nid                      refs state  last   max   rtr   min    tx   min queue
    10.179.17.41@o2ib           1    NA    -1     8     8     8     8   -91      0
    [root@sherwood ~]# cat /proc/sys/lnet/peers | egrep "^(nid|10.179.16.131@o2ib)"
    nid                      refs state  last   max   rtr   min    tx   min queue
    10.179.16.131@o2ib          1    NA    -1     8     8     8     8 -1188      0
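
For completeness, this is roughly how we apply the Lustre parameters listed
above, plus what an hourly "lru_size=clear" alternative to drop_caches might
look like (a sketch against our setup, not a recommendation):

    # Client-side tunables on the robinhood node (values as listed above).
    lctl set_param llite.*.statahead_max=4
    lctl set_param mdc.*.max_rpcs_in_flight=64
    lctl set_param ldlm.namespaces.*osc*.lru_max_age=1200
    lctl set_param ldlm.namespaces.*osc*.lru_size=100

    # Possible hourly alternative to "echo 2 > /proc/sys/vm/drop_caches",
    # e.g. dropped in as /etc/cron.d/lustre-lru-clear (hypothetical file name):
    0 * * * * root lctl set_param ldlm.namespaces.*.lru_size=clear >/dev/null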


Thanks for any suggestions!!!

-Nathan