Re: lmtpd triggering a delivery.db checkpointing (Cyrus 2.3.16)
On Wed, May 18, 2016 1:54 am, Bron Gondwana via Info-cyrus wrote:
>
>> What we do at FastMail to make deliver.db not suck is store it on tmpfs.
>> The repack is tons faster. Sure you lose it over a full server restart,
>> but all you lose is the duplicate suppression. If you wanted to be really
>> clever about it, you could copy the file during the shutdown script and
>> maybe once per hour otherwise, and copy it back onto tmpfs during startup.
>>
>> duplicate_db_path: /var/run/cyrus/duplicate.db
>>
>
> oh right, 2.3.x doesn't have duplicate_db_path.
>
> I think your choices are either to hack that option into your codebase so
> that you can move the duplicate DB onto tmpfs, live with what you've got
> (possibly by putting /var/imap on fast disk/SSD), or upgrade to a Cyrus that
> isn't 10 years old!

Bron,

I have an upgrade to 2.5 on my plate.
The approximately-once-a-day delivery freeze is not critical.

Regards,
Eric.

Cyrus Home Page: http://www.cyrusimap.org/
List Archives/Info: http://lists.andrew.cmu.edu/pipermail/info-cyrus/
To Unsubscribe: https://lists.andrew.cmu.edu/mailman/listinfo/info-cyrus
Re: lmtpd triggering a delivery.db checkpointing (Cyrus 2.3.16)
> What we do at FastMail to make deliver.db not suck is store it on tmpfs. The
> repack is tons faster. Sure you lose it over a full server restart, but all
> you lose is the duplicate suppression. If you wanted to be really clever
> about it, you could copy the file during the shutdown script and maybe once
> per hour otherwise, and copy it back onto tmpfs during startup.
>
> duplicate_db_path: /var/run/cyrus/duplicate.db

oh right, 2.3.x doesn't have duplicate_db_path.

I think your choices are either to hack that option into your codebase so
that you can move the duplicate DB onto tmpfs, live with what you've got
(possibly by putting /var/imap on fast disk/SSD), or upgrade to a Cyrus that
isn't 10 years old!

Bron.

-- 
Bron Gondwana
br...@fastmail.fm
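The save-and-restore idea quoted above (copy the file off tmpfs at shutdown and roughly hourly, copy it back at startup) could be sketched in C along the following lines. This is only an illustration, not Cyrus code: the function name `snapshot_file` and the `.NEW` suffix are made up here, and, like the archive routine discussed elsewhere in this thread, it takes no database lock, so it assumes Cyrus is stopped or that a slightly stale copy is acceptable.

```c
#include <assert.h>
#include <stdio.h>

/*
 * Hypothetical helper: copy src to dst by writing "<dst>.NEW" first and
 * then renaming it over dst, so that an interrupted copy never leaves a
 * half-written snapshot under the final name.
 * Returns 0 on success, -1 on any error.
 */
int snapshot_file(const char *src, const char *dst)
{
    char tmp[1024];
    char buf[65536];
    size_t n;
    FILE *in, *out;
    int len, ok = 1;

    assert(src != NULL && dst != NULL);

    /* Build the temporary name, refusing paths that would be truncated. */
    len = snprintf(tmp, sizeof(tmp), "%s.NEW", dst);
    if (len < 0 || (size_t)len >= sizeof(tmp))
        return -1;

    in = fopen(src, "rb");
    if (in == NULL)
        return -1;

    out = fopen(tmp, "wb");
    if (out == NULL) {
        fclose(in);
        return -1;
    }

    /* Plain buffered copy; stop at the first short write. */
    while ((n = fread(buf, 1, sizeof(buf), in)) > 0) {
        if (fwrite(buf, 1, n, out) != n) {
            ok = 0;
            break;
        }
    }
    if (ferror(in))
        ok = 0;

    fclose(in);
    if (fclose(out) != 0)   /* fclose flushes; a failure means lost data */
        ok = 0;

    /* Only publish the snapshot if every step succeeded. */
    if (ok && rename(tmp, dst) != 0)
        ok = 0;
    if (!ok)
        remove(tmp);

    return ok ? 0 : -1;
}
```

A shutdown hook would call it with the tmpfs path as `src` and a path on persistent disk as `dst`, and a startup hook would call it with the two swapped. The rename step only replaces the old snapshot once the new one has been written completely.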
Re: lmtpd triggering a delivery.db checkpointing (Cyrus 2.3.16)
On Tue, May 17, 2016, at 22:51, Eric Luyten via Info-cyrus wrote:
> On Tue, May 17, 2016 11:45 am, Simon Matter wrote:
> >> Hi,
> >>
> >> Several times a month our server freezes up on deliveries and the system
> >> load average shoots up into the hundreds. Things quickly return to normal
> >> between one and two minutes later but this has always puzzled me.
> >>
> >> Today I was watching the system from up close when it happened.
> >>
> >> May 17 10:59:14 lmtp[24980]: skiplist: checkpointed
> >> /ssd/cyrs/imap/deliver.db (223062 records, 25295200 bytes) in 119 seconds
> >>
> >> I took a quick dive into the code but could not find where and when lmtpd
> >> is supposed to trigger a deliver.db checkpointing action.
> >
> > Isn't it controlled by 'checkpoint cmd="ctl_cyrusdb -c" period=30' in
> > cyrus.conf?
>
> Okay, I think I found the code in lib/cyrusdb_skiplist.c
>
> We do indeed have the (default) 'checkpoint cmd="ctl_cyrusdb -c" period=30'
> entry in cyrus.conf, 30 referring to the number of minutes between
> invocations.
>
> We prune deliver.db every night at 00:55 with -E 1
>
> So I guess the phenomenon I witnessed this morning correlates with server
> busyness in the area of deliveries.
> A Cyrus Wiki page hints at reducing the number of minutes down from 30.
>
> "The most common one is that you need to checkpoint the cyrusdb more often.
> This can be done with a simple ctl_cyrusdb -c. If you do this very often,
> the amount of log that needs to be recovered will be significantly shorter.
> We recommend doing this at least once every half hour, and more often on
> busy sites."
> (http://cyrusimap.web.cmu.edu/mediawiki/index.php/FAQ)

Urgh: 2.3.x. Sadly, that's not really hooked up nicely and the terminology
is really muddy.

Skiplist databases will rewrite themselves as a more compact version when
they reach a certain ratio of ADD records to INORDER records. This isn't
exposed outside cyrusdb_skiplist.c until 2.5, and it's not hooked into
ctl_cyrusdb's "checkpoint" operation, which just calls a sync on each
database engine:

    case CHECKPOINT:
        r2 = (*(dblist[i].env))->sync();

and then takes a backup of the files with:

    r2 = (*(dblist[i].env))->archive((const char**) archive_files, backup1);

sync does nothing:

    static int mysync(void)
    {
        return 0;
    }

archive takes copies of the files (without even locking!)

    static int myarchive(const char **fnames, const char *dirname)
    {
        int r;
        const char **fname;
        char dstname[1024], *dp;
        int length, rest;

        strlcpy(dstname, dirname, sizeof(dstname));
        length = strlen(dstname);
        dp = dstname + length;
        rest = sizeof(dstname) - length;

        /* archive those files specified by the app */
        for (fname = fnames; *fname != NULL; ++fname) {
            syslog(LOG_DEBUG, "archiving database file: %s", *fname);
            strlcpy(dp, strrchr(*fname, '/'), rest);
            r = cyrusdb_copyfile(*fname, dstname);
            if (r) {
                syslog(LOG_ERR,
                       "DBERROR: error archiving database file: %s", *fname);
                return CYRUSDB_IOERROR;
            }
        }

        return 0;
    }

...

These are identical right up to 3.0, though they're factored out into
"generic sync" and "generic archive".

So ctl_cyrusdb checkpoint doesn't actually do much worthwhile work. At
least in 3.0 you can use cyr_dbtool to checkpoint a database explicitly if
you want to:

    sudo -u cyrus cyr_dbtool /var/imap/deliver.db skiplist repack

But you're running 2.3.x, so none of my last 6 years of work are available
to you!

---

What we do at FastMail to make deliver.db not suck is store it on tmpfs.
The repack is tons faster. Sure you lose it over a full server restart, but
all you lose is the duplicate suppression. If you wanted to be really
clever about it, you could copy the file during the shutdown script and
maybe once per hour otherwise, and copy it back onto tmpfs during startup.

duplicate_db_path: /var/run/cyrus/duplicate.db

(where /var/run is a tmpfs on our systems)

Bron.

-- 
Bron Gondwana
br...@fastmail.fm
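To make the "certain ratio of ADD records to INORDER records" remark above concrete, here is a minimal sketch of what such a trigger could look like. It is not the code from cyrusdb_skiplist.c: the function name and both constants are assumptions picked purely for illustration. The log line at the top of the thread shows the rewrite being logged by an lmtp process, which suggests a check of this kind runs inside whichever process happens to be writing to the database, rather than in ctl_cyrusdb.

```c
/*
 * Illustrative tuning values only; the real thresholds in Cyrus are not
 * given in this thread.
 */
#define MIN_ADD_RECORDS 1024UL  /* never bother repacking tiny logs */
#define MAX_ADD_RATIO   2UL     /* tolerated ADD records per INORDER record */

/*
 * Hypothetical decision function.
 *
 * inorder_records: records in the compact, sorted part of the file that
 *                  the last rewrite produced.
 * add_records:     records appended to the log since that rewrite.
 *
 * Returns 1 if the appended log has outgrown the sorted part by more than
 * MAX_ADD_RATIO (using integer division, so the comparison cannot
 * overflow), 0 otherwise.
 */
int should_repack(unsigned long inorder_records, unsigned long add_records)
{
    if (add_records < MIN_ADD_RECORDS)
        return 0;

    return add_records / MAX_ADD_RATIO > inorder_records;
}
```

With a rule like this, a nightly prune that shrinks the sorted part makes the threshold easier to reach the next day, so the rewrite lands on whichever delivery pushes the count over the limit, and every other writer waits while the file is rewritten.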
Re: lmtpd triggering a delivery.db checkpointing (Cyrus 2.3.16)
On Tue, May 17, 2016 11:45 am, Simon Matter wrote:
>> Hi,
>>
>> Several times a month our server freezes up on deliveries and the system
>> load average shoots up into the hundreds. Things quickly return to normal
>> between one and two minutes later but this has always puzzled me.
>>
>> Today I was watching the system from up close when it happened.
>>
>> May 17 10:59:14 lmtp[24980]: skiplist: checkpointed
>> /ssd/cyrs/imap/deliver.db (223062 records, 25295200 bytes) in 119 seconds
>>
>> I took a quick dive into the code but could not find where and when lmtpd
>> is supposed to trigger a deliver.db checkpointing action.
>
> Isn't it controlled by 'checkpoint cmd="ctl_cyrusdb -c" period=30' in
> cyrus.conf?

Okay, I think I found the code in lib/cyrusdb_skiplist.c

We do indeed have the (default) 'checkpoint cmd="ctl_cyrusdb -c" period=30'
entry in cyrus.conf, 30 referring to the number of minutes between
invocations.

We prune deliver.db every night at 00:55 with -E 1

So I guess the phenomenon I witnessed this morning correlates with server
busyness in the area of deliveries.
A Cyrus Wiki page hints at reducing the number of minutes down from 30.

"The most common one is that you need to checkpoint the cyrusdb more often.
This can be done with a simple ctl_cyrusdb -c. If you do this very often,
the amount of log that needs to be recovered will be significantly shorter.
We recommend doing this at least once every half hour, and more often on
busy sites."
(http://cyrusimap.web.cmu.edu/mediawiki/index.php/FAQ)

Eric.
Re: lmtpd triggering a delivery.db checkpointing (Cyrus 2.3.16)
> Hi,
>
> Several times a month our server freezes up on deliveries and the system
> load average shoots up into the hundreds. Things quickly return to normal
> between one and two minutes later but this has always puzzled me.
>
> Today I was watching the system from up close when it happened.
>
> May 17 10:59:14 lmtp[24980]: skiplist: checkpointed
> /ssd/cyrs/imap/deliver.db (223062 records, 25295200 bytes) in 119 seconds
>
> I took a quick dive into the code but could not find where and when lmtpd
> is supposed to trigger a deliver.db checkpointing action.

Isn't it controlled by 'checkpoint cmd="ctl_cyrusdb -c" period=30' in
cyrus.conf?

Simon