Re: lmtpd triggering a delivery.db checkpointing (Cyrus 2.3.16)

2016-05-18 Thread Eric Luyten via Info-cyrus
On Wed, May 18, 2016 1:54 am, Bron Gondwana via Info-cyrus wrote:
>
>> What we do at FastMail to make deliver.db not suck is store it on tmpfs.
>> The repack is tons faster.  Sure you lose it over a full server restart,
>> but all you lose is the duplicate suppression.  If you wanted to be really
>> clever about it, you could copy the file during the shutdown script and
>> maybe once per hour otherwise, and copy it back onto tmpfs during startup.
>>
>> duplicate_db_path: /var/run/cyrus/duplicate.db
>>
>
> oh right, 2.3.x doesn't have duplicate_db_path.
>
> I think your choices are either to hack that option into your codebase so
> that you can move the duplicate DB onto tmpfs, live with what you've got
> (possibly by putting /var/imap on fast disk/SSD), or upgrade to a Cyrus that
> isn't 10 years old!


Bron,


I have an upgrade to 2.5 on my plate.
The roughly once-a-day delivery freeze is not critical.


Regards,
Eric.



Cyrus Home Page: http://www.cyrusimap.org/
List Archives/Info: http://lists.andrew.cmu.edu/pipermail/info-cyrus/
To Unsubscribe:
https://lists.andrew.cmu.edu/mailman/listinfo/info-cyrus


Re: lmtpd triggering a delivery.db checkpointing (Cyrus 2.3.16)

2016-05-17 Thread Bron Gondwana via Info-cyrus

> What we do at FastMail to make deliver.db not suck is store it on tmpfs.  The 
> repack is tons faster.  Sure you lose it over a full server restart, but all 
> you lose is the duplicate suppression.  If you wanted to be really clever 
> about it, you could copy the file during the shutdown script and maybe once 
> per hour otherwise, and copy it back onto tmpfs during startup.
> 
> duplicate_db_path: /var/run/cyrus/duplicate.db

oh right, 2.3.x doesn't have duplicate_db_path.

I think your choices are either to hack that option into your codebase so that 
you can move the duplicate DB onto tmpfs, live with what you've got (possibly 
by putting /var/imap on fast disk/SSD), or upgrade to a Cyrus that isn't 10 
years old!

Bron.

-- 
  Bron Gondwana
  br...@fastmail.fm



Re: lmtpd triggering a delivery.db checkpointing (Cyrus 2.3.16)

2016-05-17 Thread Bron Gondwana via Info-cyrus
On Tue, May 17, 2016, at 22:51, Eric Luyten via Info-cyrus wrote:
> On Tue, May 17, 2016 11:45 am, Simon Matter wrote:
> >> Hi,
> >>
> >>
> >>
> >> Several times a month our server freezes up on deliveries and the system
> >> load average shoots up into the hundreds. Things quickly return to normal
> >> between one and two minutes later but this has always puzzled me.
> >>
> >> Today I was watching the system from up close when it happened.
> >>
> >>
> >>
> >> May 17 10:59:14  lmtp[24980]: skiplist: checkpointed
> >> /ssd/cyrs/imap/deliver.db (223062 records, 25295200 bytes) in 119 seconds
> >>
> >>
> >>
> >>
> >> I took a quick dive into the code but could not find where and when lmtpd
> >> is supposed to trigger a delivery.db checkpointing action.
> >
> > Isn't it controlled by 'checkpoint cmd="ctl_cyrusdb -c" period=30' in
> > cyrus.conf?
> 
> 
> Okay, I think I found the code in   lib/cyrusdb_skiplist.c
> 
> We do indeed have the (default) 'checkpoint  cmd="ctl_cyrusdb -c" period=30'
> entry in cyrus.conf, 30 referring to the number of minutes between 
> invocations.
> 
> We prune deliver.db every night at 00:55 with -E 1
> 
> 
> So I guess the phenomenon I witnessed this morning correlates with how busy
> the server is with deliveries.
> A Cyrus Wiki page hints at reducing the number of minutes down from 30.
> 
> "The most common one is that you need to checkpoint the cyrusdb more often.
>  This can be done with a simple ctl_cyrusdb -c If you do this very often,
>  the amount of log that needs to be recovered will be significantly shorter.
>  We recommend doing this at least once every half hour, and more often on
>  busy sites. "
> (http://cyrusimap.web.cmu.edu/mediawiki/index.php/FAQ)

Urgh: 2.3.x.

Sadly, that's not really hooked up nicely and the terminology is really muddy.
Skiplist databases will rewrite themselves as a more compact version when they
reach a certain ratio of ADD records to INORDER records.
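
To make that trigger concrete, here is a minimal sketch of a size-ratio check of the kind described above. It is not the code from cyrusdb_skiplist.c: the function name should_repack, the factor of two and the MIN_REWRITE_BYTES floor are all assumptions chosen for illustration.

```c
#include <stddef.h>

/* Hypothetical floor: files smaller than this are never worth rewriting. */
#define MIN_REWRITE_BYTES 16384

/*
 * Hypothetical helper, not part of the Cyrus API.
 *
 * sorted_bytes: size of the compact, in-order section produced by the
 *               previous rewrite.
 * file_bytes:   current size of the whole file, including every record
 *               appended since that rewrite.
 *
 * Returns 1 once the file has grown past twice the in-order section plus
 * the floor, 0 otherwise.
 */
int should_repack(size_t sorted_bytes, size_t file_bytes)
{
    return file_bytes > 2 * sorted_bytes + MIN_REWRITE_BYTES;
}
```

If a check like this runs at commit time, the rewrite is done inline by whichever process made the commit that crossed the threshold, which would be consistent with the log line above crediting an lmtp process.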

This isn't exposed outside cyrusdb_skiplist.c until 2.5, and it's not hooked
into ctl_cyrusdb's "checkpoint" operation, which just calls a sync on each
database engine:

    case CHECKPOINT:
        r2 = (*(dblist[i].env))->sync();

and then takes a backup of the files with:

        r2 = (*(dblist[i].env))->archive((const char**) archive_files,
                                         backup1);


sync does nothing:

static int mysync(void)
{
    return 0;
}


archive takes copies of the files (without even locking!)

static int myarchive(const char **fnames, const char *dirname)
{
    int r;
    const char **fname;
    char dstname[1024], *dp;
    int length, rest;

    strlcpy(dstname, dirname, sizeof(dstname));
    length = strlen(dstname);
    dp = dstname + length;
    rest = sizeof(dstname) - length;

    /* archive those files specified by the app */
    for (fname = fnames; *fname != NULL; ++fname) {
        syslog(LOG_DEBUG, "archiving database file: %s", *fname);
        strlcpy(dp, strrchr(*fname, '/'), rest);
        r = cyrusdb_copyfile(*fname, dstname);
        if (r) {
            syslog(LOG_ERR,
                   "DBERROR: error archiving database file: %s", *fname);
            return CYRUSDB_IOERROR;
        }
    }

    return 0;
}

...

These are identical right up to 3.0, though they're factored out into
"generic sync" and "generic archive".  So ctl_cyrusdb checkpoint
doesn't actually do much worthwhile work.

At least in 3.0 you can use cyr_dbtool to checkpoint a database
explicitly if you want to:

sudo -u cyrus cyr_dbtool /var/imap/deliver.db skiplist repack

But you're running 2.3.x, so none of my last 6 years of work are
available to you!

---

What we do at FastMail to make deliver.db not suck is store it on tmpfs.  The 
repack is tons faster.  Sure you lose it over a full server restart, but all 
you lose is the duplicate suppression.  If you wanted to be really clever about 
it, you could copy the file during the shutdown script and maybe once per hour 
otherwise, and copy it back onto tmpfs during startup.

duplicate_db_path: /var/run/cyrus/duplicate.db

(where /var/run is a tmpfs on our systems)

Bron.

-- 
  Bron Gondwana
  br...@fastmail.fm



Re: lmtpd triggering a delivery.db checkpointing (Cyrus 2.3.16)

2016-05-17 Thread Eric Luyten via Info-cyrus
On Tue, May 17, 2016 11:45 am, Simon Matter wrote:
>> Hi,
>>
>>
>>
>> Several times a month our server freezes up on deliveries and the system
>> load average shoots up into the hundreds. Things quickly return to normal
>> between one and two minutes later but this has always puzzled me.
>>
>> Today I was watching the system from up close when it happened.
>>
>>
>>
>> May 17 10:59:14  lmtp[24980]: skiplist: checkpointed
>> /ssd/cyrs/imap/deliver.db (223062 records, 25295200 bytes) in 119 seconds
>>
>>
>>
>>
>> I took a quick dive into the code but could not find where and when lmtpd
>> is supposed to trigger a delivery.db checkpointing action.
>
> Isn't it controlled by 'checkpoint cmd="ctl_cyrusdb -c" period=30' in
> cyrus.conf?


Okay, I think I found the code in   lib/cyrusdb_skiplist.c

We do indeed have the (default) 'checkpoint  cmd="ctl_cyrusdb -c" period=30'
entry in cyrus.conf, 30 referring to the number of minutes between invocations.

We prune deliver.db every night at 00:55 with -E 1
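
For reference, the two scheduled jobs mentioned above would sit in the EVENTS section of cyrus.conf roughly as follows. The checkpoint line is the one quoted above. The entry name delprune and the use of cyr_expire for the nightly prune are assumptions based on the stock sample configuration; only the -E 1 argument and the 00:55 start time come from the description.

```
EVENTS {
  # runs every 30 minutes
  checkpoint    cmd="ctl_cyrusdb -c" period=30

  # assumed entry: nightly prune of the duplicate delivery database at 00:55
  delprune      cmd="cyr_expire -E 1" at=0055
}
```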


So I guess the phenomenon I witnessed this morning correlates with how busy
the server is with deliveries.
A Cyrus Wiki page hints at reducing the number of minutes down from 30.

"The most common one is that you need to checkpoint the cyrusdb more often.
 This can be done with a simple ctl_cyrusdb -c If you do this very often,
 the amount of log that needs to be recovered will be significantly shorter.
 We recommend doing this at least once every half hour, and more often on
 busy sites. "
(http://cyrusimap.web.cmu.edu/mediawiki/index.php/FAQ)


Eric.





Re: lmtpd triggering a delivery.db checkpointing (Cyrus 2.3.16)

2016-05-17 Thread Simon Matter via Info-cyrus
> Hi,
>
>
> Several times a month our server freezes up on deliveries and the system
> load average shoots up into the hundreds. Things quickly return to normal
> between one and two minutes later but this has always puzzled me.
>
> Today I was watching the system from up close when it happened.
>
>
> May 17 10:59:14  lmtp[24980]: skiplist: checkpointed
> /ssd/cyrs/imap/deliver.db (223062 records, 25295200 bytes) in 119 seconds
>
>
>
> I took a quick dive into the code but could not find where and when lmtpd
> is supposed to trigger a delivery.db checkpointing action.

Isn't it controlled by 'checkpoint cmd="ctl_cyrusdb -c" period=30' in
cyrus.conf?

Simon

