Re: Pop optimisation
Bron, Ken, did you have time to take a look at this patch? By the way, here is an updated patch.

--
Cyril Servant

2009/11/19 Servant Cyril cyril.serv...@atosorigin.com:
> Hello,
>
> Here is a patch for optimizing POP. Let me explain:
>
> We have millions of mailboxes here. Many users connect over POP every few
> minutes and issue LIST; if there is mail, they then issue RETR and DELE.
> Most of the time there is no mail: out of 138770 POP connections, 92498
> (66.6%) found no mail.
>
> Without the patch, the seen, index, cache and header files are opened.
> With the patch, only statuscache.db (which is already open) is read when
> there is no mail.
>
> The attached stats graph shows what happens when we empty statuscache.db
> (at 10:51): the optimization has no effect the first time a client
> connects over POP (lots of reads), and then, as the same clients connect
> again, reads slowly decrease. Without the patch, reads would stay high.
>
> --
> Cyril Servant

This e-mail and the documents attached are confidential and intended solely for the addressee; it may also be privileged. If you receive this e-mail in error, please notify the sender immediately and destroy it. As its integrity cannot be secured on the Internet, the Atos Origin group liability cannot be triggered for the message content.
Although the sender endeavours to maintain a computer virus-free network, the sender does not warrant that this transmission is virus-free and will not be liable for any damages resulting from any virus transmitted.

--
Cyril

diff -u -r cyrus-imapd-2.3.15.orig/imap/index.c cyrus-imapd-2.3.15/imap/index.c
--- cyrus-imapd-2.3.15.orig/imap/index.c 2009-09-09 03:22:38.0 +0200
+++ cyrus-imapd-2.3.15/imap/index.c 2009-11-19 11:56:57.0 +0100
@@ -5511,3 +5511,116 @@
         l = n;
     }
 }
+
+int index_statuscache(char *mboxname, char *name, struct auth_state *authstate, unsigned statusitems, struct statuscache_data *scdata)
+{
+    int r;
+    struct mailbox mailbox;
+    int doclose = 0;
+    int num_recent = 0;
+    int num_unseen = 0;
+    int sepchar;
+    static struct seq_set seq_set = { NULL, 0, 0, 0 , NULL};
+
+    /* Check status cache if possible */
+    if (config_getswitch(IMAPOPT_STATUSCACHE)) {
+        /* Do actual lookup of cache item. */
+        r = statuscache_lookup(mboxname, name, statusitems, scdata);
+
+        /* Seen/recent status uses push invalidation events from
+         * seen_db.c. This avoids needing to open cyrus.header to get
+         * the mailbox uniqueid to open the seen db and get the
+         * unseen_mtime and recentuid. */
+
+        if (!r) {
+            syslog(LOG_DEBUG, "statuscache, '%s', '%s', '0x%02x', 'yes'",
+                   mboxname, name, statusitems);
+            goto statusdone;
+        }
+
+        syslog(LOG_DEBUG, "statuscache, '%s', '%s', '0x%02x', 'no'",
+               mboxname, name, statusitems);
+    }
+
+    /* Missing or invalid cache entry */
+    r = mailbox_open_header(mboxname, authstate, &mailbox);
+
+    if (!r) {
+        doclose = 1;
+        r = mailbox_open_index(&mailbox);
+    }
+
+    if (!r && mailbox.exists != 0 &&
+        (statusitems & (STATUS_RECENT | STATUS_UNSEEN))) {
+        /* Read \Seen state */
+        struct seen *status_seendb;
+        time_t last_read, last_change = 0;
+        unsigned last_uid;
+        char *last_seenuids;
+
+        r = seen_open(&mailbox,
+                      (mailbox.options & OPT_IMAP_SHAREDSEEN) ? "anyone" :
+                      name,
+                      SEEN_CREATE, &status_seendb);
+
+        if (!r) {
+            r = seen_lockread(status_seendb, &last_read, &last_uid,
+                              &last_change, &last_seenuids);
+            seen_close(status_seendb);
+        }
+
+        if (!r) {
+            const char *base;
+            unsigned long len = 0;
+            unsigned msg, uid;
+
+            map_refresh(mailbox.index_fd, 0, &base, &len,
+                        mailbox.start_offset +
+                        mailbox.exists * mailbox.record_size,
+                        "index", mailbox.name);
+
+            seq_set.len = seq_set.mark = 0;
+            index_parse_sequence(last_seenuids, 0, &seq_set);
+
+            for (msg = 0; msg < mailbox.exists; msg++) {
+                uid = ntohl(*((bit32 *)(base + mailbox.start_offset +
+                                        msg * mailbox.record_size +
+                                        OFFSET_UID)));
+                /* Always calculate num_recent,
+                 * even if only need num_unseen... for caching below */
+                if (uid > last_uid) num_recent++;
+                if ((statusitems & STATUS_UNSEEN) &&
+                    !index_insequence(uid, &seq_set, 1)) num_unseen++;
+                /* NB: The value of the third argument to
Re: Pop optimisation
On Thu, Nov 26, 2009 at 10:38:05AM +0100, Cyril Servant wrote:
> Bron, Ken, did you have time to take a look at this patch?

Sorry - I've been pretty crazy all week getting our new database servers up and running; we've been needing new hardware for a while. I'll make some time to go back through the archives tomorrow and do some more Cyrus work.

Bron (P.S. it's worth CCing Ken explicitly when you have a question for him; he doesn't always see stuff on the lists)
Pop optimisation
Hello,

Here is a patch for optimizing POP. Let me explain:

We have millions of mailboxes here. Many users connect over POP every few minutes and issue LIST; if there is mail, they then issue RETR and DELE. Most of the time there is no mail: out of 138770 POP connections, 92498 (66.6%) found no mail.

Without the patch, the seen, index, cache and header files are opened. With the patch, only statuscache.db (which is already open) is read when there is no mail.

The attached stats graph shows what happens when we empty statuscache.db (at 10:51): the optimization has no effect the first time a client connects over POP (lots of reads), and then, as the same clients connect again, reads slowly decrease. Without the patch, reads would stay high.

--
Cyril Servant
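The idea described above (answer from the status cache when the mailbox is known to be empty, and open the mailbox files only otherwise) can be sketched in a few lines of C. This is only an illustration under stated assumptions, not the patch itself: the in-memory table, `pop_message_count`, `cache_invalidate`, and the `actual_messages` parameter that simulates on-disk state are all hypothetical names and do not exist in Cyrus.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Everything below is a hypothetical stand-in, not a Cyrus API. */
#define CACHE_SLOTS 8
#define NAME_MAX_LEN 64

struct status_entry {
    char mailbox[NAME_MAX_LEN]; /* cache key */
    unsigned messages;          /* cached message count */
    int valid;                  /* nonzero while the entry is usable */
};

static struct status_entry cache[CACHE_SLOTS];
static unsigned slow_opens; /* number of times the slow path ran */

/* Return the valid cache entry for a mailbox, or NULL on a miss. */
static struct status_entry *cache_lookup(const char *mailbox)
{
    for (size_t i = 0; i < CACHE_SLOTS; i++) {
        if (cache[i].valid && strcmp(cache[i].mailbox, mailbox) == 0)
            return &cache[i];
    }
    return NULL;
}

/* Record a message count; silently skips caching if the table is full. */
static void cache_store(const char *mailbox, unsigned messages)
{
    struct status_entry *e = cache_lookup(mailbox);
    for (size_t i = 0; !e && i < CACHE_SLOTS; i++) {
        if (!cache[i].valid)
            e = &cache[i];
    }
    if (!e)
        return;
    strcpy(e->mailbox, mailbox); /* length already checked by the caller */
    e->messages = messages;
    e->valid = 1;
}

/* Drop an entry, as a delivery or expunge would have to do. */
static void cache_invalidate(const char *mailbox)
{
    struct status_entry *e = cache_lookup(mailbox);
    if (e)
        e->valid = 0;
}

/* Stand-in for opening the header, index, cache and seen files. */
static unsigned open_mailbox_and_count(unsigned actual_messages)
{
    slow_opens++;
    return actual_messages;
}

/* Message count for a POP session; actual_messages simulates disk state. */
static unsigned pop_message_count(const char *mailbox, unsigned actual_messages)
{
    assert(mailbox != NULL && strlen(mailbox) < NAME_MAX_LEN);

    struct status_entry *e = cache_lookup(mailbox);
    if (e && e->messages == 0)
        return 0; /* fast path: known-empty mailbox, no files opened */

    unsigned n = open_mailbox_and_count(actual_messages);
    cache_store(mailbox, n);
    return n;
}
```

In this sketch, two consecutive calls for the same empty mailbox run the slow path only once, which mirrors the slowly decreasing reads after statuscache.db was emptied. The fast path is only safe if every delivery invalidates the entry, which is what `cache_invalidate` stands in for here.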
diff -u -r cyrus-imapd-2.3.15.orig/imap/index.c cyrus-imapd-2.3.15/imap/index.c
--- cyrus-imapd-2.3.15.orig/imap/index.c 2009-09-09 03:22:38.0 +0200
+++ cyrus-imapd-2.3.15/imap/index.c 2009-11-19 11:56:57.0 +0100
@@ -5511,3 +5511,116 @@
         l = n;
     }
 }
+
+int index_statuscache(char *mboxname, char *name, struct auth_state *authstate, unsigned statusitems, struct statuscache_data *scdata)
+{
+    int r;
+    struct mailbox mailbox;
+    int doclose = 0;
+    int num_recent = 0;
+    int num_unseen = 0;
+    int sepchar;
+    static struct seq_set seq_set = { NULL, 0, 0, 0 , NULL};
+
+    /* Check status cache if possible */
+    if (config_getswitch(IMAPOPT_STATUSCACHE)) {
+        /* Do actual lookup of cache item. */
+        r = statuscache_lookup(mboxname, name, statusitems, scdata);
+
+        /* Seen/recent status uses push invalidation events from
+         * seen_db.c. This avoids needing to open cyrus.header to get
+         * the mailbox uniqueid to open the seen db and get the
+         * unseen_mtime and recentuid. */
+
+        if (!r) {
+            syslog(LOG_DEBUG, "statuscache, '%s', '%s', '0x%02x', 'yes'",
+                   mboxname, name, statusitems);
+            goto statusdone;
+        }
+
+        syslog(LOG_DEBUG, "statuscache, '%s', '%s', '0x%02x', 'no'",
+               mboxname, name, statusitems);
+    }
+
+    /* Missing or invalid cache entry */
+    r = mailbox_open_header(mboxname, authstate, &mailbox);
+
+    if (!r) {
+        doclose = 1;
+        r = mailbox_open_index(&mailbox);
+    }
+
+    if (!r && mailbox.exists != 0 &&
+        (statusitems & (STATUS_RECENT | STATUS_UNSEEN))) {
+        /* Read \Seen state */
+        struct seen *status_seendb;
+        time_t last_read, last_change = 0;
+        unsigned last_uid;
+        char *last_seenuids;
+
+        r = seen_open(&mailbox,
+                      (mailbox.options & OPT_IMAP_SHAREDSEEN) ? "anyone" :
+                      name,
+                      SEEN_CREATE, &status_seendb);
+
+        if (!r) {
+            r = seen_lockread(status_seendb, &last_read, &last_uid,
+                              &last_change, &last_seenuids);
+            seen_close(status_seendb);
+        }
+
+        if (!r) {
+            const char *base;
+            unsigned long len = 0;
+            unsigned msg, uid;
+
+            map_refresh(mailbox.index_fd, 0, &base, &len,
+                        mailbox.start_offset +
+                        mailbox.exists * mailbox.record_size,
+                        "index", mailbox.name);
+
+            seq_set.len = seq_set.mark = 0;
+            index_parse_sequence(last_seenuids, 0, &seq_set);
+
+            for (msg = 0; msg < mailbox.exists; msg++) {
+                uid = ntohl(*((bit32 *)(base + mailbox.start_offset +
+                                        msg * mailbox.record_size +
+                                        OFFSET_UID)));
+                /* Always calculate num_recent,
+                 * even if only need num_unseen... for caching below */
+                if (uid > last_uid) num_recent++;
+                if ((statusitems & STATUS_UNSEEN) &&
+                    !index_insequence(uid, &seq_set, 1)) num_unseen++;
+                /* NB: The value of the third argument to index_insequence()
+                 * above does not matter. */
+            }
+            map_free(&base, &len);
+            free(last_seenuids);
+        }
+    }
+
+    if (!r) {
+        /* We always have message count, uidnext,
+         * uidvalidity, and highestmodseq for