Re: Pop optimisation

2009-11-26 Thread Cyril Servant
Bron, Ken, did you have time to take a look at this patch?

By the way, here is an updated patch.
-- 
Cyril Servant

2009/11/19 Servant Cyril cyril.serv...@atosorigin.com:
 Hello,

 Here is a patch that optimizes POP. Some background: we host a large number 
 (millions) of mailboxes. Many clients connect over POP every few minutes and 
 issue LIST; if there is mail, they follow with RETR and DELE. Most of the 
 time there is no mail (92498 of 138770 POP connections, about 66.7%, found 
 none). Without the patch, the seen, index, cache and header files are opened 
 on every connection. With the patch, when there is no mail we only read 
 statuscache.db (which is already open).
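
The fast path described above can be illustrated with a minimal sketch. It is not the patch itself and does not use the Cyrus API: the in-memory table standing in for statuscache.db, the names cache_find, cache_store, cache_invalidate, open_mailbox_files and pop_message_count, and the expensive_opens counter are all hypothetical, chosen only to show the idea that a cached "no mail" answer lets a POP session skip the expensive file opens.

```c
#include <stdio.h>
#include <string.h>

#define CACHE_SLOTS 8

/* Hypothetical stand-in for one statuscache.db record. */
struct status_entry {
    char mboxname[64];
    unsigned messages;  /* message count recorded at the last full open */
    int valid;          /* nonzero while the entry may be trusted */
};

static struct status_entry cache[CACHE_SLOTS];

/* Counts how often the slow path ran, so the saving is observable. */
int expensive_opens = 0;

/* Return the slot holding mboxname, or NULL if there is no valid entry. */
static struct status_entry *cache_find(const char *mboxname)
{
    for (int i = 0; i < CACHE_SLOTS; i++) {
        if (cache[i].valid && strcmp(cache[i].mboxname, mboxname) == 0)
            return &cache[i];
    }
    return NULL;
}

/* Record the message count for mboxname; does nothing if the table is full. */
static void cache_store(const char *mboxname, unsigned messages)
{
    struct status_entry *e = cache_find(mboxname);

    /* No existing entry: take the first unused slot, if any. */
    for (int i = 0; e == NULL && i < CACHE_SLOTS; i++) {
        if (!cache[i].valid)
            e = &cache[i];
    }
    if (e == NULL)
        return;
    snprintf(e->mboxname, sizeof(e->mboxname), "%s", mboxname);
    e->messages = messages;
    e->valid = 1;
}

/* Models the invalidation that must happen when mail is delivered. */
void cache_invalidate(const char *mboxname)
{
    struct status_entry *e = cache_find(mboxname);

    if (e != NULL)
        e->valid = 0;
}

/* Slow path: stands in for opening the header, index, cache and seen files.
 * on_disk is the message count a real open would discover. */
static unsigned open_mailbox_files(unsigned on_disk)
{
    expensive_opens++;
    return on_disk;
}

/* Message count for a new POP session. */
unsigned pop_message_count(const char *mboxname, unsigned on_disk)
{
    struct status_entry *e = cache_find(mboxname);
    unsigned messages;

    if (e != NULL && e->messages == 0)
        return 0;  /* fast path: the cache already says "no mail" */

    messages = open_mailbox_files(on_disk);
    cache_store(mboxname, messages);
    return messages;
}
```

In this sketch the first call for a mailbox always takes the slow path and fills the cache; later calls for an empty mailbox return without touching expensive_opens until cache_invalidate is called for it. A mailbox that does contain mail still goes through the slow path, since the messages have to be served anyway.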

 The attached statistics graph shows what happens when we empty 
 statuscache.db (at 10:51): the POP optimization has no effect the first time 
 a client connects (lots of reads), and then, as the same clients reconnect, 
 reads slowly decrease. Without the patch, reads would stay high.

 --
 Cyril Servant


-- 
Cyril
diff -u -r cyrus-imapd-2.3.15.orig/imap/index.c cyrus-imapd-2.3.15/imap/index.c
--- cyrus-imapd-2.3.15.orig/imap/index.c	2009-09-09 03:22:38.0 +0200
+++ cyrus-imapd-2.3.15/imap/index.c	2009-11-19 11:56:57.0 +0100
@@ -5511,3 +5511,116 @@
 	l = n;
 }
 }
+
+int index_statuscache(char *mboxname, char *name, struct auth_state *authstate, unsigned statusitems, struct statuscache_data *scdata)
+{
+int r;
+struct mailbox mailbox;
+int doclose = 0;
+int num_recent = 0;
+int num_unseen = 0;
+int sepchar;
+static struct seq_set seq_set = { NULL, 0, 0, 0 , NULL};
+
+/* Check status cache if possible */
+if (config_getswitch(IMAPOPT_STATUSCACHE)) {
+	/* Do actual lookup of cache item. */
+	r = statuscache_lookup(mboxname, name, statusitems, scdata);
+
+	/* Seen/recent status uses "push" invalidation events from
+	 * seen_db.c.   This avoids needing to open cyrus.header to get
+	 * the mailbox uniqueid to open the seen db and get the
+	 * unseen_mtime and recentuid. */
+
+	if (!r) {
+	syslog(LOG_DEBUG, "statuscache, '%s', '%s', '0x%02x', 'yes'",
+		mboxname, name, statusitems);
+	goto statusdone;
+	}
+
+	syslog(LOG_DEBUG, "statuscache, '%s', '%s', '0x%02x', 'no'",
+		mboxname, name, statusitems);
+}
+
+/* Missing or invalid cache entry */
+r = mailbox_open_header(mboxname, authstate, &mailbox);
+
+if (!r) {
+	doclose = 1;
+	r = mailbox_open_index(&mailbox);
+}
+
+if (!r && mailbox.exists != 0 &&
+	(statusitems & (STATUS_RECENT | STATUS_UNSEEN))) {
+	/* Read \Seen state */
+	struct seen *status_seendb;
+	time_t last_read, last_change = 0;
+	unsigned last_uid;
+	char *last_seenuids;
+
+	r = seen_open(&mailbox,
+		(mailbox.options & OPT_IMAP_SHAREDSEEN) ? "anyone" :
+		name,
+		SEEN_CREATE, &status_seendb);
+
+	if (!r) {
+	r = seen_lockread(status_seendb, &last_read, &last_uid,
+		&last_change, &last_seenuids);
+	seen_close(status_seendb);
+	}
+
+	if (!r) {
+	const char *base;
+	unsigned long len = 0;
+	unsigned msg, uid;
+
+	map_refresh(mailbox.index_fd, 0, &base, &len,
+		mailbox.start_offset +
+		mailbox.exists * mailbox.record_size,
+		"index", mailbox.name);
+
+	seq_set.len = seq_set.mark = 0;
+	index_parse_sequence(last_seenuids, 0, &seq_set);
+
+	for (msg = 0; msg < mailbox.exists; msg++) {
+		uid = ntohl(*((bit32 *)(base + mailbox.start_offset +
+msg * mailbox.record_size +
+OFFSET_UID)));
+		/* Always calculate num_recent,
+		 * even if only need num_unseen... for caching below */
+		if (uid > last_uid) num_recent++;
+		if ((statusitems & STATUS_UNSEEN) &&
+			!index_insequence(uid, &seq_set, 1)) num_unseen++;
+		/* NB: The value of the third argument to 

Re: Pop optimisation

2009-11-26 Thread Bron Gondwana
On Thu, Nov 26, 2009 at 10:38:05AM +0100, Cyril Servant wrote:
 Bron, Ken, did you have time to take a look at this patch?

Sorry - I've been pretty crazy all week getting our new database
servers up and running - been needing new hardware for a while.

I'll make some time to go back through the archives tomorrow and
do some more Cyrus work.

Bron ( P.S. it's worth CCing Ken explicitly when you have a question
   for him, he doesn't always see stuff on the lists )