Hi Olly, how's it going? I hope the trip back from the GSoC meeting went well (I, on the other hand, am sick again :-/).

I'm writing to ask for your Xapian insight on #535162, which I'm Cc-ing. Basically, maildir-utils uses Xapian to index mails (and SQLite to store their metadata, but that's mostly unrelated). When deleting many messages (but not _that_ many: hundreds out of tens of thousands), database cleanup takes a long time, far longer than re-indexing everything from scratch.

After studying the code a bit, we *think* the issue is that Xapian cleanup is done with one transaction per deletion instead of a single big transaction. That, apparently and reasonably, causes a lot of I/O. A related code snippet is at http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=535162#32 .

Since doing a single transaction would require changing the callback structure in maildir-utils, I would very much welcome a comment from a Xapian expert (I've never programmed with it as a library). Do you think the performance issue can be solved by doing one big transaction? Or do you believe something else is going on? (E.g., should we try to delete several messages at once, if the Xapian API supports that?)

Many thanks in advance!
Cheers and take care.

-- 
Stefano Zacchiroli -o- PhD in Computer Science \ PostDoc @ Univ. Paris 7
z...@{upsilon.cc,pps.jussieu.fr,debian.org} -<>- http://upsilon.cc/zack/
Dietro un grande uomo c'è ..| . |. Et ne m'en veux pas si je te tutoie
sempre uno zaino ...........| ..: |.... Je dis tu à tous ceux que j'aime
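
P.S. To make the "one big transaction" idea concrete, here is a rough sketch, in Python for brevity and not taken from maildir-utils. It assumes the database object exposes begin_transaction, delete_document, commit_transaction and cancel_transaction, which are the method names on Xapian's WritableDatabase. CountingDb and the "Qmsg-N" terms are hypothetical stand-ins so the sketch runs without Xapian installed; whether this actually cures the slowdown is exactly what I'm asking.

```python
def delete_in_one_transaction(db, unique_terms):
    """Delete all documents matching unique_terms inside ONE transaction.

    `db` is assumed to behave like a Xapian WritableDatabase: only the four
    methods called below are used.
    """
    db.begin_transaction()
    try:
        for term in unique_terms:
            # Each call removes the document(s) indexed by this unique term.
            db.delete_document(term)
        # A single commit for the whole batch, instead of one per message.
        db.commit_transaction()
    except Exception:
        # Discard the partial batch and let the caller see the error.
        db.cancel_transaction()
        raise


class CountingDb:
    """Hypothetical stand-in that only records calls (not a real database)."""

    def __init__(self):
        self.deleted = []
        self.commits = 0
        self.cancels = 0

    def begin_transaction(self):
        pass

    def delete_document(self, term):
        self.deleted.append(term)

    def commit_transaction(self):
        self.commits += 1

    def cancel_transaction(self):
        self.cancels += 1
```

With a real database, the callback in maildir-utils would only collect the terms of the messages to drop, and the deletion loop would run once at the end.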

