Hi Olly, how's it going? I hope that the trip back from the GSoC meeting
went well (whereas I'm sick again :-/).

I'm writing to you to ask for a Xapian-related insight on #535162, which
I'm Cc-ing. Basically, maildir-utils uses Xapian to index mails (and
SQLite to store their metadata, but that's mostly unrelated).

When deleting a lot of messages (but not _that_ many: hundreds out of
tens of thousands), database cleanup takes a long time, far longer than
re-indexing everything from scratch. Having studied the code a bit, we
*think* the issue is that Xapian cleanup is done with one transaction
per deletion, instead of a single big transaction. That, apparently and
reasonably, causes a lot of I/O. A related code snippet is at
http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=535162#32 .

Since doing a single transaction will require changing the callback
structure in maildir-utils, I would very much welcome a comment from a
Xapian expert (I've never programmed with it as a library). Do you
think that the performance issue can be solved by doing one big
transaction? Or do you believe something else is going on? (E.g.,
should we try to delete several messages at once, if that's
supported by the Xapian API?)
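
To make the question concrete, here is a toy sketch of the two cleanup
strategies. It is plain Python, not Xapian's API and not the actual
maildir-utils code: ToyIndex, cleanup_per_deletion and cleanup_batched
are hypothetical names, and a "commit" merely stands in for whatever
flush to disk a real transaction causes.

```python
class ToyIndex:
    """Hypothetical stand-in for an index; it only counts commits."""

    def __init__(self, doc_ids):
        self.docs = set(doc_ids)
        self.pending = set()   # deletions not yet committed
        self.commits = 0       # each commit models one flush to disk

    def delete(self, doc_id):
        self.pending.add(doc_id)

    def commit(self):
        self.docs -= self.pending
        self.pending.clear()
        self.commits += 1


def cleanup_per_deletion(index, stale):
    # What we suspect happens today: one transaction per deleted message.
    for doc_id in stale:
        index.delete(doc_id)
        index.commit()


def cleanup_batched(index, stale):
    # The proposed change: queue every deletion, then commit once.
    for doc_id in stale:
        index.delete(doc_id)
    index.commit()


stale = range(300)                 # "hundreds" of deleted messages
a = ToyIndex(range(20000))         # "tens of thousands" indexed
b = ToyIndex(range(20000))
cleanup_per_deletion(a, stale)
cleanup_batched(b, stale)
print(a.commits, b.commits)        # 300 1
```

Both runs end with the same set of documents; only the number of
commits differs, which is where we guess the extra I/O comes from.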

Many thanks in advance!
Cheers and take care.

-- 
Stefano Zacchiroli -o- PhD in Computer Science \ PostDoc @ Univ. Paris 7
z...@{upsilon.cc,pps.jussieu.fr,debian.org} -<>- http://upsilon.cc/zack/
Dietro un grande uomo c'è ..|  .  |. Et ne m'en veux pas si je te tutoie
sempre uno zaino ...........| ..: |.... Je dis tu à tous ceux que j'aime



