'Lo. I've been toying with switching to notmuch for a while, but the speed of search has been holding me back. Perhaps I should upgrade the archaic machine I'm trying to run notmuch on, but I figured optimizing search would be a good way for me to cut my teeth on notmuch's code. I've implemented two general optimizations that, together, make thread search 2.5X faster, reducing my test "tag:inbox" query from 4.523 seconds to 1.779 seconds.
Before cleaning up and posting the patches, I wanted to make sure my general approach is both correct and pedagogically sound. I reduced thread search's 1 + 2t Xapian queries (where t is the number of unique matched threads) to 1 + t queries, and now construct exactly one notmuch_message_t for each message instead of 2 to 3.

The query performed by notmuch_query_search_threads to get all messages matching the user's query now fetches only the docid set, instead of constructing message objects and decompressing the thread ID of every message from its term list. Threads are then constructed from these docids using "thread:" queries, much as before; however, the subset of each thread's messages that matched the user's query is now determined by checking their docids against the docid set fetched earlier, rather than by yet another Xapian query per thread. Before this optimization, my test "tag:inbox" query took 4.523 seconds; with it, the query takes 3.072 seconds (1.5X faster). The downside is that it requires more RAM, though even for, say, a 1 million message database, the docid set is at most ~4.2 megs. In principle it also introduces latency before the first search result, but fetching docid sets is what Xapian was born to do, and I haven't noticed any added latency.

The second optimization is based on the observation that Xapian decompresses a document's term list every time you iterate over it; as a result, notmuch can decompress the beginning of a single term list at least five times. I combined all of the message metadata fetching (message ID, thread ID, reply-to, filename list, and tag list) into a single pass over the term list that fetches everything at once. On its own, this optimization reduces my "tag:inbox" query to 3.521 seconds (1.3X faster). This one is more of a balancing act, since it may fetch metadata that is never needed; however, the single pass takes only a little longer than any one of the individual metadata requests.
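To make the first optimization concrete, here is a sketch of the docid-set idea in Python (the real code is C on top of Xapian, and the docids below are made up). The old scheme ran one extra Xapian query per thread just to learn which of its messages matched the user's query; the new scheme fetches the matched docid set once up front and answers that question with a set-membership test:

```python
# Illustrative sketch, not notmuch's actual code: decide which of a
# thread's messages matched the user's query without any per-thread
# database query, using a precomputed set of matched docids.

def partition_thread(thread_docids, matched_docids):
    """Split one thread's messages into matched and unmatched, using
    only set membership tests against the up-front docid set."""
    matched = [d for d in thread_docids if d in matched_docids]
    unmatched = [d for d in thread_docids if d not in matched_docids]
    return matched, unmatched

# Hypothetical docids from the single up-front "tag:inbox" query.
matched_set = {3, 7, 12, 15}
# Hypothetical docids from one thread's "thread:..." query.
thread = [3, 4, 7, 9]

print(partition_thread(thread, matched_set))  # → ([3, 7], [4, 9])
```

The memory cost scales with the number of matched messages rather than the number of threads, which is where the "a few megs even for a million messages" estimate comes from.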
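And a sketch of the second optimization, again in Python for brevity. Xapian keeps a document's terms sorted and compressed, and iterating the term list decompresses it, so five separate metadata lookups decompress the same data five times; one pass that dispatches on each term's prefix fetches everything at once. The one-letter prefixes below are purely illustrative, not notmuch's real prefix scheme:

```python
# Illustrative single-pass metadata fetch (hypothetical term prefixes;
# notmuch's real prefixes and its C implementation differ).  Instead of
# walking the term list once per metadata item, walk it once and sort
# each term into the right bucket by its prefix.

def fetch_all_metadata(terms):
    """One pass over a document's term list, collecting all the
    metadata notmuch needs for a message."""
    meta = {"id": None, "thread": None, "replyto": None,
            "filenames": [], "tags": []}
    for term in terms:
        prefix, value = term[0], term[1:]
        if prefix == "Q":      # message ID (prefix letters are made up)
            meta["id"] = value
        elif prefix == "G":    # thread ID
            meta["thread"] = value
        elif prefix == "R":    # in-reply-to
            meta["replyto"] = value
        elif prefix == "F":    # filename
            meta["filenames"].append(value)
        elif prefix == "K":    # tag
            meta["tags"].append(value)
    return meta

terms = ["Fmbox/cur/1", "Gthread17", "Kinbox", "Kunread",
         "Qmsg@example.com", "Rparent@example.com"]
print(fetch_all_metadata(terms))
```

The tradeoff mentioned above shows up here: the pass unconditionally fills every bucket, even if the caller only ever asks for, say, the tag list.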
These two optimizations complement each other: with both, my query takes 1.779 seconds (2.5X faster). Because the first constructs only one message object per message, it aggregates all metadata operations on that one object instead of spreading them across multiple objects, which increases the effectiveness of the single-pass metadata retrieval.

Does this all seem reasonable? My code passes the test suite [1], so I believe it to be fairly sound.

[1] Except for 2 emacs tests that depend on author order. What order are matched authors *supposed* to be in?

-- 
Austin Clements
MIT/'06/PhD/CSAIL
amdragon at mit.edu
http://web.mit.edu/amdragon
Somewhere in the dream we call reality you will find me,
searching for the reality we call dreams.