Hi!

As I already mentioned on IRC, I recently merged Austin's custom query parser into my local tree (a merge that I still have to polish and publish...), mainly (for now) for its exact folder/directory searching capabilities.
Austin published this work several months ago, and in the meantime Carl implemented his own folder: searches. That left a conflict about which to use: they have different semantics, and Carl's are inadequate for my use case (they are not rooted, for example). On IRC, Carl recently proposed the most pragmatic way to approach this: if we can't agree on either his folder: semantics or Austin's strict filename matching, then just have both of them.

So I have now arranged to have both Carl's folder: (with its ``weak'' mail folder semantics) and Austin's directory: (with its ``hard'' directory/filename matching semantics), and on top of the latter I implemented rdirectory:, which extends directory: with recursive matching. This works really nicely.

IRC, freenode, #notmuch, 2011-09-30:

    <amdragon> tschwinge: Before you get in too deep I should point out that there's a (not unsurmountable) flaw in the folder handling. Because it expands to all of the desired dir-entry terms, it can chew up a huge amount of memory (~50K per matched file, IIRC).

After importing several GNU mailing lists' archives yesterday, I did some measurements, and it is in the 20s of KiB per file, ranging from 26 KiB for a 9000-file hierarchy to 21 KiB for a 23000-file hierarchy (the reason for the non-linearity mostly being notmuch's regular resident size, etc., I assume).
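As a rough plausibility check of that assumption, the two measurements can be split into a fixed baseline and a marginal per-file cost. This is only a sketch: it assumes that each per-file figure is total resident memory divided by the number of files, which the measurements above do not state explicitly, and the function and variable names are made up for illustration.

```python
# Sketch: separate a fixed baseline from the per-file cost, using only the
# two measurements quoted above. Assumption (not stated in the measurements):
# each figure is total resident memory divided by the number of files, so
# total_kib = baseline_kib + per_file_kib * n_files.

# (number of files, measured KiB per file)
measurements = [(9000, 26.0), (23000, 21.0)]


def fit_baseline_and_slope(points):
    """Fit total = baseline + slope * n through exactly two measurements."""
    (n1, avg1), (n2, avg2) = points
    total1, total2 = n1 * avg1, n2 * avg2  # implied total memory in KiB
    slope = (total2 - total1) / (n2 - n1)  # marginal KiB per extra file
    baseline = total1 - slope * n1         # KiB used regardless of file count
    return baseline, slope


def estimate_total_kib(n_files, baseline, slope):
    """Extrapolate total memory (KiB) for a hierarchy of n_files files."""
    return baseline + slope * n_files


baseline_kib, per_file_kib = fit_baseline_and_slope(measurements)
print(f"baseline ~ {baseline_kib / 1024:.0f} MiB, "
      f"marginal cost ~ {per_file_kib:.1f} KiB/file")
```

Under that assumption, the two data points imply roughly 72 MiB of fixed overhead and about 17.8 KiB per additional file. With only two measurements this is no more than a consistency check, not a proper fit.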
And, of course:

    $ find ~/Mail-schwinge.name-thomas/import/GNU/2011-04-03/ -type f | wc -l
    276010
    $ notmuch search --output=files -- rdirectory:import/GNU/2011-04-03 | grep -F import/GNU/2011-04-03 | wc -l
    0
    $ echo "${PIPESTATUS[@]}"
    137 1 0
    $ dmesg | grep notmuch
    [3797089.224252] notmuch invoked oom-killer: gfp_mask=0x200da, order=0, oom_adj=0, oom_score_adj=0
    [3797089.224282] notmuch cpuset=/ mems_allowed=0
    [3797089.224290] Pid: 586, comm: notmuch Not tainted 3.0.0-1-686-pae #1
    [3797089.232081] [ 586] 1000 586 310693 257874 0 0 0 notmuch
    [3797089.232081] Out of memory: Kill process 586 (notmuch) score 697 or sacrifice child
    [3797089.232081] Killed process 586 (notmuch) total-vm:1242772kB, anon-rss:1031492kB, file-rss:4kB

:-)

(But this is no problem for me; I don't need to do such coarse-grained matching.)

    <amdragon> tschwinge: The solution is probably to add folder terms to messages (but as one, unsplit term, unlike in cworth's approach) and expand on those so that the space is bounded by the number of matched folders, rather than files. That would also make it quite easy to do arbitrary glob matching.

(These would now be directory terms.) This suggestion still stands. (But I'm not working on it at the moment.)

Grüße,
 Thomas
_______________________________________________ notmuch mailing list notmuch@notmuchmail.org http://notmuchmail.org/mailman/listinfo/notmuch