i added the rfc822 headers to my sleepycat index files, which had the following effects. building indexes for my ~10GByte mail store takes ~90 times longer (one hour 45 minutes vs. one minute 11 seconds). the indexes are ~20 times larger. a modified version of scan is 2..4X faster, though it can no longer add the first few words of the body to each scan line. that's not good enough to justify the complexity, but i'll work on it more before i decide whether it's the best we're going to get. i can probably add the first ~100 characters of the body to each database element, so that scan can still print the first few words of the body as it does now. here are the current results.

#nsa:amd64# folder +inbox
inbox+ has 3269 messages (1-6144); cur=6134; (others).
#nsa:amd64# repeat 3 time sh -c "scan +inbox > /tmp/vix1"
0.237u 0.728s 0:26.27 3.6% 132+1395k 0+1io 0pf+0w
0.165u 0.675s 0:35.35 2.3% 153+1598k 0+1io 0pf+0w
0.309u 0.813s 0:43.99 2.5% 140+1462k 0+1io 0pf+0w
#nsa:amd64# ktrace scan +inbox > /dev/null
#nsa:amd64# kdump -s | wc -l
46755
#nsa:amd64# kdump -s | grep -c NAMI
3375
#nsa:amd64# repeat 3 time sh -c "./scan +inbox > /tmp/vix2"
0.253u 0.784s 0:13.33 7.7% 134+1271k 0+1io 0pf+0w
0.311u 0.775s 0:13.76 7.8% 129+1251k 0+1io 0pf+0w
0.253u 0.642s 0:13.86 6.4% 145+1402k 0+1io 0pf+0w
#nsa:amd64# ktrace ./scan +inbox > /dev/null
#nsa:amd64# kdump -s | wc -l
75144
#nsa:amd64# kdump -s | grep -c NAMI
95

so, sleepycat is burning me somewhere, so while i'm avoiding open() i'm paying too high a cost elsewhere, probably in unnecessary concurrency control. (unnec'y since i'm holding a file-level lock throughout.) more later. if i get far enough i'll want to do a similar test with pick.

noting, the software architecture of MH deeply presumes on the file store, everywhere i look i find fopen() and similar calls. this means to get MH to read the headers out of a database i had to temporarily use a nonportable freebsd library call "funopen" that lets me attach a "FILE *" to something that is not a file. i won't be offering patches that work that way, i'll have to rototill the internal interfaces to separate out the "FILE *" logic from the rest. that's a lot of work. i won't do it unless i get 10X or better performance from "scan" and "pick" in my (nonportable) testing.

the test system uses freebsd 7 and ZFS "raidz2" if anybody cares.
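
for readers who have not met funopen, here is a minimal sketch of the general technique mentioned above: wrapping an in-memory record (such as headers fetched from a database) in a read-only "FILE *" so that existing stdio-based parsing code can read it unchanged. this is not the code used in the test. the "struct membuf" type and the "membuf_open" and "membuf_copy" names are hypothetical, invented for this illustration. the linux branch uses fopencookie, which is assumed here as the nearest equivalent on linux C libraries, where funopen is not part of libc.

```c
/* Sketch only: expose a memory buffer as a read-only FILE *. */
#define _GNU_SOURCE             /* needed for fopencookie() on Linux */
#include <assert.h>
#include <stdio.h>
#include <string.h>
#include <sys/types.h>

/* Hypothetical cookie type: a byte buffer plus a read position. */
struct membuf {
    const char *data;           /* record bytes, e.g. rfc822 headers */
    size_t len;                 /* number of valid bytes in data */
    size_t pos;                 /* next byte to hand to stdio */
};

/* Copy up to n bytes out of the buffer; returning 0 signals EOF. */
static size_t membuf_copy(struct membuf *m, char *buf, size_t n)
{
    size_t left = m->len - m->pos;

    if (n > left)
        n = left;
    memcpy(buf, m->data + m->pos, n);
    m->pos += n;
    return n;
}

#if defined(__linux__)
/* Linux: fopencookie() takes a table of callbacks; only read is set. */
static ssize_t membuf_read(void *cookie, char *buf, size_t n)
{
    return (ssize_t)membuf_copy(cookie, buf, n);
}

FILE *membuf_open(struct membuf *m)
{
    cookie_io_functions_t io = { .read = membuf_read };

    assert(m != NULL);
    return fopencookie(m, "r", io);
}
#else
/* BSD-derived libc: funopen() with a read callback and nothing else. */
static int membuf_read(void *cookie, char *buf, int n)
{
    return (int)membuf_copy(cookie, buf, (size_t)n);
}

FILE *membuf_open(struct membuf *m)
{
    assert(m != NULL);
    return funopen(m, membuf_read, NULL, NULL, NULL);
}
#endif
```

a caller would fill in a "struct membuf" with the bytes of one record, pass its address to membuf_open, and then hand the returned stream to code that already uses fgets() or getc(). the buffer must outlive the stream, since the sketch does not copy it. the portability problem is visible in the "#if": neither call is in standard C or POSIX, which is why separating the "FILE *" logic from the rest is the cleaner long-term route.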
