I thought I would sit back and wait to see where this went. I was surprised and disappointed that no one brought up the huge performance cost of the current implementation.

I guess no one has network-mounted home directories anymore. I don't. But for many years that was where my mail lived. And lots of it! If you review my messages to this list over the years, you'll notice a recurring pattern about optimizing away those slow directory reads. You see, nmh's dirty secret (ok ok, one of many!) is that the first thing every command does is read the entire directory. Yep, the whole thing. So the `new` commands that I added carefully avoided the readdir.

Combining scan and pick on NFS is even worse, as both have to open message files too! And it gets worse still: first you have to wait for pick to slowly search ALL THE FILES (within whatever limiting message range you gave it, if you had any idea what to give, and often I did not), then you wait for scan to slowly readdir everything, and THEN you finally get your results. What I really want (and I doubt I'm alone) is scan lines as soon as a message is a HIT, so I can interrupt when the message I'm looking for comes across, without waiting for any further work.

I offered a patch at some point to have scan read message numbers from standard input, so you could `pick ... | scan -` and skip the second readdir AND get early interrupt. I'm not sure what happened to that one, but I'm not disappointed it didn't make it in. It's really only a half measure.

On a modern filesystem on SSD in 2022 maybe nobody cares anymore. But over NFS in 2010, this mattered a lot.

I don't see how we can argue that "the UNIX philosophy" means every command has to repeat the same expensive work and also that they're not allowed to share code. Well, they already share plenty of code... just not that code! :)

I think the philosophy case falls down in other ways as well. pick does three things:

1. resolve user query to message numbers
2. by default, print the message numbers without formatting
3. optionally, store in sequence

scan already implements a subset of #1 (as do all commands accepting message specifications). #2 is pick duplicating scan's job (scan -format '%(msg)'). #3 is pick duplicating mark's job.

I think I still have an old patch lying around somewhere that teaches pick to scan when the `-scan` option is given. I definitely plan to resurrect that patch soon because... guess where else having scan repeat pick's expensive work comes into play? Yep, it's over there on that imap-prototype branch. I'll be sure to bring numbers into that discussion when the time comes :)

Thanks!
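
P.S. Purely to illustrate the "print each hit as soon as it is found" behaviour described above, here is a minimal Python sketch. It is not the old patch and not nmh code. It assumes a folder is a plain directory whose message files are named by their message numbers, and it stands in a plain substring test for pick's matching and the Subject line for a scan line. The name `matching_messages` is hypothetical.

```python
import os
import sys
from typing import Iterator, Tuple


def matching_messages(folder: str, needle: str) -> Iterator[Tuple[int, str]]:
    """Yield (msgnum, subject) for each hit as soon as it is found."""
    # One directory read, shared by the search and the listing.
    # Assumption: message files are the all-digit names in the folder.
    names = sorted((n for n in os.listdir(folder) if n.isdigit()), key=int)
    for name in names:
        # Each message file is opened exactly once.
        with open(os.path.join(folder, name), errors="replace") as f:
            text = f.read()
        if needle not in text:
            continue
        # Stand-in for a scan line: the first line starting with "Subject:".
        subject = next(
            (line[8:].strip() for line in text.splitlines()
             if line.lower().startswith("subject:")),
            "",
        )
        # Yielding here hands the hit to the caller before the next file is read.
        yield int(name), subject


if __name__ == "__main__":
    # Usage: python sketch.py FOLDER_PATH NEEDLE
    try:
        for num, subject in matching_messages(sys.argv[1], sys.argv[2]):
            print(f"{num:5d}  {subject}", flush=True)
    except KeyboardInterrupt:
        pass  # interrupting stops the search; no further files are opened
```

Because the function is a generator, each line is printed as soon as its message matches, and hitting Ctrl-C after the line you wanted means none of the remaining files are touched.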
