Re: [Mailman-Developers 10417] Improving the archives

I would like to interject and highlight some use cases for stable and predictable IDs.

For us, "message IDs" are used directly both by people and by ignorant programs. Our mailing lists serve as a permanent and concise record of our discussions, decisions, and operations, and we find it invaluable to be able to refer to individual messages in a simple and memorable way: "message 1210 in the calibration list", say. Other people can then easily jot that info down or go straight to the message. Some message IDs even become shorthand for a particular topic or decision.

We have also added trac InterWiki templates pointing into our mail archives (as listname:number), which encourages desirable cross-referencing: PRs, wiki pages, and SVN change logs can refer to mail messages, just as wiki pages could always refer to changesets, PRs, etc. But trac InterWiki templates can only interpolate $1, $2, ... arguments into strings, and could not possibly calculate anything based on the _content_ of the messages.

Globally unique IDs, hashed IDs, etc., are very appealing from various CS-y and techie points of view, but are simply not memorable to humans or knowable by dumb external programs. I think as much effort, or more, should be put into delivering a straightforwardly usable naming scheme as goes into making an arbitrary message recoverable from anywhere. Basically, "friendly URLs" should be a primary requirement, not an optional afterthought for careless geeks like me to get wrong later....
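To make the "dumb external program" point concrete, here is a rough Python sketch of what pure $1/$2 interpolation can and cannot do. It is not trac's code: the host name, the URL layout, and the helper names `expand` and `link_for` are all made up for illustration. The only thing taken from the discussion is that the template sees nothing but the listname and the number.

```python
import re

# Hypothetical archive URL template; only the $1/$2 substitution style
# mirrors what an InterWiki-like template can do.
ARCHIVE_TEMPLATE = "http://lists.example.org/pipermail/$1/all/$2.html"


def expand(template, *args):
    """Replace $1, $2, ... with positional arguments; nothing is computed."""
    def substitute(match):
        index = int(match.group(1)) - 1  # $1 is the first argument
        return args[index]
    return re.sub(r"\$(\d+)", substitute, template)


def link_for(reference):
    """Turn 'listname:number' into a URL by string interpolation alone."""
    listname, number = reference.split(":", 1)
    return expand(ARCHIVE_TEMPLATE, listname, number)
```

With a sequence number, `link_for("calibration:1210")` needs no access to the message at all. A content hash could not be produced this way, since the template never sees the message.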
We long ago added an extremely simple ID handoff between MM 2.1.8 and pipermail, and though imperfect it has served us well. Basically, we hijacked the .post_id member in mailman (otherwise basically unused, and mysteriously a floating point number); CookHeaders stuffs it into an X-Mailman-Sequence-ID header line, and AfterDelivery increments it. In turn, pipermail uses the header to feed a sequence ID into make_article, and the message is squirreled away as $mailinglist/all/%d.html.

There were a few other minor matters (e.g. post_id was added to Decorators, a couple of templates were changed, we lost having 'ls' sort chronologically [did we have to add .last and .prev to the HyperDatabase classes?]), but it really was a minor bit of work. And for stability: as long as the archive files aren't lost, pipermail rebuilds should yield the same URLs even if junk messages have been deleted. [Oh, we did also add a "never rotate" policy to our archives, but that is finesseable.]

As an aside on other discussions, can you get away without using Message-ID or Date? I.e., aren't those just more of those tokens which were standardized back before the Internet got tricky enough to invalidate the standards? Mailing lists serialize incoming messages, and so can generate their own unique and trustworthy IDs. "UUIDs" would work, but if you can trust yourself to generate them, consecutive integers provide minimal, order-preserving, perfect hashing, too!

Anyhow, we have found that people will enthusiastically refer by name to individual messages within mail archives if they can.
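For anyone who wants the shape of the handoff without reading our patches, here is a minimal, self-contained Python sketch. It is not the actual Mailman or pipermail code: `ListState`, `stamp`, `after_delivery`, and `archive_path` are hypothetical stand-ins, and only the header name, the post_id counter, and the all/%d.html layout come from the description above.

```python
from email.message import EmailMessage

SEQUENCE_HEADER = "X-Mailman-Sequence-ID"


class ListState:
    """Hypothetical stand-in for a list object carrying a post_id counter."""

    def __init__(self, name, post_id=1.0):
        self.name = name
        self.post_id = post_id  # kept as a float, as described for Mailman


def stamp(mlist, msg):
    """Delivery side: write the current counter into the header."""
    del msg[SEQUENCE_HEADER]  # discard any value supplied by the sender
    msg[SEQUENCE_HEADER] = str(int(mlist.post_id))


def after_delivery(mlist):
    """Advance the counter once the message has gone out."""
    mlist.post_id += 1


def archive_path(mlist, msg):
    """Archiver side: the file name depends only on the stored header."""
    return "%s/all/%d.html" % (mlist.name, int(msg[SEQUENCE_HEADER]))


def rebuild(mlist, stored_messages):
    """Re-derive every path from the saved messages, in any order."""
    return [archive_path(mlist, msg) for msg in stored_messages]
```

Because the number travels inside the stored message, a rebuild in this sketch never renumbers anything: deleting a junk message just leaves a gap, and every surviving message keeps its path.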
- craig