>>>>> "Brad" == Brad Knowles <[EMAIL PROTECTED]> writes:

    Brad> At 2:03 PM +0900 2005-09-28, Stephen J. Turnbull wrote:

    >> Why archivers don't use Message-Id for the URL, I don't know.

    Brad> Because some MUAs generate message-ids that are likely
    Brad> to collide.

Can we stop pandering to the broken mailers, please?  Are we not
hackers?  We know how to handle collisions.

Here's the algorithm:

1) Look for a unique ID in the (X-)List-Archive-Message-ID field.
   If not found:
   a) Generate a unique ID according to the usual algorithm, as if
      the post were about to be sent from the archive host.
   b) Add it to the header in the (X-)List-Archive-Message-ID field.
2) Extract the message ID from the message.  If none, set the
   program variable equal to the ID generated in 1, and (optionally)
   add it as Message-ID to the message's header.
4) Generate the URL for the archived message based on *Message-ID*.
5) Check for collision.
6) If there is a collision, make a directory (could be a file-system
   directory, could be just an HTML file, could be a digest message)
   with the URL generated in 4.  Generate URLs for the colliding
   messages based on Message-ID plus List-Archive-Message-ID, and
   include them in the directory.  Conforming implementations MAY
   also extract MUA information and make nasty comments about the
   broken MUAs, their implementers, and their users to go with the
   directory.

   If Message-ID == List-Archive-Message-ID, go to 1a.  At this
   point a conforming implementation MAY mail /vmunix to its
   implementer, who obviously snafu'ed.
7) PUT the colliding messages at those URLs.

Rationale:

1.  You could actually derive a URN from this:
    archived-message://list-archive.your.org/MESSAGE-ID.
2.  The URL is unique and will persist across regeneration of the
    archive as long as the message is present.
3.  People who use conformant software implemented competently
    should be given precedence.
4.  Users who don't subscribe to the archiver's client but somehow
    get their hands on a message ID can use Google to find it (and
    the rest of the thread).
5.  People who use software that doesn't conform will suffer.

    Brad> For some time now, I've been arguing that they should use a
    Brad> hash of the relevant information (maybe all the headers,
    Brad> maybe just selected headers, maybe the entire message,
    Brad> whatever is reasonable to assume will survive), making sure
    Brad> to at least include the value of the "Date:", "Message-ID:",
    Brad> and "Received:" headers as part of that input.

This gives 1 and 2, but not 3, 4, and 5.  (No, you can't generate a
Google search item from knowledge of the algorithm, because you don't
necessarily have the Received headers.)  Seems like overkill for
Step 1 of the algorithm, too.

--
School of Systems and Information Engineering http://turnbull.sk.tsukuba.ac.jp
University of Tsukuba    Tennodai 1-1-1 Tsukuba 305-8573 JAPAN
Ask not how you can "do" free software business;
ask what your business can "do for" free software.
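
For illustration, the collision-handling steps of the algorithm can be sketched in Python. This is a minimal sketch under stated assumptions, not any archiver's actual implementation: the Archive class, its in-memory store dict, the host name list-archive.example.org, and the choice of X-List-Archive-Message-ID as the field name are all hypothetical. A collision "directory" is modeled as a plain list of URLs, and PUT is modeled as a dict assignment. The step numbers in the comments follow the numbering used in the post.

```python
from email import message_from_string
from email.message import Message
from email.utils import make_msgid
from urllib.parse import quote

ARCHIVE_HOST = "list-archive.example.org"       # hypothetical archive host
ARCHIVE_ID_FIELD = "X-List-Archive-Message-ID"  # field name assumed from the post


class Archive:
    """Toy in-memory archive: store maps URL -> message, or -> list of URLs."""

    def __init__(self, host=ARCHIVE_HOST):
        self.host = host
        self.store = {}

    def _url(self, *ids):
        # Percent-encode each ID so "/" or "@" inside an ID cannot alter the path.
        path = "/".join(quote(i.strip().strip("<>"), safe="") for i in ids)
        return f"http://{self.host}/{path}"

    def _new_archive_id(self, msg):
        # Steps 1a/1b: generate an ID as if posting from the archive host.
        del msg[ARCHIVE_ID_FIELD]  # no-op when the field is absent
        msg[ARCHIVE_ID_FIELD] = make_msgid(domain=self.host)

    def add(self, msg):
        # Step 1: make sure the message carries an archive-assigned unique ID.
        if ARCHIVE_ID_FIELD not in msg:
            self._new_archive_id(msg)
        # Step 2: fall back to the archive ID when there is no Message-ID.
        if msg["Message-ID"] is None:
            msg["Message-ID"] = msg[ARCHIVE_ID_FIELD]
        # Step 4: the primary URL depends on Message-ID only.
        url = self._url(msg["Message-ID"])
        # Step 5: check for collision.
        existing = self.store.get(url)
        if existing is None:
            self.store[url] = msg
            return url
        # Step 6: first collision turns the URL into a directory of URLs.
        if isinstance(existing, Message):
            self.store[url] = []
            self._file_in_directory(url, existing)
        return self._file_in_directory(url, msg)

    def _file_in_directory(self, directory_url, msg):
        # "If Message-ID == List-Archive-Message-ID, go to 1a."
        while msg[ARCHIVE_ID_FIELD] == msg["Message-ID"]:
            self._new_archive_id(msg)
        url = self._url(msg["Message-ID"], msg[ARCHIVE_ID_FIELD])
        self.store[directory_url].append(url)
        self.store[url] = msg  # Step 7: "PUT" the message at its own URL.
        return url


if __name__ == "__main__":
    archive = Archive()
    raw = "Message-ID: <dup@broken.example>\nSubject: post %d\n\nbody\n"
    first = archive.add(message_from_string(raw % 1))
    second = archive.add(message_from_string(raw % 2))
    print(first)   # becomes the directory URL once the second post collides
    print(second)  # built from Message-ID plus the archive ID
```

Note that the URL built from Message-ID alone stays valid after a collision: it then resolves to the directory rather than to a single message, which is what keeps rationale 2 and 4 intact for well-behaved senders.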