I have a list of spam HTML messages in pipermail archives. I need to clear out their content (no great problem there), but I also want to clean the corresponding messages in the raw mbox file (zap the subject, message body, etc., but leave a placeholder message so that future archive regeneration doesn't mess up article numbers). Looking at the HTML source of one of these messages, I see nothing like a Message-ID that would let me unambiguously identify the corresponding raw message. Does something like that exist? If not, what heuristics have people developed to perform this mapping?
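For the mbox side, here is a minimal sketch of the placeholder step I have in mind, using Python's standard `mailbox` module. It assumes the only key available is the subject scraped from each HTML page, which may well be ambiguous, so it returns the positions it touched for manual checking. The names `blank_messages`, `normalize`, the placeholder strings, and the `removed@example.invalid` address are all made up for illustration. It does no locking, so it should only be run while nothing else is writing to the mbox.

```python
import mailbox
import re

# Hypothetical placeholder text; any wording would do.
PLACEHOLDER_SUBJECT = "[message removed]"
PLACEHOLDER_BODY = "This message was removed by the list administrator.\n"


def normalize(text):
    """Collapse whitespace and case so scraped and raw subjects compare equal."""
    return re.sub(r"\s+", " ", str(text or "")).strip().lower()


def blank_messages(mbox_path, spam_subjects):
    """Replace each message whose Subject is in spam_subjects with a stub.

    Returns the zero-based positions of the replaced messages. The message
    count and order are left unchanged, so article numbering should survive
    a later archive rebuild (assuming numbering follows mbox order).
    """
    wanted = {normalize(s) for s in spam_subjects}
    box = mailbox.mbox(mbox_path)
    replaced = []
    try:
        for position, key in enumerate(box.keys()):
            msg = box[key]
            if normalize(msg.get("Subject")) not in wanted:
                continue
            stub = mailbox.mboxMessage()
            # Keep the original "From " separator line, Date and Message-ID
            # so the stub stays in the same place in the archive.
            stub.set_from(msg.get_from())
            for header in ("Date", "Message-ID"):
                if msg[header] is not None:
                    stub[header] = msg[header]
            stub["From"] = "removed@example.invalid"
            stub["Subject"] = PLACEHOLDER_SUBJECT
            stub.set_payload(PLACEHOLDER_BODY)
            box[key] = stub  # same key, so the position is preserved
            replaced.append(position)
        box.flush()
    finally:
        box.close()
    return replaced
```

Calling `blank_messages("mylist.mbox", subjects)` on a copy of the mbox first, and comparing the returned positions against the article numbers of the spam pages, would show how often the subject alone picks the wrong message.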
Thanks,
Skip Montanaro
