Folks,

Thanks for the really great feedback. I'm about to check in a new
version of Scrubber.py that addresses the many issues brought up.
Apologies for not quoting everything.

- permission problems: fixed

- problems with multipart/mixed containing gif, html, and jpeg parts:
  fixed.

- text/html decoding: there's now a new global variable
  ARCHIVE_HTML_SANITIZER which can be 0, 1, or a string.

# This variable defines what happens to text/html subparts. They can be
# stripped completely, escaped, or filtered through an external program. The
# legal values are:
# 0 - Strip out text/html parts completely, leaving a notice of the removal in
#     the message. If the outer part is text/html, the entire message is
#     discarded.
# 1 - Remove any embedded text/html parts, leaving them as HTML-escaped
#     attachments which can be separately viewed. Outer text/html parts are
#     simply HTML-escaped.
#
# The value can also be a string, in which case it is the name of a command to
# filter the HTML page through. The resulting output is left in an attachment
# or as the entirety of the message when the outer part is text/html. The
# format of the string must include a "%(filename)s" which will contain the
# name of the temporary file that the program should operate on. It should
# write the processed message to stdout.
ARCHIVE_HTML_SANITIZER = '/usr/bin/lynx -dump %(filename)s'

  This seems to work pretty well (will provide examples shortly). As
  with the rest of Scrubber, it's a bit of a kludge, but perhaps not
  horrible. It could definitely use more testing by you guys.

  It's actually rather difficult to get Pipermail to /not/ HTML-escape
  attachments, so I'm punting on that for now. Plus, I just feel it's
  way too dangerous to support.

- storing in get_filename() if available: fixed, and I've also
  implemented the idea of sticking each message's attachments in a
  separate subdir off of archives/private/mylist/attachments. The
  subdir is based on the Message-ID: and files inside there are
  uniquified if necessary.

- problems with the attachment url: what we really needed was a more
  elaborate PUBLIC_ARCHIVE_URL format string. It now accepts
  %(hostname)s as well as %(listname)s, and the former gets
  interpolated with the list's web host name (as looked up in the
  inverted VIRTUAL_HOSTS dictionary, and defaulting to
  DEFAULT_URL_HOST).

Watch for checkins shortly.
-Barry

_______________________________________________
Mailman-Developers mailing list
[EMAIL PROTECTED]
http://mail.python.org/mailman/listinfo/mailman-developers
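
For illustration only, here is a minimal sketch of how the three kinds
of ARCHIVE_HTML_SANITIZER value described above (0, 1, or a command
string containing "%(filename)s") might be dispatched. This is not the
Scrubber.py code: the function name sanitize_html, its return
convention (None meaning "part stripped"), and the use of modern
Python 3 library calls are all assumptions made for the example.

```python
import html
import os
import shlex
import subprocess
import tempfile


def sanitize_html(payload, sanitizer):
    """Hypothetical helper: apply an ARCHIVE_HTML_SANITIZER-style setting.

    Returns None when the part is stripped (setting 0), the HTML-escaped
    text (setting 1), or the stdout of the external filter (string setting).
    """
    if isinstance(sanitizer, str):
        if "%(filename)s" not in sanitizer:
            raise ValueError('command must include "%(filename)s"')
        # Write the HTML to a temporary file for the filter to read.
        with tempfile.NamedTemporaryFile(
            "w", suffix=".html", delete=False, encoding="utf-8"
        ) as fp:
            fp.write(payload)
            filename = fp.name
        try:
            # Split the template first, then interpolate each argument, so
            # the file name always stays a single argument and no shell runs.
            argv = [arg % {"filename": filename}
                    for arg in shlex.split(sanitizer)]
            result = subprocess.run(
                argv, capture_output=True, text=True, check=True
            )
            return result.stdout
        finally:
            os.unlink(filename)
    if sanitizer == 0:
        return None  # caller leaves a removal notice or discards the message
    if sanitizer == 1:
        return html.escape(payload)
    raise ValueError("sanitizer must be 0, 1, or a command string")
```

With the setting from the message, a call would look like
sanitize_html(page, '/usr/bin/lynx -dump %(filename)s'), which assumes
lynx is installed at that path. What the caller then does with the
result (attachment versus whole message body) depends on whether the
text/html part is the outer part, as the comment block explains.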