On 11/18/2014 06:35 AM, Hal wrote:
> So for any new messages from now on I want my list to work this way:
>
> 1) HTML formatted postings should be converted to plain text before
> reaching other members.

In Mailman's Content filtering section you want the following:

filter_content: Yes
filter_mime_types: empty
pass_mime_types:
    multipart
    text/plain
    text/html
filter_filename_extensions: irrelevant, default list OK
pass_filename_extensions: empty
collapse_alternatives: Yes
convert_html_to_plaintext: Yes
filter_action: as desired; this will only apply to a message which
    contains no text/html or text/plain part.

> 2) HTML formatted postings can retain their formatting for the archive
> (I believe the archive is in the HTML format anyway?), but if it only
> archives whatever is sent to list members I don't mind. The important
> thing is that members receive plain text messages.

What will be archived is what was delivered to list members.

> 3) Since many people have their email programs set by default to send in
> HTML these days I just want Mailman to do its filtering, then continue
> by sending the posting as plain text without any moderator request or
> alerting the sender.

Settings in 1) do that.

> 4) I'd like to block all attachments (list members should only receive
> plain text files).
> 40kb is already set for Max_message_size (in "General options" within
> the list administration web interface) which seems to have worked fine
> (as far as I know).

'attachments' is an imprecise word, but settings in 1) will do what you
want.

> Furthermore I understand that Filter_filename_extensions (in the
> "Content filtering" section) in addition removes any attachments based
> on specific filename *extensions* regardless of their file size?
>
> I see exe, bat, cmd and a bunch of other filetypes I've never heard of
> (geared towards Windows/DOS users I suppose -I'm a Mac user) are listed,
> but I suppose I could block .zip and those pesky .vcf/.vcard and
> "winmail.dat" files the same way.

They will all be removed anyway unless they have a MIME Content-Type of
text/plain or text/html, which is unlikely.
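To illustrate why the pass list in 1) removes such parts, here is a minimal
sketch written against Python's standard email package. It is not Mailman's
own filtering code; the names PASS_MIME_TYPES, is_passed,
surviving_leaf_types and build_example are made up for this example, and it
deliberately ignores collapse_alternatives, HTML conversion and
filter_action. It only shows the matching rule assumed here: a part is kept
when its full type (e.g. text/plain) or its main type (e.g. multipart) is in
the pass list.

```python
from email.message import EmailMessage

# Hypothetical stand-in for the list's pass_mime_types setting from 1).
PASS_MIME_TYPES = {"multipart", "text/plain", "text/html"}


def is_passed(part, pass_types=PASS_MIME_TYPES):
    """True if the part's full type or its main type is in the pass list."""
    return (part.get_content_type() in pass_types
            or part.get_content_maintype() in pass_types)


def surviving_leaf_types(msg, pass_types=PASS_MIME_TYPES):
    """Return the content types of the leaf parts that would be kept."""
    if not is_passed(msg, pass_types):
        return []
    if msg.is_multipart():
        kept = []
        for sub in msg.get_payload():
            kept.extend(surviving_leaf_types(sub, pass_types))
        return kept
    return [msg.get_content_type()]


def build_example():
    """Build a post with a text body, a winmail.dat part and a vCard."""
    msg = EmailMessage()
    msg["Subject"] = "hello"
    msg.set_content("Hello list")  # text/plain body
    # Placeholder bytes; only the declared MIME type matters here.
    msg.add_attachment(b"\x00\x01\x02\x03", maintype="application",
                       subtype="vnd.ms-tnef", filename="winmail.dat")
    msg.add_attachment("BEGIN:VCARD\nEND:VCARD\n", subtype="vcard",
                       filename="hal.vcf")
    return msg


if __name__ == "__main__":
    print(surviving_leaf_types(build_example()))
```

In this sketch the winmail.dat part (application/vnd.ms-tnef) and the vCard
(text/vcard) are dropped because neither their full type nor their main type
is in the pass list, while the text/plain body is kept. Note that the
filename plays no role in the decision.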

> When such extensions are encountered, are they just removed from the
> messages while the message posting itself is passed on to list members,
> or is the whole posting stopped for approval first?

They are just removed.

> I'm thinking out loud here, so feel free to chime in for better ideas,
> but I'm thinking there are two kinds of attachment groups which need
> different actions to be taken:
>
> Deliberate attachments: zip files, gif/jpg images etc. which a poster
> wants to share. The message/attachment should be stopped from reaching
> the list and an email sent to the poster with a "your message has been
> blocked. Please resend your message, this time without an attachment"
> type of message.

Content filtering will just remove them.

> Accidental attachments: winmail.dat, .vcf or .vcard and so on. Many
> users don't know (as with HTML postings) that their email program is set
> up to send this stuff. IMHO those attachments don't have anything to do
> with the actual content of their postings, so Mailman should just remove
> the attachment(s), then pass on the rest of the message to the list.

winmail.dat is really more of a 'deliberate' attachment. It is a message
part with MIME type application/vnd.ms-tnef, which is a Microsoft
Outlook/Exchange 'transport neutral encapsulation format' way of
encoding attachments.

.vcf and .vcard 'attachments' have Content-Type text/vcard or possibly
application/vcard+json or application/vcard+xml.

In any case, since these do not have Content-Type text/plain or
text/html, they will be removed.

> Having said that, have I understood things correctly by setting up my
> "Content filtering" options as follows?
> (based on what you've said and
> what I've read here:
> http://wiki.list.org/pages/viewpage.action?pageId=4030684):
>
> Edit_filter_content: YES
> Filter_mime_types: (left blank)
> Pass_mime_types: multipart
>                  message/rfc822
>                  text/plain
>                  text/html
> filter_filename_ext.: exe
>                       bat
>                       cmd
>                       com
>                       pif
>                       scr
>                       vbs
>                       cpl
>                       zip
>                       dat
>                       vcf
>                       vcard
> pass_filename_ext.: (left blank)
> Collapse_alternatives: YES
> conv_html_to_plaintext: YES
> Filter_action: DISCARD

Maybe. The only difference between this and 1) above is message/rfc822.
If I forward a message to your list as an 'attachment', do you want to
remove that forwarded message from my post, or do you want to accept the
plain text (or HTML converted to plain text) parts of that forwarded
message? If you want to remove it, leave message/rfc822 out of the list;
if you want to accept the result of applying content filtering to it,
put message/rfc822 in the list.

>>> Failing that, is there a way I could have the (currently private)
>>> archive have a filter before HTTP access?
>>
>> You could create your own CGI or other web process to access the
>> archives and present them any way you want.
>
> Being ignorant on the subject, what kind of pre-written CGI script
> should I try to find (i.e. "search engine to web archive gateway" or
> something like that?).

I doubt very much that you'll find anything pre-written that will meet
your needs.

> You previously suggested htdig (http://www.htdig.org/) with your patches
> for allowing my visitors to search through both the Mailman archives and
> my website.

To be clear, htdig is a search engine that can index and search all or a
portion of your web site. The patches developed by Richard Barrett and
currently supported by me add a search form to the main archive table of
contents page for a list and invoke htdig to do the search. This search
covers only the archive of that list.

> Assuming this is a more ready-to-use solution than the other
> search engines out there,

For a general search of your web site, that is probably not a good
assumption.

> are there features I will be missing out on
> (e.g. the ability to use CSS and Ajax for making its search results
> appear more in line with the rest of my website) and is it still secure?
> I've read that malicious code can sometimes be entered as search phrases
> and damage the database if the search engine isn't using "parametrized
> queries".

I don't think that malicious search phrases are an issue with htdig, but
I don't know for sure that they aren't. It probably wouldn't be too
difficult to incorporate CSS into the search results pages, but I've
never tried it. Ajax might be more problematic.

-- 
Mark Sapiro <m...@msapiro.net>        The highway is for gamblers,
San Francisco Bay Area, California    better use your sense - B. Dylan
------------------------------------------------------
Mailman-Users mailing list Mailman-Users@python.org
https://mail.python.org/mailman/listinfo/mailman-users
Mailman FAQ: http://wiki.list.org/x/AgA3
Security Policy: http://wiki.list.org/x/QIA9
Searchable Archives: http://www.mail-archive.com/mailman-users%40python.org/
Unsubscribe: https://mail.python.org/mailman/options/mailman-users/archive%40jab.org