RE: filtering HTML tags from email
-----Original Message-----
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED]] On Behalf Of Mike Hauber
Sent: Wednesday, February 23, 2005 4:19 AM
To: freebsd-questions@freebsd.org
Subject: Re: filtering HTML tags from email

> Just after destroying the headers in who-knows-how-many emails
> (backed up... whew!), I finally realized that piping the messages
> through html2text (or lynx or w3m) was probably not such a great
> idea after all. :)
>
> This is something that really should be implemented as part of
> kmail itself (it would help to remain compatible with both
> maildir/mbox).
>
> I'll continue to be frustrated with html2text for a while (it's a
> pretty cool tool), and who knows... Mayhaps I'll figure out a
> reasonable way to set it up so that everything is done
> automatically.

Mike, why are you torturing yourself when http://www.mimedefang.org/
does this? Afraid of Sendmail?

Ted

___
freebsd-questions@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-questions
To unsubscribe, send any mail to [EMAIL PROTECTED]
Re: filtering HTML tags from email
Mike Hauber wrote:
>> Mutt saves to a temp file then calls the following command:
>>
>>   lynx -localhost -dump %s
>>
>> where '%s' is the temporary file you saved it to. You could also
>> just pipe it to the following:
>>
>>   lynx -localhost -dump -stdin
>>
>> the -localhost argument prevents lynx from simply following links
>> external to your machine - helpful to avoid generating hits for
>> unscrupulous spammers that get paid for hits on a URL.
>>
>> Just make sure lynx is installed.
>>
>> Lou
>
> Okay, so to be sure, there is no filter (as of yet) to simply open
> an email file, strip the HTML tags, and resave it? I'm not
> complaining, as this may actually be something I'm capable of
> creating myself. (I'll make this my first python project. :) )
>
> I'm just making sure I'm not missing anything obvious before I
> start working on it. It's irritating to spend time on something
> only to find out that it's already been done.

You probably could also do it with procmail + lynx (or w3m) during
the delivery process.

Another possibility is to have the following entries in your
~/.mailcap file, which convert html, doc and rtf to plain text:

  text/html; w3m -dump -T text/html; copiousoutput;
  application/msword; antiword %s; copiousoutput
  application/rtf; rtfreader %s; copiousoutput

As for your python script: I don't think that just stripping
everything matching the following expressions is correct, because
they might appear in non-HTML emails, too:

  <.*>  <\/.*>   (perl syntax)

At least, you'd need a list of valid html tags, i.e. a regular
grammar for html:

  <b> | </b> | <i> | </i> | ...   (BNF notation)

While this is not too hard to implement (and possibly a good project
to learn a new programming language), this would be too much work
for something that can be achieved more easily with existing tools
(that is, for me, personally ;-)

Simon
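A minimal sketch of the tag-list idea described above, in Python since
that is the language mentioned for the project. The names KNOWN_TAGS,
TAG_RE and strip_known_tags are made up for illustration, and the tag
list is deliberately short and incomplete. It only removes things that
look like one of the listed tags, so text such as "x < 3" or
<mike@example.org> is left alone. It is still a heuristic and not an
HTML parser: an address like <a@example.org> would be mistaken for an
anchor tag.

```python
import re

# Hypothetical, incomplete list of tag names; extend it as needed.
KNOWN_TAGS = (
    "a", "b", "body", "br", "div", "font", "head", "html",
    "i", "p", "span", "table", "td", "tr",
)

# Matches an opening or closing tag whose name is in KNOWN_TAGS,
# including any attributes, e.g. <b>, </b>, <br/>, <a href="...">.
TAG_RE = re.compile(
    r"</?(?:%s)\b[^<>]*>" % "|".join(KNOWN_TAGS),
    re.IGNORECASE,
)


def strip_known_tags(text):
    """Remove only recognised HTML tags; leave other <...> text alone."""
    return TAG_RE.sub("", text)
```

Calling strip_known_tags() on a message body, rather than on the whole
message, keeps the headers out of harm's way.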
Re: filtering HTML tags from email
On Wednesday 23 February 2005 04:43 am, Simon Barner wrote:
>>> You could also just pipe it to the following:
>>>
>>>   lynx -localhost -dump -stdin
>>>
>>> Lou
>>
>> Okay, so to be sure, there is no filter (as of yet) to simply
>> open an email file, strip the HTML tags, and resave it? I'm not
>> complaining, as this may actually be something I'm capable of
>> creating myself. (I'll make this my first python project. :) )
>
> You probably could also do it with procmail + lynx (or w3m) during
> the delivery process.
>
> Another possibility is to have the following entries in your
> ~/.mailcap file, which convert html, doc and rtf to plain text:
>
>   text/html; w3m -dump -T text/html; copiousoutput;
>   application/msword; antiword %s; copiousoutput
>   application/rtf; rtfreader %s; copiousoutput
>
> Simon

Just after destroying the headers in who-knows-how-many emails
(backed up... whew!), I finally realized that piping the messages
through html2text (or lynx or w3m) was probably not such a great
idea after all. :)

This is something that really should be implemented as part of
kmail itself (it would help to remain compatible with both
maildir/mbox).

I'll continue to be frustrated with html2text for a while (it's a
pretty cool tool), and who knows... Mayhaps I'll figure out a
reasonable way to set it up so that everything is done
automatically.

Thanks for the feeds.

Mike
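Piping a whole message through an HTML converter mangles the headers
because the converter treats them as part of the page. One way around
that, sketched here with Python's standard email and html.parser
modules, is to parse the message, rewrite only its text/html parts,
and leave everything else as it was. The names html_to_text and
strip_html_parts are hypothetical, and the conversion is crude: it
keeps all text nodes (including any script or style content) and does
no layout.

```python
import email
from email import policy
from html.parser import HTMLParser


class _TextExtractor(HTMLParser):
    """Collects the text between tags and drops the tags themselves."""

    def __init__(self):
        super().__init__()
        self.chunks = []

    def handle_data(self, data):
        self.chunks.append(data)


def html_to_text(html):
    """Return the text content of an HTML string (no formatting)."""
    parser = _TextExtractor()
    parser.feed(html)
    parser.close()
    return "".join(parser.chunks)


def strip_html_parts(raw_message):
    """Convert text/html parts to text/plain; keep the other headers."""
    msg = email.message_from_string(raw_message, policy=policy.default)
    for part in msg.walk():
        if part.get_content_type() == "text/html":
            text = html_to_text(part.get_content())
            # set_content() replaces the body and the Content-* headers
            # of this part only; From, Subject, etc. are untouched.
            part.set_content(text)
    return msg.as_string()
```

The result of strip_html_parts() is again a complete message, so it
can be written back to a maildir file. An mbox file would have to be
split into individual messages first.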
filtering html tags from email
Without going through the hassle of setting up proxy servers, isn't
there a way that one can filter out html tags from a message (say,
pipe the email through the filter from kmail for instance?)

Perhaps I'm looking too hard for it, but I didn't see anything in
the ports tree except for /mail/nohtml. I tried to pipe an html
message through nohtml.py from kmail, but it doesn't seem to work
(although I'm getting no errors from kmail's filter log).

Any ideas?

Thx.

Mike
Re: filtering html tags from email
On 02/22/05 11:16 PM, Mike Hauber sat at the `puter and typed:
> Without going through the hassle of setting up proxy servers,
> isn't there a way that one can filter out html tags from a message
> (say, pipe the email through the filter from kmail for instance?)
>
> Perhaps I'm looking too hard for it, but I didn't see anything in
> the ports tree except for /mail/nohtml. I tried to pipe an html
> message through nohtml.py from kmail, but it doesn't seem to work
> (although I'm getting no errors from kmail's filter log).
>
> Any ideas?
>
> Thx.

Mutt saves to a temp file then calls the following command:

  lynx -localhost -dump %s

where '%s' is the temporary file you saved it to. You could also
just pipe it to the following:

  lynx -localhost -dump -stdin

The -localhost argument prevents lynx from simply following links
external to your machine - helpful to avoid generating hits for
unscrupulous spammers that get paid for hits on a URL.

Just make sure lynx is installed.

Lou
--
Louis LeBlanc               FreeBSD-at-keyslapper-DOT-net
Fully Funded Hobbyist, KeySlapper Extrordinaire :)
Please send off-list email to: leblanc at keyslapper d.t net
Key fingerprint = C5E7 4762 F071 CE3B ED51 4FB8 AF85 A2FE 80C8 D9A2

Habit is habit, and not to be flung out of the window by any man,
but coaxed down-stairs a step at a time.
  -- Mark Twain, Pudd'nhead Wilson's Calendar
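For anyone who wants to drive the stdin form of that command from a
script, here is a rough sketch in Python. The lynx flags are exactly
the ones quoted above; the names LYNX_CMD, filter_through and main
are made up for illustration, and lynx is assumed to be installed and
on the PATH.

```python
import subprocess
import sys

# Flags as given in the thread: read HTML on stdin, dump the rendered
# text to stdout, and refuse URLs that point off the local machine.
LYNX_CMD = ["lynx", "-localhost", "-dump", "-stdin"]


def filter_through(html, cmd=LYNX_CMD):
    """Pipe html to cmd on stdin and return what it prints on stdout."""
    result = subprocess.run(
        cmd,
        input=html,
        capture_output=True,
        text=True,
        check=True,  # raise CalledProcessError if the command fails
    )
    return result.stdout


def main():
    """Filter stdin to stdout, e.g. for use in a mail client's pipe."""
    sys.stdout.write(filter_through(sys.stdin.read()))
```

Only the HTML body should be fed to filter_through(), not the whole
message, since lynx would otherwise render the headers as page text.
The cmd argument makes it easy to swap in w3m or html2text instead.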
Re: filtering HTML tags from email
On Wednesday 23 February 2005 12:50 am, Louis LeBlanc wrote:
> On 02/22/05 11:16 PM, Mike Hauber sat at the `puter and typed:
>> Without going through the hassle of setting up proxy servers,
>> isn't there a way that one can filter out html tags from a
>> message (say, pipe the email through the filter from kmail for
>> instance?)
>>
>> Perhaps I'm looking too hard for it, but I didn't see anything in
>> the ports tree except for /mail/nohtml. I tried to pipe an html
>> message through nohtml.py from kmail, but it doesn't seem to work
>> (although I'm getting no errors from kmail's filter log).
>>
>> Any ideas?
>>
>> Thx.
>
> Mutt saves to a temp file then calls the following command:
>
>   lynx -localhost -dump %s
>
> where '%s' is the temporary file you saved it to. You could also
> just pipe it to the following:
>
>   lynx -localhost -dump -stdin
>
> The -localhost argument prevents lynx from simply following links
> external to your machine - helpful to avoid generating hits for
> unscrupulous spammers that get paid for hits on a URL.
>
> Just make sure lynx is installed.
>
> Lou

Okay, so to be sure, there is no filter (as of yet) to simply open
an email file, strip the HTML tags, and resave it? I'm not
complaining, as this may actually be something I'm capable of
creating myself. (I'll make this my first python project. :) )

I'm just making sure I'm not missing anything obvious before I start
working on it. It's irritating to spend time on something only to
find out that it's already been done.

Thanks,

Mike