Excellent case study of the problem; thank you all for the discussion. My current plan is that all img src tags which do not reference an attached image will be modified or disabled. I'll work something up about this soon.
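For illustration only, here is a minimal Python sketch of what "disabling" a non-attached img src could look like. It assumes the HTML part is already available as a string and treats only `cid:` references as attached images. The function name `disable_remote_images` and the `data-blocked-src` attribute are made up for this example, and an actual MIMEDefang filter would be written in Perl.

```python
import re

# Matches a quoted src attribute inside an <img> tag.
# Assumption: unquoted src values are not handled by this sketch.
IMG_SRC = re.compile(
    r'(<img\b[^>]*?\s)src(\s*=\s*)(["\'])(.*?)\3',
    re.IGNORECASE | re.DOTALL,
)


def disable_remote_images(html: str) -> str:
    """Rename src on every <img> that does not point at an attached part."""

    def _rewrite(match):
        head, equals, quote, url = match.groups()
        if url.strip().lower().startswith("cid:"):
            # Reference to an attached image: leave the tag untouched.
            return match.group(0)
        # Keep the URL for inspection, but under an attribute that
        # mail clients will not fetch (hypothetical attribute name).
        return f"{head}data-blocked-src{equals}{quote}{url}{quote}"

    return IMG_SRC.sub(_rewrite, html)
```

Renaming the attribute rather than deleting it keeps the original URL available to anyone who wants to inspect the message, while the client no longer has anything to download.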
KAM

----- Original Message -----
From: "Paul Murphy" <[EMAIL PROTECTED]>
To: <[EMAIL PROTECTED]>
Sent: Thursday, May 20, 2004 11:57 AM
Subject: RE: [Mimedefang] Want to modify "read-receipt" img tags in mail

Scenarios:

A. Normal e-mail will have no images at all
   No action required.

B. Some systems will send HTML mail, but normally without images
   No action required.

C. Occasionally, someone will use HTML mail with a "stationery" effect, which is normally an image tiled across the background, or a logo
   Detect that the image is in the background, and ignore it.

D. Self-referential e-mail, with images attached and referenced in the HTML
   Links to attached images are OK, so ignore it.

E. HTML e-mail with links to off-site images
   This is where it gets interesting. See below.

There are two approaches which spring to mind, one of which is simple. Let's start with the difficult one.

You could in theory parse all HTML parts of messages, and identify every off-site link. Having done that, you could make a request for each link, preferably through a site-wide web-cache (so that the user's content is pre-cached if it ever gets to them), and then decide what to do based on what comes back. If the result is an image, analyze it using GD, then perhaps you would decide that any image of less than 100 pixels gets replaced by an auto-generated image of the same size which has a 5x5 "X" in the top-left corner.

Potential problems - you have to follow all links, not just image links, as a link to a CGI script can just as easily be used to track where a message has gone. Also, if a newsletter contains an "unsubscribe" link, you run the risk of activating it...

The alternative is much simpler - if the HTML contains any URI which is off-site, then change the message to be a plain-text body with the original message as an attachment. In the plain text part, provide a list of the sites referenced, and a warning that the off-site links could be used for tracking.
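A minimal Python sketch of the detection step for the simpler approach, assuming the HTML part has already been extracted as a string. It collects the hosts of every off-site URI (not just images) and builds the text for the plain-text warning. The names `OffSiteCollector`, `offsite_hosts` and `warning_text` are hypothetical, as is the choice of attributes to inspect. Re-wrapping the message itself is left out, and a real MIMEDefang filter would do all of this in Perl.

```python
from html.parser import HTMLParser
from urllib.parse import urlsplit

# Attributes that can carry a URI; this list is an assumption, not exhaustive.
URI_ATTRS = {"href", "src", "background", "action"}
# Schemes treated as "off-site"; cid: and mailto: are deliberately excluded.
REMOTE_SCHEMES = {"http", "https", "ftp"}


class OffSiteCollector(HTMLParser):
    """Collect the host names of all off-site URIs in an HTML document."""

    def __init__(self):
        super().__init__()
        self.hosts = set()

    def handle_starttag(self, tag, attrs):
        for name, value in attrs:
            if name not in URI_ATTRS or not value:
                continue
            try:
                parts = urlsplit(value.strip())
            except ValueError:
                continue  # malformed URI, nothing to record
            if parts.scheme in REMOTE_SCHEMES and parts.hostname:
                self.hosts.add(parts.hostname)


def offsite_hosts(html):
    """Return a sorted list of off-site hosts referenced by the HTML."""
    collector = OffSiteCollector()
    collector.feed(html)
    collector.close()
    return sorted(collector.hosts)


def warning_text(hosts):
    """Build the plain-text body listing the referenced sites."""
    lines = [
        "The attached message refers to the following external sites.",
        "Off-site links can be used to track whether you read the message.",
        "",
    ]
    lines.extend(f"  {host}" for host in hosts)
    return "\n".join(lines)
```

If `offsite_hosts` returns an empty list, the message falls under scenarios A to D and can be passed unchanged.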
Implement a whitelist system for this, so messages from approved sites, or to users who elect to opt out of the filtering, are passed unchanged.

The other option, much as I hate to say it, is to try a smarter mail client, such as Outlook 2003, of which they say:

"To help protect privacy and combat Web beacons, Outlook 2003 can block the download of external content from the Internet. If an e-mail message tries to connect unannounced to a Web server, Outlook blocks that connection until you decide to view the content. This feature also helps prevent you from viewing potentially offensive messages. If you're on a slow connection, it allows you to decide whether an image warrants the time required to download it."

Best Wishes,

Paul.
__________________________________________________
Paul Murphy
Head of Informatics
Ionix Pharmaceuticals Ltd
418 Science Park, Cambridge, CB4 0PA
Tel. 01223 433741
Fax. 01223 433788

> _______________________________________________
> Visit http://www.mimedefang.org and http://www.canit.ca
> MIMEDefang mailing list
> [EMAIL PROTECTED]
> http://lists.roaringpenguin.com/mailman/listinfo/mimedefang

