Re: [Mailman-Developers] Proposed: remove address-obfuscation code from Mailman 3
On Tue, Aug 25, 2009 at 06:39:29AM -0400, Barry Warsaw wrote:

So you can explain why, in theory and in practice, obfuscation doesn't work. But the user base will (stubbornly, if you like) refuse to accept your logic.

As usual, Stephen hits the nail on the head. I can't disagree with much in Rich's post, and yet it's likely that we'll still obfuscate and/or conceal email addresses in the archives because users will demand it. You can and should educate them, but this is not a battle I wish to fight because I think we can't win it.

I've thought this over for quite some time (obviously), and have done some homework elsewhere to ascertain whether both Stephen's and your (Barry's) comments are accurate. They are. Very much so. There now exists a cargo cult mentality which insists that obfuscation has some anti-spam/security value, in spite of overwhelming evidence and experience that conclusively prove it has none whatsoever.

(As an aside, not to either of you but in response to other comments in the thread, I'm well aware of the concept of defense-in-depth and practiced it years before the term became common. But for any measure to be part of defense-in-depth, it must first qualify as a defense, albeit perhaps a weak or half-hearted one. Address obfuscation obviously fails to clear this bar, even as low as it's set.)

I don't know how to dispel this widely-shared delusion. It may not be possible, at least in the near future. And it's probably not the role of Mailman's (or any other software package's) developers to tackle this issue; there's only so much policy that can be promulgated by code. I think perhaps the best that can be done is to insert a statement in Mailman's documentation indicating that this measure is provided for people who want to use it, but that it really has zero value.
Whether or not y'all want to do that is of course up to you, but I think at least a nod to reality in the documentation might get some of the better mail system admins to at least start thinking about the issue. And maybe that's the best that can be done for now. ---Rsk ___ Mailman-Developers mailing list Mailman-Developers@python.org http://mail.python.org/mailman/listinfo/mailman-developers Mailman FAQ: http://wiki.list.org/x/AgA3 Searchable Archives: http://www.mail-archive.com/mailman-developers%40python.org/ Unsubscribe: http://mail.python.org/mailman/options/mailman-developers/archive%40jab.org Security Policy: http://wiki.list.org/x/QIA9
Re: [Mailman-Developers] Proposed: remove address-obfuscation code from Mailman 3
--On 29 August 2009 04:19:58 + Julian Mehnle jul...@mehnle.net wrote:

Bob Puff wrote:

That's the logical progression of that argument, and is the good reason why obfuscation or even removal of parts is not only a good idea, it's a necessity. Exposing raw email addresses in their normal form is real low-hanging fruit. Regardless of what I think, my clients will cry bloody murder if emails leak out. I had one person recently google their email address, and found a link to an archive file that should have been private. I had removed all links to the archives, but somehow Google found it, indexed it, and the guy threatened me with bloody murder if I didn't take it down. Sheesh.

There's robots.txt, you know? If this is just about user outcry, then robots.txt will fix it (since all legitimate search engines honor it).

But, the legitimate search engines aren't the problem. It's the harvesters, which probably don't honour robots.txt. If you prevent Google from indexing the archive, then you just hide the problem.

-Julian

--
Ian Eiloart
IT Services, University of Sussex
01273-873148 x3148
For new support requests, see http://www.sussex.ac.uk/its/help/
Re: [Mailman-Developers] Proposed: remove address-obfuscation code from Mailman 3
--On 31 August 2009 10:15:43 -0700 C Nulk cn...@scu.edu wrote:

I am pretty sure allowing the raw email addresses to be available is going to go over like a lead balloon here.

Here, too. Our site would probably deploy some other mailing list software.

Anything (however minor) to help protect the users'/clients' email addresses is helpful despite what others think.

All the published research evidence is that email address obfuscation helps a lot. At a University site, most student email addresses won't be published anywhere EXCEPT in our mailing list archives. That means that the best way for spammers to acquire student email addresses is to harvest their addresses from our list archives. Students get a lot less spam than academic staff whose addresses appear all over the place. So much so that everyone who's ever fallen foul of phishing here has been a staff member, despite being outnumbered 10:1 by students.

--
Ian Eiloart
IT Services, University of Sussex
01273-873148 x3148
For new support requests, see http://www.sussex.ac.uk/its/help/
Re: [Mailman-Developers] Proposed: remove address-obfuscation code from Mailman 3
I am pretty sure allowing the raw email addresses to be available is going to go over like a lead balloon here. Anything (however minor) to help protect the users'/clients' email addresses is helpful despite what others think.

It is fine if someone considers the obfuscation that Mailman uses trivial; however, anything I can do to make it harder or more computationally expensive to get the email address is better than giving it away. Sure, bots are out there, but if what I do helps slow down someone's system enough to make them look at it (and hopefully get rid of the bot), then great. But at least give me the choice to be able to do it.

I happened to like Barry's (?) earlier comment about the "send me this message" link. Or maybe a "send my message to the original poster" link, where you can click on the link, compose your message, and send it through Mailman, all without the original sender's address. Mailman or whatever process can figure out the original sender and pass on your message. Yes, I know it is more work; that is why we have computers :)

As for using robots.txt, hmm, it is not the legitimate search engines I care about, it is the search engines/crawlers that do not respect my robots.txt file that I care about. If I had an effective way to consistently identify those non-legitimate crawlers, I would add what I needed to drop them into my firewall as I recognized them.

Chris

Julian Mehnle wrote:

Bob Puff wrote:

That's the logical progression of that argument, and is the good reason why obfuscation or even removal of parts is not only a good idea, it's a necessity. Exposing raw email addresses in their normal form is real low-hanging fruit. Regardless of what I think, my clients will cry bloody murder if emails leak out. I had one person recently google their email address, and found a link to an archive file that should have been private.
I had removed all links to the archives, but somehow Google found it, indexed it, and the guy threatened me with bloody murder if I didn't take it down. Sheesh. There's robots.txt, you know? If this is just about user outcry, then robots.txt will fix it (since all legitimate search engines honor it). -Julian
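The robots.txt point in this exchange can be made concrete. The following is only a sketch of what a compliant crawler does with such a file, using Python's standard `urllib.robotparser`. The `/pipermail/` path, the host name, and the crawler name passed in are assumptions chosen for illustration, not a statement about any particular site's layout.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt for a host that serves list archives under
# /pipermail/ (the path is an assumption made for this example).
ROBOTS_TXT = """\
User-agent: *
Disallow: /pipermail/
"""


def allowed(user_agent: str, url: str) -> bool:
    """Return True if a crawler that honors robots.txt may fetch `url`."""
    parser = RobotFileParser()
    parser.parse(ROBOTS_TXT.splitlines())
    return parser.can_fetch(user_agent, url)
```

This only models crawlers that choose to ask. As Chris and Ian both note, a harvester that never consults robots.txt is unaffected, so the file addresses the search-engine complaint rather than harvesting.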
Re: [Mailman-Developers] Proposed: remove address-obfuscation code from Mailman 3
On Aug 29, 2009, at 12:21 AM, Jeff Breidenbach wrote:

Yes. It is critical to keep user perception in mind. Specifically, if you don't keep email addresses off the global search engines, there will be a deluge of vocal complaints from users who neither care about nor understand the technical aspects. That can be as simple as robots.txt configuration, or as fancy as using a captcha-based system to reveal addresses like the one offered by reCaptcha. But my main point is you need to cover the user perception angle almost independently from the core technical aspects of anti-harvesting. For the record, I prefer keeping data as unadulterated as possible because it helps interoperability. But we also need to keep users happy.

Trust me, I'm keenly aware of this as I probably get 3x the nasty hate mail that most of you get. I try to be nice and patient and that usually calms people down. :)

Mailman will always still collect the raw data for messages sent to the list. There are legitimate uses for allowing outsiders access to that data (say, the list is moving and you want to migrate the archives), so I think we always want to support this. The question is how much, if any, of the raw data does the general public get access to?

-Barry
Re: [Mailman-Developers] Proposed: remove address-obfuscation code from Mailman 3
On Aug 29, 2009, at 1:10 AM, Bernd Siggy Brentrup wrote:

On Fri, Aug 28, 2009 at 18:03 -0400, Barry Warsaw wrote:

What I'm thinking is that there should be a "send me this message" link in the archive, which gets you a copy as it was originally sent to the list. That lets you jump into a conversation as if you'd been there originally.

Another use case comes up when coming back from temporarily disabled delivery where you want to participate in an ongoing discussion. I've always dreamed of a ml-requ...@listdomain function that retransmits any messages in References to me. It's clear that MM has to delegate this to the archiver.

I dream of a 'vacation' setting where you could tell Mailman the start and end dates of your delivery stop and then those messages would just be forwarded to you (perhaps as a digest) upon your return. Almost exactly like what the US Post Office does IRL.

Something like this would be cool for another reason. Assuming you could trust the long term storage at the archive site (enough) it would eliminate the last reason why I locally archive any public mailing list messages.

... indicating your internet connection is by orders of magnitude better than mine :)

And yet, it's never enough! :)

To get on topic again: regarding address obfuscation in the archives, I noted:
- obfuscate by default,
- the archive admin may choose not to obfuscate but this fact will be stated clearly on every archive page à la: Email addresses are visible per choice of mailto:archiv-owner.

Yep, something like that.

-Barry
Re: [Mailman-Developers] Proposed: remove address-obfuscation code from Mailman 3
On Aug 29, 2009, at 3:01 AM, Stephen J. Turnbull wrote:

Barry Warsaw writes:

What I'm thinking is that there should be a "send me this message" link in the archive, which gets you a copy as it was originally sent to the list. That lets you jump into a conversation as if you'd been there originally.

I don't understand. Do you mean the raw message received by the list, or the processed message as distributed by the list? The former means you don't have RFC 2369 headers, etc. I'm not sure I understand what the efficacy of the latter is; does address-munging happen only in the archives? I find it hard to believe that could be at all effective, except for what I would think is an unusual case (a closed-subscription list with public archives).

Yes, address munging only happens in the HTML archives and in the outgoing queue processor. Mailman keeps a copy of the raw received message which for MM2 is only in the mbox file, but for MM3 will be in a message store.

Let's say I just joined the XEmacs development mailing list after a long absence. I find a message in the archive from two years ago that is relevant to an issue I'm having. I'd like to follow up to that message using my normal mail toolchain, but I found the archive page through Google. I should be able to click on a link on that page, enter my email address (perhaps through some validation dance, or subject to a request governor) and then the message -- as it was originally copied to the list membership -- would show up in my inbox, exactly as if I were a list member at the time. Now I can hit 'reply' and inject myself seamlessly into that 2 year old thread.
-Barry
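The "request governor" mentioned above is not specified anywhere in the thread. One plausible reading is a per-requester rate limit on the "send me this message" link, so that it cannot be used to flood a third party's inbox. The sketch below assumes a sliding-window limit; the class name, the default limits, and the injectable clock are all hypothetical and are not part of Mailman.

```python
import time
from collections import defaultdict, deque


class RequestGovernor:
    """Sliding-window limiter: at most `max_requests` per `window_seconds`
    for each requester address (illustrative defaults, not Mailman's)."""

    def __init__(self, max_requests=3, window_seconds=3600.0, clock=time.monotonic):
        self.max_requests = max_requests
        self.window_seconds = window_seconds
        self.clock = clock  # injectable so the behavior can be tested
        self._history = defaultdict(deque)

    def allow(self, requester: str) -> bool:
        """Record and permit the request, or refuse it if over the limit."""
        now = self.clock()
        history = self._history[requester.lower()]
        # Forget requests that have aged out of the window.
        while history and now - history[0] >= self.window_seconds:
            history.popleft()
        if len(history) >= self.max_requests:
            return False
        history.append(now)
        return True
```

A real deployment would presumably also need the "validation dance" (confirming that the requester controls the address), which this sketch does not attempt.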
Re: [Mailman-Developers] Proposed: remove address-obfuscation code from Mailman 3
On Aug 31, 2009, at 1:15 PM, C Nulk wrote:

I am pretty sure allowing the raw email addresses to be available is going to go over like a lead balloon here. Anything (however minor) to help protect the users'/clients' email addresses is helpful despite what others think. It is fine if someone considers the obfuscation that Mailman uses trivial; however, anything I can do to make it harder or more computationally expensive to get the email address is better than giving it away. Sure, bots are out there, but if what I do helps slow down someone's system enough to make them look at it (and hopefully get rid of the bot), then great. But at least give me the choice to be able to do it.

Agreed.

I happened to like Barry's (?) earlier comment about the "send me this message" link. Or maybe a "send my message to the original poster" link, where you can click on the link, compose your message, and send it through Mailman, all without the original sender's address. Mailman or whatever process can figure out the original sender and pass on your message. Yes, I know it is more work; that is why we have computers :)

The difficult part about the latter is that I hate web interfaces for reading/composing email (Gmail included). I want to use my mail reader for that!

As for using robots.txt, hmm, it is not the legitimate search engines I care about, it is the search engines/crawlers that do not respect my robots.txt file that I care about. If I had an effective way to consistently identify those non-legitimate crawlers, I would add what I needed to drop them into my firewall as I recognized them.

Agreed.
-Barry
Re: [Mailman-Developers] Proposed: remove address-obfuscation code from Mailman 3
Barry Warsaw wrote:

Let's say I just joined the XEmacs development mailing list after a long absence. I find a message in the archive from two years ago that is relevant to an issue I'm having. I'd like to follow up to that message using my normal mail toolchain, but I found the archive page through Google. I should be able to click on a link on that page, enter my email address (perhaps through some validation dance, or subject to a request governor) and then the message -- as it was originally copied to the list membership -- would show up in my inbox, exactly as if I were a list member at the time. Now I can hit 'reply' and inject myself seamlessly into that 2 year old thread.

As long as the mailing list name/address hasn't migrated/changed in the interim...

...perhaps the original message munged to ensure current accuracy of the to/cc/reply-to fields?

-Dale
Re: [Mailman-Developers] Proposed: remove address-obfuscation code from Mailman 3
On Aug 31, 2009, at 3:00 PM, Dale Newfield wrote:

Barry Warsaw wrote:

Let's say I just joined the XEmacs development mailing list after a long absence. I find a message in the archive from two years ago that is relevant to an issue I'm having. I'd like to follow up to that message using my normal mail toolchain, but I found the archive page through Google. I should be able to click on a link on that page, enter my email address (perhaps through some validation dance, or subject to a request governor) and then the message -- as it was originally copied to the list membership -- would show up in my inbox, exactly as if I were a list member at the time. Now I can hit 'reply' and inject myself seamlessly into that 2 year old thread.

As long as the mailing list name/address hasn't migrated/changed in the interim...

Good point.

...perhaps the original message munged to ensure current accuracy of the to/cc/reply-to fields?

Not sure I understand; can you elaborate?

-Barry
Re: [Mailman-Developers] Proposed: remove address-obfuscation code from Mailman 3
Barry Warsaw wrote:

On Aug 31, 2009, at 1:15 PM, C Nulk wrote:

I am pretty sure allowing the raw email addresses to be available is going to go over like a lead balloon here. Anything (however minor) to help protect the users'/clients' email addresses is helpful despite what others think. It is fine if someone considers the obfuscation that Mailman uses trivial; however, anything I can do to make it harder or more computationally expensive to get the email address is better than giving it away. Sure, bots are out there, but if what I do helps slow down someone's system enough to make them look at it (and hopefully get rid of the bot), then great. But at least give me the choice to be able to do it.

Agreed.

I happened to like Barry's (?) earlier comment about the "send me this message" link. Or maybe a "send my message to the original poster" link, where you can click on the link, compose your message, and send it through Mailman, all without the original sender's address. Mailman or whatever process can figure out the original sender and pass on your message. Yes, I know it is more work; that is why we have computers :)

The difficult part about the latter is that I hate web interfaces for reading/composing email (Gmail included). I want to use my mail reader for that!

Actually, I had more of a mailto-style link in mind that sends the message to the list (run by Mailman, naturally) and, as part of the body/subject, includes an encrypted form of the message id (providing it is unique). You would use your mail client to read/compose. Maybe something similar to a list's listname-bounces address but with the message id could be done. Don't know. Mailman would receive your message, decrypt the message id, look up the message, then forward your message to the original sender. I am not particularly fond of web interfaces for reading/composing email. Well, maybe when I travel overseas without a laptop, then it is minimally okay.
As for using robots.txt, hmm, it is not the legitimate search engines I care about, it is the search engines/crawlers that do not respect my robots.txt file that I care about. If I had an effective way to consistently identify those non-legitimate crawlers, I would add what I needed to drop them into my firewall as I recognized them. Agreed. -Barry Now, totally off-topic, anyone have a recommendation for a book on learning Python so I am no longer truly dangerous, just slightly. Thanks, Chris
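Chris's mailto idea (a per-message address that carries a protected form of the message id) could look roughly like the sketch below. Note the substitution: instead of encrypting the id, this signs it with an HMAC and base32-encodes the result, which stops forged tokens but does not hide the id itself. Real encryption would need a third-party library. The secret, the `-reply+` address shape, and all function names are hypothetical; nothing here is an existing Mailman feature.

```python
import base64
import hashlib
import hmac

SECRET = b"replace-with-a-site-secret"  # hypothetical per-site key
SIG_LEN = 10  # bytes of the HMAC kept, to keep the address short


def make_token(message_id: str) -> str:
    """Encode a Message-ID plus a truncated HMAC as a lowercase base32 token."""
    raw = message_id.encode("utf-8")
    sig = hmac.new(SECRET, raw, hashlib.sha256).digest()[:SIG_LEN]
    return base64.b32encode(sig + raw).decode("ascii").rstrip("=").lower()


def read_token(token: str):
    """Return the Message-ID if the token verifies, otherwise None."""
    padded = token.upper() + "=" * (-len(token) % 8)
    try:
        blob = base64.b32decode(padded)
    except ValueError:  # binascii.Error is a ValueError subclass
        return None
    sig, raw = blob[:SIG_LEN], blob[SIG_LEN:]
    expected = hmac.new(SECRET, raw, hashlib.sha256).digest()[:SIG_LEN]
    if not hmac.compare_digest(sig, expected):
        return None
    return raw.decode("utf-8")


def reply_address(list_name: str, domain: str, message_id: str) -> str:
    """Build the hypothetical per-message relay address for a mailto link."""
    return f"{list_name}-reply+{make_token(message_id)}@{domain}"
```

One practical caveat: long Message-IDs can push the local part past the 64-character limit, so a real design might store a short lookup key instead of embedding the id.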
Re: [Mailman-Developers] Proposed: remove address-obfuscation code from Mailman 3
Barry Warsaw wrote:

Now I can hit 'reply' and inject myself seamlessly into that 2 year old thread.

As long as the mailing list name/address hasn't migrated/changed in the interim...

Good point.

...perhaps the original message munged to ensure current accuracy of the to/cc/reply-to fields?

Not sure I understand; can you elaborate?

We can tell from a mailing list's configuration what the distribution address should be, but I guess we don't know what previous addresses it had, so it's not as simple as I was thinking to do this munging (I was thinking just a search/replace). Maybe the appropriate modifications from the original message would be to add as a To address the current list address iff it does not appear in the To or CC addresses in the archived message (and to reset Reply-To, if reply-to-munging is set).

-Dale
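Dale's rule (add the current list address to To only if it is in neither To nor Cc, and reset Reply-To when reply-to munging is on) is small enough to sketch with Python's standard `email` package. The function name and the `munge_reply_to` flag are hypothetical stand-ins for whatever the list configuration would actually provide.

```python
from email import message_from_string
from email.utils import getaddresses


def retarget(msg, list_address: str, munge_reply_to: bool = False):
    """Make an archived message replyable via the list's current address."""
    recipients = getaddresses(msg.get_all("To", []) + msg.get_all("Cc", []))
    known = {addr.lower() for _name, addr in recipients}
    if list_address.lower() not in known:
        # Append the current list address, keeping the original recipients.
        existing = msg.get_all("To", [])
        del msg["To"]
        msg["To"] = ", ".join(existing + [list_address])
    if munge_reply_to:
        # Mirror the list's reply-to munging setting (assumed boolean here).
        del msg["Reply-To"]
        msg["Reply-To"] = list_address
    return msg
```

For example, `retarget(message_from_string(raw_text), "new-list@example.org")` leaves a message untouched when the current address is already a recipient.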
Re: [Mailman-Developers] Proposed: remove address-obfuscation code from Mailman 3
On Aug 31, 2009, at 4:48 PM, David Champion wrote:

I'm going to embrace and extend something Barry suggested in private mail. He suggested a list setting that permits signed-in list subscribers to download raw archives if they have some 'archive-approved' status. What if that is a three-way switch: approved, unapproved, and blacklisted?

New subscribers would always be unapproved. An unapproved subscriber who successfully posted to the list, clearing any approval mechanisms in place and subject to a list configuration option, would get approved for raw archive access. (Automatic posting-equals-approval would not be desirable for all lists, but it would for many.) An approved user could be blacklisted by moderator action or by an automated moderation filter. Coming off blacklist status would require manual action by the moderator. And there could be a form in the application to request approval or de-blacklisting, of course.

Launchpad's mailing lists have a very similar concept, although it's not used for access to the archives. The concept there is called "standing" and currently has four levels: excellent, good, poor, and unknown. You start out with unknown standing, but after you prove yourself (in much the same way as you describe above), you get to be in good standing, which gives you other benefits, such as being able to email a list you're not on without moderation. You can't get to excellent standing on your own and there are currently no benefits of excellent over good standing. Poor standing is much like your blacklist idea.

The way I look at it is that Launchpad prototyped this concept and I do think it could be useful in Mailman itself.
-Barry
Re: [Mailman-Developers] Proposed: remove address-obfuscation code from Mailman 3
* On 31 Aug 2009, Barry Warsaw wrote:

Mailman will always still collect the raw data for messages sent to the list. There are legitimate uses for allowing outsiders access to that data (say, the list is moving and you want to migrate the archives), so I think we always want to support this. The question is how much, if any, of the raw data does the general public get access to?

It seems clear that there are legitimate use cases for raw archives, so I'll skip the justifications and just address how we can balance between transparency and security.

I'm going to embrace and extend something Barry suggested in private mail. He suggested a list setting that permits signed-in list subscribers to download raw archives if they have some 'archive-approved' status. What if that is a three-way switch: approved, unapproved, and blacklisted?

New subscribers would always be unapproved. An unapproved subscriber who successfully posted to the list, clearing any approval mechanisms in place and subject to a list configuration option, would get approved for raw archive access. (Automatic posting-equals-approval would not be desirable for all lists, but it would for many.) An approved user could be blacklisted by moderator action or by an automated moderation filter. Coming off blacklist status would require manual action by the moderator. And there could be a form in the application to request approval or de-blacklisting, of course.

--
-D. d...@uchicago.edu NSIT University of Chicago
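The three-way switch described here is essentially a small state machine. The sketch below is one possible reading, not a design anyone in the thread committed to. Two points are assumptions beyond the text: a moderation flag blacklists an unapproved subscriber as well as an approved one, and a reinstated subscriber returns to unapproved rather than approved. All names are hypothetical.

```python
from enum import Enum


class Standing(Enum):
    UNAPPROVED = "unapproved"    # every new subscriber starts here
    APPROVED = "approved"        # may download raw archives
    BLACKLISTED = "blacklisted"  # only a moderator can lift this


def on_successful_post(standing: Standing, auto_approve: bool) -> Standing:
    """A post that cleared moderation approves an unapproved subscriber,
    but only if the list has opted in to posting-equals-approval."""
    if standing is Standing.UNAPPROVED and auto_approve:
        return Standing.APPROVED
    return standing


def on_moderation_flag(standing: Standing) -> Standing:
    """Moderator action or an automated filter blacklists the subscriber.
    (The thread only mentions approved users; covering unapproved ones
    too is an assumption of this sketch.)"""
    return Standing.BLACKLISTED


def on_moderator_reinstate(standing: Standing) -> Standing:
    """Manual de-blacklisting; returning to UNAPPROVED is an assumption."""
    if standing is Standing.BLACKLISTED:
        return Standing.UNAPPROVED
    return standing


def may_download_raw_archive(standing: Standing) -> bool:
    return standing is Standing.APPROVED
```

Keeping the transitions as plain functions makes it easy to see that no automatic path leads out of the blacklisted state, which matches the requirement that de-blacklisting be a manual step.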
Re: [Mailman-Developers] Proposed: remove address-obfuscation code from Mailman 3
Barry Warsaw writes:

Let's say I just joined the XEmacs development mailing list after a long absence.

Hey, welcome back! Do you plan to return to Supercite maintenance? <wink>

I find a message in the archive from two years ago that is relevant to an issue I'm having. I'd like to follow up to that message using my normal mail toolchain, but I found the archive page through Google.

Sure, that's a valid use case. I'm not sure that it couldn't be handled by an appropriate mailto URL, though. And I suspect it's less common than the case of private messages (no evidence, just introspection).
Re: [Mailman-Developers] Proposed: remove address-obfuscation code from Mailman 3
Barry Warsaw writes:

What I'm thinking is that there should be a "send me this message" link in the archive, which gets you a copy as it was originally sent to the list. That lets you jump into a conversation as if you'd been there originally.

I don't understand. Do you mean the raw message received by the list, or the processed message as distributed by the list? The former means you don't have RFC 2369 headers, etc. I'm not sure I understand what the efficacy of the latter is; does address-munging happen only in the archives? I find it hard to believe that could be at all effective, except for what I would think is an unusual case (a closed-subscription list with public archives).
Re: [Mailman-Developers] Proposed: remove address-obfuscation code from Mailman 3
On Wed, Aug 26, 2009 at 10:57:06AM +0100, Ian Eiloart wrote:

There's recently published research which suggests that simple obfuscation can be effective. Concealment, presumably, is more effective. At http://www.ceas.cc/ you can download "Spamology: A Study of Spam Origins" http://www.ceas.cc/papers-2009/ceas2009-paper-18.pdf

I'm composing a combined reply to all of the comments here, but wish to reply to this single point separately.

This paper seems well-intentioned, but has some very serious problems -- any one of which is sufficient to dismiss its conclusions entirely. Let me just enumerate a few of them; I'll spare you the entire list.

1. The authors presume that they can tell that an address has been harvested *and* added to at least one spammer database (or not) by observing spam sent to it. But that's wrong: we know that many addresses are harvested and never spammed, or not spammed for a very long time (as in years). Conversely, many addresses are spammed that have *never* been harvested. And some addresses that are harvested are spammed, but not because they were harvested. [1] And some addresses are picked up by routine/ordinary web crawlers, and then subsequently spammed, but not by the people running those crawlers. [2] This invalidates their measurement technique.

2. There's a major methodology error here: "We began by registering a dedicated domain for this project, which we hosted on servers in our department." We know that some spammers -- the competent ones, who are the ones that matter -- use suppression lists based not just on domains, but TLDs, IP addresses, network allocations, ASNs, NS records, MX records, etc. We further know that anything tracing to a .edu or a network allocation/ASN associated with a .edu is quite likely to appear on those suppression lists. (This is an old tradition among spammers. Not all of them follow it, but quite a few do.) This also invalidates their measurement technique.

3. Statistics from any single domain are often wildly skewed one way or another. For example: I happen to host three domains which have the same name, but in three different TLDs. Everything else about them is exactly the same: NS, MX, web content, valid email addresses, etc. The spam they receive varies over three orders of magnitude.

4. And then there's this: it doesn't cover use of the single largest current vector for address harvesting -- zombied systems. No discussion of contemporary address harvesting techniques can even be begun without considering this. It's like writing a paper on tides without factoring in the moon's gravitation. [3] (I checked to see if perhaps this paper's publication predated the rise of the zombies earlier this decade, but it's from 2009.)

To put it another way: yes, there are still address harvesters using the techniques that these researchers were looking for. But these harvesters are outdated and unimportant; they're only used by spammers who don't have the expertise and resources to do better. And not only is that class of spammer steadily shrinking, that's NOT the class of spammer we need to worry about, as it's quite easy to block just about all their traffic whether they have valid addresses or not. (C'mon, these are people who can't decode rskATgsp.org, do you really think they constitute a serious threat?)

So like I said above, I'll spare you points 5-N, but they're similar. None of what I've said here is new or novel: it's common knowledge among experienced people working in the field. I think perhaps in the future that people trying to conduct this kind of research should spend a few years reading spam-l and other similar lists before diving in.

The bottom line is that (a) the numbers they've produced have no meaning and (b) their conclusions are all wrong.
Notes: [1] As an example: conside j...@example.com, and let's suppose that it's been deliberately exposed to one method of harvesting because it's published at http://www.example.com. If spam arrives, then it may be because the address was harvested by a web crawler and added to a spammer database -- or it may be because joe is very common LHS string and thus one that spammers are very likely to try in *any* domain. Note that while spammers' list of such likely LHS were quite limited years ago, they're not any more: spammers now have the resources to try all known and all plausible LHS strings if they wish. And they are: check your logs. You may be surprised at which LHS strings are being tried: what was computationally infeasible a decade ago is now routine. [2] It's not difficult to figure out who's running a web crawler: just setting up a web site, making sure it's linked to, waiting, and then analyzing logs will reveal a candidate list. It's somewhat more work to figure out which of those crawler operations can be broken into, but it has significant advantages: it allows one to mine all their data without the expense/hassle of
Re: [Mailman-Developers] Proposed: remove address-obfuscation code from Mailman 3
On Aug 25, 2009, at 8:30 AM, Stephen J. Turnbull wrote:
>> 2) is more interesting. What kinds of uses are we talking about? You see a message in an archive from three years ago and you want to contact the OP about it? Why not just follow up and contact the mailing list?
>
> For all the reasons why Reply-To Munging Considered Harmful.

What I'm thinking is that there should be a "send me this message" link in the archive, which gets you a copy as it was originally sent to the list. That lets you jump into a conversation as if you'd been there originally.

Something like this would be cool for another reason. Assuming you could trust the long term storage at the archive site (enough) it would eliminate the last reason why I locally archive any public mailing list messages.

>> Do you want to be contacted off-list for on-list topics? Well, things like an email forwarding service could solve that, although I think it's not worth the effort as much as the first use case. What other kinds of legitimate third party uses does obfuscation/concealment prevent?
>
> Obfuscation is a minor annoyance, but concealment is problematic in cases where the email is the identity, eg, matching list posts to issue tracker IDs. For example, I signed up for and log in to Launchpad as step...@xemacs.org, but I have to tell bzr that my ID is stephen-xemacs. Wow, that's transparent. But at least it's guessable. Getting from "Stephen J. Turnbull email concealed" to stephen-xemacs is not going to be easy if you don't already know me.

True.

-Barry

___
Mailman-Developers mailing list
Mailman-Developers@python.org
http://mail.python.org/mailman/listinfo/mailman-developers
Mailman FAQ: http://wiki.list.org/x/AgA3
Searchable Archives: http://www.mail-archive.com/mailman-developers%40python.org/
Unsubscribe: http://mail.python.org/mailman/options/mailman-developers/archive%40jab.org
Security Policy: http://wiki.list.org/x/QIA9
Re: [Mailman-Developers] Proposed: remove address-obfuscation code from Mailman 3
Something else that occurs to me. If we accept that obfuscation is worthless and stop doing it, then there's no reason we shouldn't make the raw mbox files available for anyone to download. Mailman used to do this, but we removed the feature due to user outcry. Now you can download the gzip monthly .txt files, but they are sanitized.

If we stop obfuscating, is there any reason not to make the raw messages available for download?

-Barry
Re: [Mailman-Developers] Proposed: remove address-obfuscation code from Mailman 3
That's the logical progression of that argument, and it is a good reason why obfuscation, or even removal of parts, is not only a good idea but a necessity. Exposing raw email addresses in their normal form is real low-hanging fruit.

Regardless of what I think, my clients will cry bloody murder if emails leak out. I had one person recently google their email address, and found a link to an archive file that should have been private. I had removed all links to the archives, but somehow Google found it, indexed it, and the guy threatened me with bloody murder if I didn't take it down. Sheesh.

Bob

-- Original Message ---
From: Barry Warsaw ba...@python.org
To: Rich Kulawiec r...@gsp.org
Cc: mailman-developers@python.org
Sent: Fri, 28 Aug 2009 21:46:01 -0400
Subject: Re: [Mailman-Developers] Proposed: remove address-obfuscation code from Mailman 3

> Something else that occurs to me. If we accept that obfuscation is worthless and stop doing it, then there's no reason we shouldn't make the raw mbox files available for anyone to download. Mailman used to do this, but we removed the feature due to user outcry. Now you can download the gzip monthly .txt files, but they are sanitized.
>
> If we stop obfuscating, is there any reason not to make the raw messages available for download?
>
> -Barry

--- End of Original Message ---
Re: [Mailman-Developers] Proposed: remove address-obfuscation code from Mailman 3
> the archives, but somehow Google found it, indexed it, and the guy threatened me with bloody murder if I didn't take it down.

Yes. It is critical to keep user perception in mind. Specifically, if you don't keep email addresses off the global search engines, there will be a deluge of vocal complaints from users who neither care about nor understand the technical aspects. That can be as simple as robots.txt configuration, or as fancy as using a captcha-based system to reveal addresses, like the one offered by reCaptcha.

But my main point is you need to cover the user perception angle almost independently from the core technical aspects of anti-harvesting.

For the record, I prefer keeping data as unadulterated as possible because it helps interoperability. But we also need to keep users happy.

-Jeff
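To make the robots.txt option concrete, here is a minimal sketch in Python. The /pipermail/ path prefix, the lists.example.org host, and the helper name may_crawl are assumptions chosen for illustration, not something specified in this thread; adjust them to wherever the archives are actually served. The sketch uses the standard library's urllib.robotparser to show how a compliant crawler would read such a rule.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content: ask all crawlers to stay out of the
# public archive tree (assumed here to live under /pipermail/).
ROBOTS_TXT = """\
User-agent: *
Disallow: /pipermail/
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())


def may_crawl(url, agent="Googlebot"):
    """Return True if a crawler that honors robots.txt may fetch url."""
    return parser.can_fetch(agent, url)


if __name__ == "__main__":
    # Hypothetical URLs, for illustration only.
    print(may_crawl("http://lists.example.org/pipermail/mylist/2009-August/000123.html"))
    print(may_crawl("http://lists.example.org/mailman/listinfo/mylist"))
```

Note that robots.txt is purely advisory. It keeps well-behaved search engines out, which speaks to the user-perception problem, but it does nothing against a harvester that simply ignores it.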
Re: [Mailman-Developers] Proposed: remove address-obfuscation code from Mailman 3
On Fri, Aug 28, 2009 at 18:03 -0400, Barry Warsaw wrote:
> What I'm thinking is that there should be a "send me this message" link in the archive, which gets you a copy as it was originally sent to the list. That lets you jump into a conversation as if you'd been there originally.

Another use case comes up when coming back from temporarily disabled delivery where you want to participate in an ongoing discussion. I've always dreamed of a ml-requ...@listdomain function that retransmits any messages in References to me. It's clear that MM has to delegate this to the archiver.

> Something like this would be cool for another reason. Assuming you could trust the long term storage at the archive site (enough) it would eliminate the last reason why I locally archive any public mailing list messages.

... indicating your internet connection is by orders of magnitude better than mine :)

To get on topic again: regarding address obfuscation in the archives, I noted:
- obfuscate by default,
- the archive admin may choose not to obfuscate but this fact will be stated clearly on every archive page à la: Email addresses are visible per choice of mailto:archiv-owner.

Regards
Siggy

--
O ascii ribbon campaign - stop html mail - www.asciiribbon.org
48 days until www.Ubucon.de | Open Source in Northern Germany: www.free-it.org
tech contact: bsb-at-free-dash-it-dot-de
ceterum censeo javascriptum esse restrictam
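The "retransmit any messages in References" idea can be sketched roughly as follows. This is only an illustration under stated assumptions: the function names and the archive_index mapping (Message-ID to stored raw message) are hypothetical, and nothing here is an existing Mailman or archiver API. It uses the standard library's email package to read the threading headers of the request.

```python
import re
from email import message_from_string

# A Message-ID as it appears in threading headers: <something>, no spaces.
MSGID_RE = re.compile(r"<[^<>\s]+>")


def referenced_ids(raw_message):
    """Return Message-IDs from References and In-Reply-To, in order, deduplicated."""
    msg = message_from_string(raw_message)
    headers = msg.get_all("References", []) + msg.get_all("In-Reply-To", [])
    return list(dict.fromkeys(MSGID_RE.findall(" ".join(headers))))


def messages_to_resend(raw_message, archive_index):
    """Look up each referenced message in a hypothetical archive index.

    archive_index is assumed to map Message-ID strings to stored raw
    messages; IDs the archive does not hold are silently skipped.
    """
    return [archive_index[mid]
            for mid in referenced_ids(raw_message)
            if mid in archive_index]
```

A request handler on the archiver side would then mail the returned messages to the requester, which fits the point that the list manager would have to delegate this to the archiver.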
Re: [Mailman-Developers] Proposed: remove address-obfuscation code from Mailman 3
--On 25 August 2009 21:02:01 + Julian Mehnle jul...@mehnle.net wrote:
> Bob Puff wrote:
>> You are presuming too much on spammers as a whole. I've dealt with a couple spammers, and they just used some tools they got online that search for usern...@domain.something. Everything else is ignored. I don't for a minute doubt that the advanced spammers will snag anything and everything no matter how strange it is obfusticated (sp?). But there are a LOT of low-tech spammers still out there, and there is enough low hanging fruit for them that this little bit we are discussing can be over their head.
>
> It's not. Spammers usually don't do address harvesting themselves nowadays, but outsource it to botnets (just like they outsource the spamming itself to botnets) that are running kind of off the shelf software tailored to the task. Today, as a spammer you go out and buy those services in online shops, paying by credit card. And parsing localpart at domain is among the most trivial things current harvester modules do.
>
> Any wanna-be spammers who still run their garage business with self written tools are pretty much meaningless in terms of magnitude.
>
> If anything, this kind of obfuscation is an inconvenience to legitimate users, but certainly not to spammers.
>
> -Julian

There's recently published research which suggests that simple obfuscation can be effective. Concealment, presumably, is more effective.

At http://www.ceas.cc/ you can download Spamology: A Study of Spam Origins http://www.ceas.cc/papers-2009/ceas2009-paper-18.pdf

They say

"Surprisingly, even simple email obfuscation approaches are still sufficient today to prevent spammers from harvesting emails."

and

"Commonly-used email obfuscation techniques are offering protection (for now). It is common practice to replace the conventional @ in email addresses by an AT in order to defeat email harvesting. We found that the spammers are still not parsing simple obfuscations as of now. However, one should not count on the protection offered by such simple obfuscation schemes, for they are trivial to defeat."

Of course, list posts hang around for a long time, and may be mirrored (eg by Google caching). Therefore, concealment seems more sensible than obfuscation. Perhaps a captcha could be used to reveal sender addresses, for example.

The paper might be more interesting for its discussion of techniques for detecting (eg with honeypots) and defeating harvesters.

--
Ian Eiloart
IT Services, University of Sussex
01273-873148 x3148
For new support requests, see http://www.sussex.ac.uk/its/help/
Re: [Mailman-Developers] Proposed: remove address-obfuscation code from Mailman 3
On Aug 25, 2009, at 1:35 AM, Stephen J. Turnbull wrote:
> Rich Kulawiec writes:
>> Pretending that address obfuscation in mailing list [or newsgroup] archives will have any meaningful effect on this process gives users a false sense of security and has zero anti-spam value.
>
> You're missing the point. Our (often non-technical) users demand this feature. Even our technical audience (see Siggy's parallel post for example) perceives benefits from obfuscation, based on empirical tests.
>
> So you can explain why, in theory and in practice, obfuscation doesn't work. But the user base will (stubbornly, if you like) refuse to accept your logic.

As usual, Stephen hits the nail on the head. I can't disagree with much in Rich's post, and yet it's likely that we'll still obfuscate and/or conceal email addresses in the archives because users will demand it. You can and should educate them, but this is not a battle I wish to fight because I think we can't win it.

The costs of obfuscation are 1) increased code complexity; 2) denying legitimate third party uses.

1) is not insignificant. Regexp filters are tricky/impossible to get 100% right, but not too bad to get maybe 90% right. They are low fidelity because scanning headers isn't enough; people embed email addresses in all kinds of weird places in the body and HTML filtering is brain hurty. Obfuscation techniques will be busted so only concealment is future proof. This is all pretty boring coding though.

2) is more interesting. What kinds of uses are we talking about? You see a message in an archive from three years ago and you want to contact the OP about it? Why not just follow up and contact the mailing list? IOW, if there was an easy way to inject yourself into an old thread, perhaps one that was created before you joined the list, wouldn't that cover a large part of the use case?

Do you want to be contacted off-list for on-list topics? Well, things like an email forwarding service could solve that, although I think it's not worth the effort as much as the first use case. What other kinds of legitimate third party uses does obfuscation/concealment prevent?

-Barry
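To illustrate cost 1), here is a minimal sketch of the kind of regexp filter under discussion. It is not Mailman's actual scrubbing code: the pattern, the function names obfuscate and conceal, and the placeholder text are all assumptions made up for this example. The pattern is deliberately simple, which is exactly why it lands in "maybe 90% right" territory.

```python
import re

# Hypothetical, deliberately simple pattern. It misses quoted local parts,
# addresses written with HTML entities, addresses split across markup,
# and plenty of other oddities found in real message bodies.
ADDRESS_RE = re.compile(
    r"\b([A-Za-z0-9._%+-]+)@([A-Za-z0-9-]+(?:\.[A-Za-z0-9-]+)+)\b"
)


def obfuscate(text):
    """Rewrite 'user@example.com' as 'user at example.com' (reversible)."""
    return ADDRESS_RE.sub(r"\1 at \2", text)


def conceal(text, placeholder="(address concealed)"):
    """Replace each matched address outright (not reversible)."""
    return ADDRESS_RE.sub(placeholder, text)


if __name__ == "__main__":
    print(obfuscate("Mail joe@example.com now"))
    print(conceal("From: joe@example.com"))
    # An address written with an HTML entity slips straight through:
    print(obfuscate("joe&#64;example.com"))
```

The last call shows the fidelity problem: an address encoded as an HTML entity contains no literal @, so the filter leaves it untouched even though a browser will render it as a normal address.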
Re: [Mailman-Developers] Proposed: remove address-obfuscation code from Mailman 3
Barry Warsaw writes:
> 2) is more interesting. What kinds of uses are we talking about? You see a message in an archive from three years ago and you want to contact the OP about it? Why not just follow up and contact the mailing list?

For all the reasons why Reply-To Munging Considered Harmful.

> Do you want to be contacted off-list for on-list topics? Well, things like an email forwarding service could solve that, although I think it's not worth the effort as much as the first use case. What other kinds of legitimate third party uses does obfuscation/concealment prevent?

Obfuscation is a minor annoyance, but concealment is problematic in cases where the email is the identity, eg, matching list posts to issue tracker IDs. For example, I signed up for and log in to Launchpad as step...@xemacs.org, but I have to tell bzr that my ID is stephen-xemacs. Wow, that's transparent. But at least it's guessable. Getting from "Stephen J. Turnbull email concealed" to stephen-xemacs is not going to be easy if you don't already know me.
Re: [Mailman-Developers] Proposed: remove address-obfuscation code from Mailman 3
Bob Puff wrote:
> You are presuming too much on spammers as a whole. I've dealt with a couple spammers, and they just used some tools they got online that search for usern...@domain.something. Everything else is ignored. I don't for a minute doubt that the advanced spammers will snag anything and everything no matter how strange it is obfusticated (sp?). But there are a LOT of low-tech spammers still out there, and there is enough low hanging fruit for them that this little bit we are discussing can be over their head.

It's not. Spammers usually don't do address harvesting themselves nowadays, but outsource it to botnets (just like they outsource the spamming itself to botnets) that are running kind of off the shelf software tailored to the task. Today, as a spammer you go out and buy those services in online shops, paying by credit card. And parsing localpart at domain is among the most trivial things current harvester modules do.

Any wanna-be spammers who still run their garage business with self written tools are pretty much meaningless in terms of magnitude.

If anything, this kind of obfuscation is an inconvenience to legitimate users, but certainly not to spammers.

-Julian
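To show how little work "parsing localpart at domain" takes, here is a rough sketch in Python. It is an illustration only: the separator list and the name recover_addresses are assumptions invented for this example, and it says nothing about what any real harvester actually ships. It is also crude enough to produce false positives on ordinary prose.

```python
import re

# Hypothetical separator variants; a real tool would presumably know more.
SEPARATOR = r"(?:@|\s+at\s+|\s*\(at\)\s*|\s*\[at\]\s*|AT)"

OBFUSCATED_RE = re.compile(
    r"([A-Za-z0-9._%+-]+)" + SEPARATOR + r"([A-Za-z0-9-]+(?:\.[A-Za-z0-9-]+)+)"
)


def recover_addresses(text):
    """Return addresses found in text, with simple obfuscation undone."""
    return [f"{local}@{domain}" for local, domain in OBFUSCATED_RE.findall(text)]


if __name__ == "__main__":
    print(recover_addresses("contact rskATgsp.org or joe at example.com"))
```

A dozen lines are enough to undo the common "at" and "AT" substitutions, which is the sense in which this kind of obfuscation costs legitimate readers more than it costs anyone running packaged harvesting software.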