Fighting Spam with Python
Are you as mad about spam as I am? Are you frustrated with the pessimism and lack of progress these last two years? Do you have faith that an open-source project can do better than the big companies competing for a lock-in solution? If so, you might be interested in the Open-Mail project.

I'm writing some scripts to check incoming mail against a registry of reputable senders, using the new authentication methods. Python is ideal for this because it will give mail-system admins the ability to experiment with the different methods, and provide some real-world feedback sorely needed by the advocates of each method. So far, we have SPF and CSV. See http://purl.net/macquigg/email/python for the latest project status.

I welcome anyone who is interested in helping, especially if you have some experience with mail transfer programs, like Sendmail or Postfix, or spam filtering programs, like SpamAssassin. My Python may not be the best, so I welcome suggestions there also. We need to make these scripts a model of clarity.

-- Dave
-- http://mail.python.org/mailman/listinfo/python-list
Re: Fighting Spam with Python
On Thu, 25 Aug 2005 10:18:37 -0400, Peter Hansen <[EMAIL PROTECTED]> wrote:

>David MacQuigg wrote:
>> Are you as mad about spam as I am? Are you frustrated with the
>> pessimism and lack of progress these last two years? Do you have
>> faith that an open-source project can do better than the big companies
>> competing for a lock-in solution? If so, you might be interested in
>> the Open-Mail project.
>>
>> I'm writing some scripts to check incoming mail against a registry of
>> reputable senders, using the new authentication methods. Python is
>> ideal for this because it will give mail-system admins the ability to
>> experiment with the different methods, and provide some real-world
>> feedback sorely needed by the advocates of each method. So far, we
>> have SPF and CSV. See http://purl.net/macquigg/email/python for the
>> latest project status.
>
>You might find www.spambayes.org of interest, in several ways.

Integration of a good spam filter is one of our top priorities. Spambayes looks like a good candidate. The key new features needed in a spam filter are the ability to extract the sender's identity (not that of the latest forwarder), and to factor into the spam score the reputation of that identity. We could use some help on this integration.

I guess I should have said a little more about the Open-Mail project. We are not focused on developing new authentication or filtering methods, but rather on providing a platform that will bring these pieces together and allow the mail admin to choose which methods are used and in what order. Interoperability has been the main barrier to wide-scale use of authentication. Python is superb at gluing these pieces together.

In the flow we envision, the spam filter is the final process, used only on the 5% that is hard to classify. 80% will get an immediate reject. 15% will get an immediate accept without filtering, because the sender is authenticated and has a good reputation. Eventually, all reputable senders will join the 15%, and the 5% will shrink to where we can ignore it.

-- Dave
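The flow described in this message (immediate reject, immediate accept for authenticated senders with a good reputation, statistical filter only for the remainder) could be sketched roughly as follows. This is an illustration only, not Open-Mail code: the names `Verdict`, `check_blacklist`, `check_reputation`, and `classify`, the message dictionary keys, and the sample data are all hypothetical. Each check either decides or returns `None` to defer, and the admin controls the order by reordering the list.

```python
from enum import Enum
from typing import Callable, Optional


class Verdict(Enum):
    REJECT = "reject"   # refuse immediately
    ACCEPT = "accept"   # deliver without statistical filtering
    FILTER = "filter"   # undecided: hand over to the spam filter


# Hypothetical message representation: a dict holding the connecting IP and
# the sender identity that some authentication method (SPF, CSV, ...) verified.
Message = dict
Check = Callable[[Message], Optional[Verdict]]

BLACKLISTED_IPS = {"192.0.2.66"}     # toy data (assumption)
GOOD_REPUTATION = {"example.org"}    # toy data (assumption)


def check_blacklist(msg: Message) -> Optional[Verdict]:
    """Reject mail from a blacklisted IP; otherwise defer."""
    if msg.get("client_ip") in BLACKLISTED_IPS:
        return Verdict.REJECT
    return None


def check_reputation(msg: Message) -> Optional[Verdict]:
    """Accept mail whose authenticated identity has a good reputation."""
    if msg.get("authenticated_identity") in GOOD_REPUTATION:
        return Verdict.ACCEPT
    return None


def classify(msg: Message, checks: list) -> Verdict:
    """Run the checks in the admin's chosen order; the first decision wins."""
    for check in checks:
        verdict = check(msg)
        if verdict is not None:
            return verdict
    return Verdict.FILTER  # nothing decided: this is the hard-to-classify mail
```

A mail admin would experiment by changing the list passed to `classify`, for example `[check_blacklist, check_reputation]`, without touching the checks themselves.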
Re: Fighting Spam with Python
On Thu, 25 Aug 2005 13:22:53 -0400, François Pinard wrote:

>[David MacQuigg]
>
>> The key new features needed in a spam filter are the ability to
>> extract the sender's identity (not that of the latest forwarder), and
>> to factor into the spam score the reputation of that identity.
>
>This will only work if your system is immune to forgeries, while being
>largely widespread.

Stopping forgery is what the new authentication methods are all about. Getting these methods widely and effectively used is our big challenge, and one that I hope to accomplish with my efforts. There are a bunch of pieces that need to work together more smoothly. That's where Python comes in. There are some challenging constraints, like the system has to work without government regulation.

I've got a first draft of a website for open-mail.org, temporarily at
http://purl.net/macquigg/email/registry
Suggestions are welcome.

>> In the flow we envision, the spam filter is the final process, used
>> only on the 5% that is hard to classify. 80% will get an immediate
>> reject. 15% will get an immediate accept without filtering, because
>> the sender is authenticated and has a good reputation. Eventually,
>> all reputable senders will join the 15%, and the 5% will shrink to
>> where we can ignore it.
>
>It's fun to read statistics about a vision! :-)

The 80% is real. See http://messagelabs.com/emailthreats

As to how the remaining 20% will split, that's a guess, but one that I think is realistic. See http://www.spamhaus.org/effective_filtering.html for comparable numbers using only IP blacklists and spam filtering. The 5% still needing filtering will be those senders that don't offer any authentication or that authenticate with an identity that has not yet acquired a reputation.

>> >You might find www.spambayes.org of interest, in several ways.
>
>Spambayes is surprisingly good as it already stands.

I haven't used Spambayes, but my experience with Spamnix (an offshoot of SpamAssassin) is that statistical filters always have a few false rejects. In my case, that's about two per week. The solution to this problem is a reliable system allowing receivers to determine the identity and reputation of an unknown sender. Then we can safely ignore the spam.

-- Dave
Re: Fighting Spam with Python
On Fri, 26 Aug 2005 10:36:28 -0400, François Pinard <[EMAIL PROTECTED]> wrote:

>[David MacQuigg]
>
>> Getting these methods widely and effectively used is our big
>> challenge, and one that I hope to accomplish with my efforts.
>
>I wish one of these methods, either yours or one of these few others
>which were developed and proposed in the recent years, will succeed.

I don't have a method, and that is a key part of the strategy. The Registry is intended to support all methods. My main technical contribution, if you can call it that, is to figure out how we can tie these methods into a system where not all participants are using the same method. (An inter-operability protocol, if you need a fancy name.)

>It might be useful, for someone involved like you are (thanks for all of
>us!), that you make a survey of those others, trying to understand why
>they failed to acquire popularity, not repeating the same errors if any.

The main reason for the current failure is that the effort to achieve a common authentication standard has degenerated into a war.

I did try to find information on other attempts at setting up a Registry/Clearinghouse of reputation information. There has been an effort by Spamhaus to establish such a registry, but they were counting on senders to support it. That seems to me a fatal flaw. Our plans are to have *receivers* support the registry via subscription fees. Senders will need an incentive, and that will be provided by receivers who use the Registry to clear reputable mail, and send the rest to a spam filter.

There are also some successful proprietary systems, like IronPort Senderbase, that I think are similar, but I don't know the details. You have to pay them big bucks for a "spam appliance".

-- Dave
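One way to picture a registry that supports several authentication methods at once, with participants who do not all use the same one, is sketched below. This is a guess at the general idea, not the Registry's actual design: `RegistryEntry`, `REGISTRY`, `pick_method`, and the sample domains are hypothetical. The sender's entry lists the methods it publishes, and the receiver uses the first of those that it also implements.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class RegistryEntry:
    methods: tuple      # methods the sender publishes, in the sender's preferred order
    reputation: str     # e.g. "good", "bad", or "unknown" (assumed rating scale)


# Toy registry contents (assumption): domain -> entry.
REGISTRY = {
    "example.org": RegistryEntry(methods=("SPF", "CSV"), reputation="good"),
    "example.net": RegistryEntry(methods=("CSV",), reputation="unknown"),
}


def pick_method(sender_domain: str, receiver_methods: set) -> Optional[str]:
    """Return the first method both sides support, or None if there is none."""
    entry = REGISTRY.get(sender_domain)
    if entry is None:
        return None  # sender is not registered at all
    for method in entry.methods:
        if method in receiver_methods:
            return method
    return None  # registered, but no method in common
```

A receiver that implements only CSV could still authenticate mail from `example.org`, even though that sender prefers SPF, which is the kind of interoperability the message describes.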
Re: Licensing and Other Questions
On Sat, 27 Aug 2005 01:35:58 +0300, Christos Georgiou <[EMAIL PROTECTED]> wrote:

>Your method is/will_not be free (as in beer), as hinted in
>http://www.ece.arizona.edu/~edatools/home/email/registry/Form-Sender01.htm
>. *That* is a drawback similar to the licensing of the Microsoft's
>Sender/Caller-ID scheme. Why not support open, free standards?

These are fees for services, not license fees. I don't know how you could miss that. The code is offered under the Python license, which is the least restrictive license I know about. One of my goals is to provide an open-source version of what big companies are now paying millions for: spam appliances with proprietary methods.

On Fri, 26 Aug 2005 23:20:05 GMT, [EMAIL PROTECTED] (John J. Lee) wrote:

>[David, in an earlier email]
>> reject. 15% will get an immediate accept without filtering, because
>> the sender is authenticated and has a good reputation. Eventually,
>> all reputable senders will join the 15%, and the 5% will shrink to
>> where we can ignore it.
>
>Two questions you seem to be implicitly assuming particular answers
>to: Is widespread authentication a good thing? Does it solve any
>problem not solved by Bayesian filtering plus good mail client
>support? My first reaction is to answer "no" to both questions, so to
>regard your effort as harmful. Might be interesting to hear why you
>think it's a good thing, though.

I really didn't intend for this to be a discussion of the merits of filtering vs authentication. I worry this will be a long discussion, with no satisfactory conclusion, so I suggest we move these topics to one of the email security forums. My conclusion, after participating in many such discussions, is that both filtering and authentication are necessary tools, and a complete system should have both.

-- Dave
Re: Newbie question: Sub-interpreters for CAD program
On 24 Aug 2005 13:48:13 -0700, "sonicSpammersGoToHellSmooth" <[EMAIL PROTECTED]> wrote:

>Hi all,
>
>I'm a newbie to Python, so I have a question about writing an
>application that also has a scripting ability. I'm thinking of Eric3
>as an example. It's written in Python, but it also has an interpreter
>window. The user doesn't have access (I don't think...) to all the
>internal stuff that makes the IDE work.
>
>In my case I'd like to write a CAD program which allows the user to
>write Python scripts, and to provide an API to do CAD stuff, manipulate
>parameters, circuits, layouts, simulations, etc. The user should not
>have access to the internals of the CAD program itself. The CAD
>program is written primarily in Python, with possibly C++ extensions
>for speed critical stuff.
>
>There is another posting currently asking about how many interpreters
>are needed with how many thread states each. Since this is new to me,
>can someone please explain how this sort of thing is "supposed" to
>work, from a high level?
>
>I have a strong EE and hardware background (hence my need to write a
>CAD program that doesn't piss me off), but not a CS background.

Sounds like we have similar backgrounds and motivations. I have a project started along these lines, but I haven't had time to work on it for the last few months.

http://www.ece.arizona.edu/~edatools/

Project page
EDA Tools Projects: An Open-Source Platform for Front-End IC Design
cdp_tut01-a1.zip cdp_tut01-a1.tar.gz

The goal of this project is an easily-learned, universal, open-source, circuit design platform that will allow IC designers to use whatever tools they want for design entry, simulation, and display of results. The platform should provide a simple GUI, basic services such as storage of tool setups, and should define a simple, standard interface for each class of tool.

Most of the work will be in documenting the design and construction of the platform, using a simple scripting language (Python) and GUI toolkit (Qt) so that others may easily follow the pattern and extend the platform to support new and more varied tools.

--

Take a look also at the MyHDL link from the main page. This is a similar effort for digital design. Mine is mostly analog.

The discouraging thing about the EDA tools situation is that no matter how loudly design engineers complain about the poor quality of the proprietary tools they are using, there is very little interest in participating in an open-source project. They just can't see how it would ever do what their expensive tools do now.

There is a similar lack of interest in the academic community. None of this is likely to lead to publications in scholarly journals.

-- Dave
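The idea of "a simple, standard interface for each class of tool" might look something like the following in Python. This is a hypothetical sketch, not code from the EDA Tools project: the class names `Simulator` and `EchoSimulator`, the `run` method, and the shape of the returned results are all assumptions made for illustration.

```python
from abc import ABC, abstractmethod


class Simulator(ABC):
    """Hypothetical common interface for one class of tool: circuit simulators."""

    @abstractmethod
    def run(self, netlist: str) -> dict:
        """Simulate the given netlist text and return a dict of results."""


class EchoSimulator(Simulator):
    """Stand-in tool that only counts the non-blank lines of the netlist."""

    def run(self, netlist: str) -> dict:
        elements = [line for line in netlist.splitlines() if line.strip()]
        return {"element_count": len(elements)}


def simulate(tool: Simulator, netlist: str) -> dict:
    """Platform-side entry point: works with any tool that implements Simulator."""
    return tool.run(netlist)
```

Supporting a new simulator would then mean writing one small adapter class, while user scripts keep calling `simulate` and never see the tool-specific details.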
Re: Newbie question: Sub-interpreters for CAD program
On Sat, 27 Aug 2005 16:56:03 -0500, Terry Hancock <[EMAIL PROTECTED]> wrote:

>On Saturday 27 August 2005 03:21 am, David MacQuigg wrote:
>> There is a similar lack of interest in the academic community. None
>> of this is likely to lead to publications in scholarly journals.
>
>I'm confused by what you meant by this. Are you saying that academics
>are afraid of using or creating open source CAD tools, or that they have
>a lack of interest in tools development, because it won't generate papers
>(directly anyway)?

It seems like a lack of interest in tools development, because there are no new fundamental principles, sophisticated math, or anything that could help directly and in the short term to get a publication.

There is probably also a perception, shared with engineers in industry, that the complexity of these tools is inherent in the task. It's OK for a full-time engineer to spend a few months learning the intricacies of a poorly-designed scripting language that works with just one tool, but not appropriate for students.

My hope is that we can get a few good projects to demonstrate the utility of Python in doing sophisticated designs with simple tools. Then we will have a foothold in the Universities. Next will be small companies that can't afford a CAD department with 10 engineers dedicated to tool setup.

-- Dave
Re: Newbie question: Sub-interpreters for CAD program
On 27 Aug 2005 17:00:07 -0700, "sonicSpammersGoToHellSmooth" <[EMAIL PROTECTED]> wrote:

>Cool, I went to the UofA for my MS in ECE, 2000. I did my theses under
>Chuck Higgins. --
>http://neuromorph.ece.arizona.edu/pubs/ma_schwager_msthesis.pdf
>
>The tools we had were constantly underwhelming me, so I've been
>thinking for years that a properly designed new toolset for students
>should be marketable, etc. I'll take a look at your site (although I
>think I may have come across it before.)

A toolset for students is exactly what we need. Whether that turns into a marketable product is a question for later. At this point, we need a few people with a strong motivation to help students. Sounds like you might have what it takes. Send me an email if you are interested.

-- Dave