Re: [Mailman-Developers] GSoC 2013 - GNU Mailman - Introduction and Project Discussion

2013-04-18 Thread Patrick Ben Koetter
* Barry Warsaw ba...@list.org:
 On Apr 12, 2013, at 08:28 AM, Patrick Ben Koetter wrote:
 
 I think it would be real nice to have a MILTER interface at LMTP server level
 to allow mail modification as required. Mailman runs in large environments and
 all the 'large organizations' I have worked with asked my team and me to
 customize how mail is processed. MILTER is a great interface to modify mail.
 
 Do you mean a hook in Mailman's LMTP server process?  I thought about that in
 my previous message but decided not to mention it because it's not clear to me
 how performant Mailman's current smtpd-based (read: async) LMTP server is.
 What I mean is, I'm not sure how much additional work we want the LMTP server
 to do.
 
 It would be cool if someone did some performance testing of the LMTP
 implementation, and it would be cool if someone tried to add some hooks into
 that server.  It might also be interesting to look into alternative
 implementations.  Another reason to push for getting Mailman 3 onto Python 3
 would be the ability to leverage Guido's Tulip work for better async IO
 performance.

We did a quick test and blew 10,000 messages into Mailman 3's LMTP server. The
hardware was/is a Pentium 2, 2 GB RAM machine with desktop disks - way below
current server hardware.

It took the test 25 min. to submit all messages:

real    25m10.041s
user    0m4.872s
sys     0m7.700s

That makes an average of about

400 msg/min or
6.6 msg/sec
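
The averages follow directly from the `real` time reported above. A quick
check in Python (the function name is made up for this illustration):

```python
def throughput(messages, minutes, seconds):
    """Return (messages per second, messages per minute) for a timed run."""
    elapsed = minutes * 60 + seconds
    per_second = messages / elapsed
    return per_second, per_second * 60


# 10,000 messages submitted in 25m10.041s of wall-clock time
per_sec, per_min = throughput(10_000, 25, 10.041)
print(f"{per_sec:.1f} msg/sec, {per_min:.0f} msg/min")
```

This gives roughly 6.6 msg/sec, or a little under 400 msg/min.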

Robert, who did the tests, Ralf and I agree that this is more than enough for
LMTP server performance.

If we add a MILTER interface, the milter applications hooked into the LMTP
server's receiving process will slow down the intake rate. The impact depends
on what the specific application tests or what kind of modification it applies
to the message. In general, MILTERs are designed to work in memory only. No
message needs to be written to disk, which is usually the most expensive
operation during mail processing.

At the moment we (at sys4.de) don't think it needs further testing, but we
offer to test further if you see a reason to.

p@rick

-- 
[*] sys4 AG
 
http://sys4.de, +49 (89) 30 90 46 64
Franziskanerstraße 15, 81669 München
 
Sitz der Gesellschaft: München, Amtsgericht München: HRB 199263
Vorstand: Patrick Ben Koetter, Axel von der Ohe, Marc Schiffbauer
Aufsichtsratsvorsitzender: Joerg Heidrich
 
___
Mailman-Developers mailing list
Mailman-Developers@python.org
http://mail.python.org/mailman/listinfo/mailman-developers
Mailman FAQ: http://wiki.list.org/x/AgA3
Searchable Archives: 
http://www.mail-archive.com/mailman-developers%40python.org/
Unsubscribe: 
http://mail.python.org/mailman/options/mailman-developers/archive%40jab.org

Security Policy: http://wiki.list.org/x/QIA9

Re: [Mailman-Developers] GSoC 2013 - GNU Mailman - Introduction and Project Discussion

2013-04-12 Thread Patrick Ben Koetter
* Stephen J. Turnbull step...@xemacs.org:
 Sreyanth writes:
 
   3. Anti-spam / anti-abuse in Mailman.
 
 A couple of people have mentioned anti-spam, and it's a frequently
 requested feature.  Nevertheless, I don't think we should spend Google
 money and mentor time on it.

I concur.

 1.  Mailman is the wrong place to do filtering.  It's equally
 effective, normally covers more messages, and is somewhat more
 efficient in resource usage to do it at the MTA.

Spam-filtering is expensive. It should be done only once - at sender level and
not for each recipient of a mailing list.

We could let Mailman do it when the mail enters, but what would be the gain?
There's plenty of software out there that already knows how to battle spam.

Even worse! In some countries - take Germany for example - you either reject
spam at SMTP session level while the sending client is still there and will
take notice or you MUST deliver it - else you break the law because you took
responsibility for transport, but suppressed the message.

Mailman is part of a mail system, but I don't expect it will ever become
the component that will communicate directly with a remote (spam sending)
client.

All the work to add an anti-spam feature in Mailman would be 'useless' to
countries with laws as I described above.

BUT ...

I think it would be real nice to have a MILTER interface at LMTP server level
to allow mail modification as required. Mailman runs in large environments and
all the 'large organizations' I have worked with asked my team and me to
customize how mail is processed. MILTER is a great interface to modify mail.


 2.  Any new algorithms *should* be made available at the MTA level
 where they can be best put to use by more people.  This implies
 something that either plugs into existing filters (such as
 spamassassin) or MTAs (ie, milters) rather than a Handler.
 3.  Adapting existing filters is generally pretty trivial: you write a
 10-line custom Handler that pipes it to an external process.  This
 isn't big enough for a GSoC project.
 4.  To the extent that new algorithms are involved, I have doubts that
 Mailman mentors have the kind of expertise needed to really help
 with such a project (I could be wrong, but I certainly don't know
 much about that kind of text processing, and I don't know that
 anybody else in Mailman has expertise in it).
 
 On the other hand, I don't know which project in GSoC would be a
 better place for it.  It's possible to argue that Mailman is a
 reasonable place for it, but IMHO we probably shouldn't.

I hate to stand in the way of someone who wants to contribute to OSS, but
IMHO we shouldn't either.


 Regarding anti-abuse, we would like to do something about problems
 like backscatter.  However, I have to wonder how much *code* (vs
 *specification* and *design*) is needed for those problems.  If the
 project is really spec-heavy, it's probably not really what Google has
 in mind (based on comments on the mentors' list, not on any official
 Google pronouncements, though).

Has anyone ever mentioned SNMP as a feature for Mailman?

p@rick



Re: [Mailman-Developers] GSoC 2013 - GNU Mailman - Introduction and Project Discussion

2013-04-12 Thread Stephen J. Turnbull
Terri Oda writes:

  Writing individual pipelines may be trivial, but making a user interface 
  for managing said pipelines is non-trivial.  Right now, our pipeline 
  management interface is there's a text box in postorius that lets you 
  choose a pipeline.  It's not even a dropdown, and you may be screwed if 
  you make a typo which is obviously not how I want it when we
  release. ;)

That's a more general issue (oh, I see you noticed that! :-), and I
have no problem with doing something about it -- indeed, I'd be more
than happy to (co-)mentor it because I just *love* custom Handlers.
Here's what I would do:

1. Get the list of handlers active for the list.
2. Append the list of inactive handlers from Mailman/Handlers (the
   site's list, not the distributed handlers).
3. The UI is a table with rows containing a checkbox for active
   handler (the row should be greyed out if it's inactive), an
   ordinal (numerical), and the handler name (gold star for popping up
   a tooltip with a detailed description/docstring on mouseover).
4. Users can either change the numbers (error checked for uniqueness),
   with a partial order on standard handlers -- if the partial order
   is violated (including a missing handler like ToOutgoing) the
   user is warned; or (platinum star) drag the handlers into the order
   they like (with same checks on the partial order).
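
The partial-order check in step 4 can be sketched roughly as follows. This is
only an illustration: the function name and the example ordering constraint
are assumptions, not Mailman's actual rules. Only ToOutgoing is taken from the
message above.

```python
def check_partial_order(pipeline, must_precede, required):
    """Return warnings for a proposed handler order; an empty list means OK.

    pipeline     -- handler names in the order the user chose
    must_precede -- (earlier, later) pairs describing the partial order
    required     -- handlers that must be present
    """
    warnings = []
    position = {name: index for index, name in enumerate(pipeline)}
    if len(position) != len(pipeline):
        warnings.append("duplicate handler in pipeline")
    for name in required:
        if name not in position:
            warnings.append(f"required handler missing: {name}")
    for earlier, later in must_precede:
        if earlier in position and later in position:
            if position[earlier] > position[later]:
                warnings.append(f"{earlier} must run before {later}")
    return warnings


# Hypothetical constraint, for illustration only.
ORDER = [("CookHeaders", "ToOutgoing")]
REQUIRED = ["ToOutgoing"]

print(check_partial_order(["ToOutgoing", "CookHeaders"], ORDER, REQUIRED))
```

Returning warnings instead of raising an exception matches the "user is
warned" behaviour described in step 4: the UI can show them and still let the
admin decide.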

  I see a potential project timeline going something like this:
  
  A. make a set of custom Mailman 3 Handlers for some well-known existing 
  anti-spam/anti-malware software.  (Maybe 2-3 weeks of work here, finding 
  2-4 reasonable pieces of software, setting them up, writing the 
  handlers, and testing them)

One week for that work, it's all in the FAQ already I suspect.

  B. make an interface in Postorius so list admins can 
  enable/disable/reorder these and any whitelisting happening within 
  mailman.  This should involve making an interface in Postorius that 
  gives admins the ability to change the Pipeline being used, and will 
  likely involve a small amount of user testing to make sure said 
  interface doesn't have risk of disastrous results if the administrator 
  does the wrong thing.  (Another 3-4 weeks of work including user 
  testing, unit tests, and documentation)

You think the design above will take more than two days (one to learn
how to do drag-and-drop to reorder a list) to code, and 4 to document and
test? (I'm assuming Mailman2 kinds of pipeline APIs are already available.
If new REST API is needed, OK, 3 weeks total.)

  C. Figure out how to set up some sort of packager that can install 
  handlers + antispam software so that the site admin has an easy way to 
  set these up if requested. (Another 3-4 weeks of work, including testing 
  any scripts on a few different OSes and extensive documentation)

OK, yes, getting PyPI down for the Handlers themselves (while these
*could* be delivered with Mailman, I think it would be more valuable
to have a standard PyPI delivery protocol for 3rd party Handlers) will
likely take that much time, and indeed one needs to deal with OS PMS.

  Do feel free to disagree with me, of course, Stephen.

I am indeed a curmudgeon about the antispam stuff.  I don't think the
first release of Mailman 3 should contain an attractive nuisance like
serious antispam in Mailman (vs antispam in the MTA).  I'll try to
keep such negative thinking to one paragraph per post, though. :-)

  Or complain that I'm using the lure of antispam to get someone
  solve my user interface for pipelines problem, which I totally
  am. ;)

While I do think that an initial implementation is probably a total of
about 2 weeks worth of work, I suspect that one could riff on the
theme (hi, Barry, like that metaphor?) for a couple more weeks, and
robust disaster recovery (saving off the old pipeline and restoring
looks simple enough, but Mr. Murphy is lurking, I'm sure -- in
particular, if we're going to allow through-the-web pipelines, we need
to guarantee that received mail will not get lost if the pipeline is
horked) could account for a couple more weeks.


Re: [Mailman-Developers] GSoC 2013 - GNU Mailman - Introduction and Project Discussion

2013-04-12 Thread Barry Warsaw
On Apr 12, 2013, at 01:44 AM, Stephen J. Turnbull wrote:

A couple of people have mentioned anti-spam, and it's a frequently
requested feature.  Nevertheless, I don't think we should spend Google
money and mentor time on it.

From the core's perspective, I tend to agree that there are some interesting
things we'd like to add here, but it's probably not enough work to justify a
GSoC slot.  I'm not sure if additional ui work can pad that out.

I also agree that in general, we want to encourage sites to push anti-spam
defenses into the MTA as much as possible.  The counter argument is that we
get plenty of requests from folks who have no control over their MTA and want
to be able to configure Mailman to help reduce spam.  I think the following
avenues would be interesting to pursue.

* Assume the MTA is doing filtering, and that messages will fall into three
  categories: known bad (these get dropped at the MTA), known good (these flow
  through), unsure.  For the latter, the message will probably be marked in
  some way, e.g. a header with a spam score, and it would be good if Mailman
  has some facility (e.g. a rule) to parse that header and make disposition
  decisions based on that value.  One thing Mailman can do that the MTA cannot
  is allow for human intervention for disposition.

* Provide an option for messages to detour into spam filters like spamassassin
  during Mailman message processing.  This probably means a rule which calls
  out to SA or equivalent, and stores the score in some metadata.  A rule hit
  might mean that the message has a spam score higher than a threshold, in
  which case processing jumps to a chain which can discard, reject, or hold the
  message.
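
A minimal sketch of the first idea, turning an MTA-added score header into a
disposition. The header name `X-Spam-Score`, the thresholds and the function
name are all assumptions made for illustration; this is not Mailman's rule API.

```python
from email import message_from_string


def disposition(msg, header="X-Spam-Score", hold_at=5.0, discard_at=15.0):
    """Map a spam score header to 'accept', 'hold' or 'discard'."""
    raw = msg.get(header)
    if raw is None:
        # The MTA did not mark the message: let it flow through.
        return "accept"
    try:
        score = float(raw.strip().split()[0])
    except (ValueError, IndexError):
        # Unparseable score: leave the decision to a human moderator.
        return "hold"
    if score >= discard_at:
        return "discard"
    if score >= hold_at:
        return "hold"
    return "accept"


unsure = message_from_string("X-Spam-Score: 7.2\n\nbody\n")
print(disposition(unsure))
```

Holding the "unsure" range rather than dropping it is what allows the human
intervention mentioned above.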

Regarding anti-abuse, we would like to do something about problems
like backscatter.  However, I have to wonder how much *code* (vs
*specification* and *design*) is needed for those problems.  If the
project is really spec-heavy, it's probably not really what Google has
in mind (based on comments on the mentors' list, not on any official
Google pronouncements, though).

Agreed.

-Barry


Re: [Mailman-Developers] GSoC 2013 - GNU Mailman - Introduction and Project Discussion

2013-04-12 Thread Sreyanth
Hi all! Thank you very much for the awesome discussion here!

On Fri, Apr 12, 2013 at 1:22 AM, Terri Oda te...@zone12.com wrote:

  On 13-04-11 10:44 AM, Stephen J. Turnbull wrote:

 1.  Mailman is the wrong place to do filtering.  It's equally
 effective, normally covers more messages, and is somewhat more
 efficient in resource usage to do it at the MTA.
 2.  Any new algorithms **should** be made available at the MTA level
 where they can be best put to use by more people.  This implies
 something that either plugs into existing filters (such as
 spamassassin) or MTAs (ie, milters) rather than a Handler.
 3.  Adapting existing filters is generally pretty trivial: you write a
 10-line custom Handler that pipes it to an external process.  This
 isn't big enough for a GSoC project.
 4.  To the extent that new algorithms are involved, I have doubts that
 Mailman mentors have the kind of expertise needed to really help
 with such a project (I could be wrong, but I certainly don't know
 much about that kind of text processing, and I don't know that
 anybody else in Mailman has expertise in it).

 I agree.


 Writing individual pipelines may be trivial, but making a user interface
 for managing said pipelines is non-trivial.  Right now, our pipeline
 management interface is there's a text box in postorius that lets you
 choose a pipeline.  It's not even a dropdown, and you may be screwed if you
 make a typo which is obviously not how I want it when we release. ;)

 I see a potential project timeline going something like this:

 A. make a set of custom Mailman 3 Handlers for some well-known existing
 anti-spam/anti-malware software.  (Maybe 2-3 weeks of work here, finding
 2-4 reasonable pieces of software, setting them up, writing the handlers,
 and testing them)

 B. make an interface in Postorius so list admins can
 enable/disable/reorder these and any whitelisting happening within
 mailman.  This should involve making an interface in Postorius that gives
 admins the ability to change the Pipeline being used, and will likely
 involve a small amount of user testing to make sure said interface doesn't
 have risk of disastrous results if the administrator does the wrong thing.
 (Another 3-4 weeks of work including user testing, unit tests, and
 documentation)

 C. Figure out how to set up some sort of packager that can install
 handlers + antispam software so that the site admin has an easy way to set
 these up if requested. (Another 3-4 weeks of work, including testing any
 scripts on a few different OSes and extensive documentation)

 D. If there's any time leftover, implement some clever new filter (and
 appropriate Handler) that makes use of the list information itself (e.g.
 subscriber list, archives, etc.) to make better spam decisions. (at this
 point, you've got maybe 2 weeks left in the GSoC timeline)

 This really looks great! It is almost exactly what I expected from a project
like this.
But, as Stephen and Barry pointed out, I am unsure how far this
comes under GSoC's purview.


 I think that constitutes enough useful-to-mailman work to justify the
 google funds, gets us some customizable spam filtering (which as you say,
 is a frequently requested feature), but doesn't turn us into something
 we're not.  That's why anti-spam made this year's gsoc list even though
 we've always said do it in the MTA and I'm not about to change that
 policy in general.

 Do feel free to disagree with me, of course, Stephen. Or complain that I'm
 using the lure of antispam to get someone solve my user interface for
 pipelines problem, which I totally am. ;)

  Terri

 Thanks for such a great timeline, Terri. I don't have any issues with it. As
Stephen and Barry said, I also like the idea of having a MILTER interface
at the LMTP level.

On an overall positive note, I am quite convinced that giving the list admin
flexible options to choose from is worthwhile (and, as Barry pointed out, why
should everything be exposed to the admin via Postorius when it may not be of
interest to the admin?). I believe this could make a nice GSoC project, but
with so many spam filters that people are already acquainted with, I am not
sure how much people would use this feature.

Also, I would like to hear more about: Boilerplate stripper AND Better
content-filtering / handling error messages.
Boilerplate stripping is easy to understand. But can anyone elaborate
on Better content-filtering / handling error messages?
I strongly believe that Boilerplate stripping will be a cool thing to have
in Mailman and, obviously, who would not want to welcome better
content-filtering / error handling techniques on board?



-- 
Yours Sincerely
Mora Sreyantha Chary
Computer Engineering '14
National Institute of Technology Karnataka
Surathkal, India 575 025

Re: [Mailman-Developers] GSoC 2013 - GNU Mailman - Introduction and Project Discussion

2013-04-12 Thread Sreyanth
And here is another idea of mine that I am interested in working on:

4. My own project idea: Mining the list logs and recognizing interesting
patterns for better enhancements (the admin need not have data mining
experience)

We can actually have this integrated into the admin console where the logs
can be accessed; at the same time, some interesting patterns can be shown,
along with stats and so on (just a basic idea, I need to work on this more).
Depending on the detected patterns, the admin may want to change some
settings! Given my experience with IR and Django, I feel this is a
potential GSoC project!

Any suggestions?




Re: [Mailman-Developers] GSoC 2013 - GNU Mailman - Introduction and Project Discussion

2013-04-12 Thread Patrick Ben Koetter
* Barry Warsaw ba...@list.org:
 On Apr 12, 2013, at 08:28 AM, Patrick Ben Koetter wrote:
 
 I think it would be real nice to have a MILTER interface at LMTP server level
 to allow mail modification as required. Mailman runs in large environments and
 all the 'large organizations' I have worked with asked my team and me to
 customize how mail is processed. MILTER is a great interface to modify mail.
 
 Do you mean a hook in Mailman's LMTP server process?  I thought about that in

Yes, I mean to hook MILTER capability into Mailman's LMTP server process.

 my previous message but decided not to mention it because it's not clear to me
 how performant Mailman's current smtpd-based (read: async) LMTP server is.

It's not clear to me either, but now that you made me think about it I begin
to ask myself how fast is fast enough, and I also ask myself whether we are
dealing with a bogey (had to look this up. hope it fits) or are trying to
address a real bottleneck. (I've experienced quite a few problematic situations
in mail transport which turned out to be driven more by myth and oral history
than by solid knowledge.)

I agree we should measure, just in order not to speculate, but let me send
some thoughts ahead before we set out to test performance:

- Input/output ratio on a mailing list system is 1:n. Performance requirements
  on the receiving side should be the least to worry about. 

- In most usage scenarios that come to my mind companies run an MLM as a
  supplement to their 'regular' mail system. Only a small fraction of the mail
  that enters the mail system is routed on to the MLM (here: MM3 LMTP server).

- At the moment (MM2) mail enters Mailman via a script that is called. Scripts
  are _a lot_ slower than a server process. My understanding is MM3 will have
  an LMTP server process. Any site that switches to MM3 should experience a
  performance boost on the receiving side.

It seems to me most people will be fine. Unfortunately I think most
people will not need to use a MILTER, either.

What characterizes the remaining group:

- They run sites dedicated solely to mailing lists.

- They need special filtering (read: MILTER and other methods).

- They split load via clusters.

- They have their own development teams to customize and optimize software as
  required

 What I mean is, I'm not sure how much additional work we want the LMTP server
 to do.

How much should it be able to do at all? Do you collect and log statistics at
the moment? Personally I like the delays=0.04/0.01/0.05/0.1 entry in
Postfix's log. Quote from postconf(5):

   The format of the delays=a/b/c/d logging is as follows:

   ·  a = time from message arrival to last active queue entry

   ·  b = time from last active queue entry to connection setup

   ·  c = time in connection setup, including DNS, EHLO and STARTTLS

   ·  d = time in message transmission

   -- $ man 5 postconf | less +/delay_logging_resolution_limit
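
If Mailman's LMTP server logged something similar, it would be easy to analyse.
As a rough illustration, here is how the Postfix-style `delays=a/b/c/d` field
could be pulled out of a log line. The dictionary keys and the sample log line
are made up for this sketch; only the field format comes from the quote above.

```python
import re

# Matches the four slash-separated values of a Postfix "delays=" field.
DELAYS_RE = re.compile(r"delays=([\d.]+)/([\d.]+)/([\d.]+)/([\d.]+)")


def parse_delays(line):
    """Return the four delay components in seconds, or None if absent."""
    match = DELAYS_RE.search(line)
    if match is None:
        return None
    a, b, c, d = (float(value) for value in match.groups())
    return {
        "arrival_to_last_active_queue": a,
        "last_active_queue_to_connection_setup": b,
        "connection_setup": c,
        "transmission": d,
    }


# Hypothetical log fragment using the values quoted above.
sample = "relay=example, delay=0.2, delays=0.04/0.01/0.05/0.1, status=sent"
print(parse_delays(sample))
```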


 It would be cool if someone did some performance testing of the LMTP
 implementation, and it would be cool if someone tried to add some hooks into
 that server.  It might also be interesting to look into alternative
 implementations.  Another reason to push for getting Mailman 3 onto Python 3
 would be the ability to leverage Guido's Tulip work for better async IO
 performance.

I'm short on time to do performance testing myself, but I'll forward the
request to my team members since we are doing tests at the moment anyway.
Maybe someone finds time to squeeze LMTP server testing in.

My first idea would be to use either Postfix smtp-source (multi-threaded
SMTP/LMTP test generator) or swaks (Swiss Army Knife for SMTP)
http://www.jetmore.org/john/code/swaks/ and create a wrapper around it that
produces the load.
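
For a quick first measurement without either tool, a small load generator can
also be written with Python's standard `smtplib.LMTP` class. This is a swapped-in
alternative to smtp-source and swaks, not a wrapper around them, and the host,
port and addresses in the usage note are placeholders.

```python
import smtplib
import time
from email.message import EmailMessage
from email.utils import make_msgid


def build_message(index, sender, recipient):
    """Build one small test message with a unique Message-ID."""
    msg = EmailMessage()
    msg["From"] = sender
    msg["To"] = recipient
    msg["Subject"] = f"load test {index}"
    msg["Message-ID"] = make_msgid(domain="example.com")
    msg.set_content(f"LMTP load test message number {index}\n")
    return msg


def submit_batch(host, port, count, sender, recipient):
    """Submit `count` messages over one LMTP session; return elapsed seconds."""
    start = time.perf_counter()
    with smtplib.LMTP(host, port) as client:
        for index in range(count):
            client.send_message(build_message(index, sender, recipient))
    return time.perf_counter() - start
```

A call such as `submit_batch("localhost", 8024, 1000, "tester@example.com",
"test@lists.example.com")` would then time 1000 submissions. Port 8024 is an
assumption about the local Mailman 3 configuration, so adjust it to the actual
setup. A single session understates what parallel clients could achieve, which
is one reason a multi-threaded tool may still be the better choice.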


 Has anyone ever mentioned SNMP as a feature for Mailman?
 
 Nope, but that would be interesting too.

We (sys4) will contribute the MIB and monitoring server during development, if
someone takes on the programming.

p@rick


[Mailman-Developers] GSoC 2013 - GNU Mailman - Introduction and Project Discussion

2013-04-11 Thread Stephen J. Turnbull
Sreyanth writes:

  3. Anti-spam / anti-abuse in Mailman.

A couple of people have mentioned anti-spam, and it's a frequently
requested feature.  Nevertheless, I don't think we should spend Google
money and mentor time on it.

1.  Mailman is the wrong place to do filtering.  It's equally
effective, normally covers more messages, and is somewhat more
efficient in resource usage to do it at the MTA.
2.  Any new algorithms *should* be made available at the MTA level
where they can be best put to use by more people.  This implies
something that either plugs into existing filters (such as
spamassassin) or MTAs (ie, milters) rather than a Handler.
3.  Adapting existing filters is generally pretty trivial: you write a
10-line custom Handler that pipes it to an external process.  This
isn't big enough for a GSoC project.
4.  To the extent that new algorithms are involved, I have doubts that
Mailman mentors have the kind of expertise needed to really help
with such a project (I could be wrong, but I certainly don't know
much about that kind of text processing, and I don't know that
anybody else in Mailman has expertise in it).
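
To give a feel for point 3, the "pipe it to an external process" part might
look roughly like this. It is only a sketch of the piping step, not a Mailman
Handler: the function name is made up, and the stand-in command below is a
tiny Python script used in place of a real filter.

```python
import subprocess
import sys


def run_external_filter(raw_message, command, timeout=30):
    """Pipe a raw message (bytes) to `command`; return (exit code, stdout)."""
    result = subprocess.run(
        command,
        input=raw_message,
        stdout=subprocess.PIPE,
        stderr=subprocess.PIPE,
        timeout=timeout,
    )
    return result.returncode, result.stdout


# Stand-in "filter" for illustration: exits with 1 if the text contains
# BUY NOW. A real deployment would name an actual filter command here.
STAND_IN = [
    sys.executable,
    "-c",
    "import sys; data = sys.stdin.buffer.read(); "
    "sys.exit(1 if b'BUY NOW' in data else 0)",
]

code, _ = run_external_filter(b"Subject: hi\n\nhello\n", STAND_IN)
print("flagged" if code else "clean")
```

A Handler built on this would decide from the exit code or the filter's output
whether to hold or discard the message.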

On the other hand, I don't know which project in GSoC would be a
better place for it.  It's possible to argue that Mailman is a
reasonable place for it, but IMHO we probably shouldn't.

Regarding anti-abuse, we would like to do something about problems
like backscatter.  However, I have to wonder how much *code* (vs
*specification* and *design*) is needed for those problems.  If the
project is really spec-heavy, it's probably not really what Google has
in mind (based on comments on the mentors' list, not on any official
Google pronouncements, though).


Re: [Mailman-Developers] GSoC 2013 - GNU Mailman - Introduction and Project Discussion

2013-04-11 Thread Terri Oda

On 13-04-11 10:44 AM, Stephen J. Turnbull wrote:

1.  Mailman is the wrong place to do filtering.  It's equally
 effective, normally covers more messages, and is somewhat more
 efficient in resource usage to do it at the MTA.
2.  Any new algorithms *should* be made available at the MTA level
 where they can be best put to use by more people.  This implies
 something that either plugs into existing filters (such as
 spamassassin) or MTAs (ie, milters) rather than a Handler.
3.  Adapting existing filters is generally pretty trivial: you write a
 10-line custom Handler that pipes it to an external process.  This
 isn't big enough for a GSoC project.
4.  To the extent that new algorithms are involved, I have doubts that
 Mailman mentors have the kind of expertise needed to really help
 with such a project (I could be wrong, but I certainly don't know
 much about that kind of text processing, and I don't know that
 anybody else in Mailman has expertise in it).
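
To make point 3 concrete, here is a minimal sketch of what such a handler could look like. It is not taken from the Mailman source: the class name, the `msgdata` keys, and the external command are all hypothetical, and a real Mailman 3 handler would additionally have to declare that it provides the `IHandler` interface and be added to a pipeline.

```python
import subprocess
from email.message import Message


class ExternalFilterHandler:
    """Hypothetical handler: pipe the message to an external filter."""

    # Mailman handlers carry a name and a description; these values are
    # made up for the example.
    name = 'external-filter'
    description = 'Pipe the message to an external filter process.'

    def __init__(self, command):
        # `command` is an argv list for the external program. Which
        # program to run is an assumption left to the site admin.
        self.command = command

    def process(self, mlist, msg, msgdata):
        # Feed the raw message to the filter on stdin and wait for it.
        result = subprocess.run(
            self.command,
            input=msg.as_bytes(),
            stdout=subprocess.PIPE,
            stderr=subprocess.PIPE,
        )
        # Only record the verdict here; a later handler or rule can
        # decide whether to hold or discard (keys are hypothetical).
        msgdata['external_filter_status'] = result.returncode
        msgdata['external_filter_output'] = (
            result.stdout.decode('utf-8', 'replace').strip())
```

The sketch assumes the external program signals its verdict through its exit status; filters that report through added headers or stdout would need the output parsed instead.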


Writing individual pipelines may be trivial, but making a user interface
for managing said pipelines is non-trivial.  Right now, our pipeline
management interface is a text box in Postorius that lets you choose a
pipeline.  It's not even a dropdown, and you may be screwed if you make
a typo, which is obviously not how I want it when we release. ;)
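
As a rough illustration of the typo problem, a free-text pipeline name could at least be checked against the set of known pipelines before it is saved. The function below is a hypothetical sketch, not Postorius code, and the pipeline names used with it are only examples.

```python
import difflib


def validate_pipeline_name(entered, known_pipelines):
    """Return the cleaned pipeline name, or raise ValueError with a hint.

    `known_pipelines` is assumed to be a list of valid names obtained
    from the server; how Postorius would fetch it is out of scope here.
    """
    name = entered.strip()
    if name in known_pipelines:
        return name
    # Suggest the closest known name so a typo is easy to correct.
    close = difflib.get_close_matches(name, known_pipelines, n=1)
    hint = ' Did you mean {!r}?'.format(close[0]) if close else ''
    raise ValueError('Unknown pipeline {!r}.{}'.format(name, hint))
```

A dropdown populated from the same list would make the check unnecessary for ordinary use, which is the kind of interface work described above.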


I see a potential project timeline going something like this:

A. make a set of custom Mailman 3 Handlers for some well-known existing 
anti-spam/anti-malware software.  (Maybe 2-3 weeks of work here, finding 
2-4 reasonable pieces of software, setting them up, writing the 
handlers, and testing them)


B. make an interface in Postorius so list admins can
enable/disable/reorder these and any whitelisting happening within
Mailman.  This should give admins the ability to change the Pipeline
being used, and will likely involve a small amount of user testing to
make sure said interface doesn't risk disastrous results if the
administrator does the wrong thing.  (Another 3-4 weeks of work
including user testing, unit tests, and documentation)


C. Figure out how to set up some sort of packager that can install 
handlers + antispam software so that the site admin has an easy way to 
set these up if requested. (Another 3-4 weeks of work, including testing 
any scripts on a few different OSes and extensive documentation)


D. If there's any time left over, implement some clever new filter (and
appropriate Handler) that makes use of the list information itself (e.g.
subscriber list, archives, etc.) to make better spam decisions. (at this
point, you've got maybe 2 weeks left in the GSoC timeline)
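
One very simple reading of "makes use of the list information itself" in step D is to let subscriber status adjust a score produced by some other filter. The sketch below is only an assumption about what such a filter might do; the function names, the score scale, and the bonus value are all hypothetical.

```python
from email.message import Message
from email.utils import parseaddr


def sender_is_subscriber(msg, subscribers):
    """True if the From: address matches a subscriber, case-insensitively."""
    _, address = parseaddr(msg.get('From', ''))
    return address.lower() in {s.lower() for s in subscribers}


def adjust_score(base_score, msg, subscribers, subscriber_bonus=2.0):
    """Lower a spam score (higher = more spammy) for known subscribers.

    `base_score` is assumed to come from an external filter; the bonus
    is an arbitrary illustrative value.
    """
    if sender_is_subscriber(msg, subscribers):
        return base_score - subscriber_bonus
    return base_score
```

Since the From: header is trivially forged, a real filter would have to treat this as one weak signal among several rather than as a whitelist.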



I think that constitutes enough useful-to-Mailman work to justify the
Google funds, gets us some customizable spam filtering (which, as you
say, is a frequently requested feature), but doesn't turn us into
something we're not.  That's why anti-spam made this year's GSoC list
even though we've always said "do it in the MTA" and I'm not about to
change that policy in general.


Do feel free to disagree with me, of course, Stephen. Or complain that
I'm using the lure of antispam to get someone to solve my user interface
for pipelines problem, which I totally am. ;)


 Terri



[Mailman-Developers] GSoC 2013 - GNU Mailman - Introduction and Project Discussion

2013-04-04 Thread Sreyanth
Hi all!

I am interested in participating in GSoC this year and would like to
choose GNU Mailman for it. I have gone through the proposed ideas and
would like people to tell me whether they are feasible for one summer!

1. Boilerplate stripper AND Better content-filtering / handling error
messages.
2. No-logging mode.
3. Anti-spam / anti-abuse in Mailman.
4. My own project idea: Mining the list logs to recognize interesting
patterns for better enhancements (the admin need not have data mining
experience)

For now, I am looking to reproduce bugs on my computer and am aiming to
write a patch or two before the student deadline, so that I have much
more time to work on my application.

# Bragging mode ON. You may want to skip this.
And, let me introduce myself. I am Sreyantha Chary (let's make it Sreyanth),
a 3rd year undergrad majoring in Computer Engineering at the National
Institute of Technology Karnataka. I won't consider myself an expert in
Python programming, but yeah, I am good enough to work on intermediate
projects. I have used Django for my university projects and loved it.
Research-wise, I am into Machine Learning, Information Retrieval and Data
Mining. And coding-wise, I try crazy stuff, sometimes just to check if I
remember anything from the documentation!
# Bragging mode OFF.

People, please let me know your take on the projects I am interested in.
How feasible is my proposed idea?

Thanks in advance. Thanks for your time!

Thanks and regards
Mora Sreyantha Chary
Computer Engineering '14
National Institute of Technology Karnataka
Surathkal, India 575 025