Re: [Mailman-Developers] Regarding Handlers/SMTPDirect.py and "chunkify"

Stefan Förster Mon, 12 May 2008 14:59:38 -0700


On 12.05.2008 at 23:20, Mark Sapiro wrote:

I understand what you are saying, but I wonder what the real world
difference would be. As currently written, chunkify returns at most 4
partially filled chunks. Granted, 4 is significantly bigger than one,
but given that the MTA is VERPing the deliveries, it may ultimately
create an outgoing queue entry for each recipient anyway, so the extra
3 on the inbound side doesn't seem that significant (and it might
increase parallelism in the MTA).

First of all, I just noticed that the official code does indeed
create at most 4 partially filled buckets. That's the problem when you
have to jump in for someone else: my SMTPDirect.py contains 26 TLDs.
Two thoughts:

1. Even with only four buckets, given a real-world distribution among
recipient addresses, this is four times the I/O needed. The ratio
gets better as the number of list subscribers grows, but if there are
fewer recipients than SMTP_MAX_RCPTS, it's exactly 1:4.

2. Why split recipients the way it's done now at all? You either have
to add new buckets (add new TLDs) or have all recipients outside the
hard-coded TLDs thrown into the same bucket. I could understand it if
you first created a list of the TLDs involved and sorted by those -
though I don't know if that's a good idea when you run a really large
list and have to examine all recipients...
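
To make the 1:4 case concrete, here is a minimal sketch of grouped chunking as it is described in this thread. It is not the code from SMTPDirect.py: the TLD_GROUPS table and the function name chunkify_grouped are assumptions made up for illustration, and the real table may differ.

```python
# Hypothetical TLD-to-bucket table; the real one in SMTPDirect.py may differ.
TLD_GROUPS = {"com": 1, "net": 2, "org": 2, "edu": 3, "us": 3, "ca": 3}


def chunkify_grouped(recips, chunksize):
    """Sort recipients into TLD buckets, then cut each bucket separately."""
    buckets = {}
    for addr in recips:
        tld = addr.rsplit(".", 1)[-1].lower() if "." in addr else None
        # Anything outside the hard-coded TLDs lands in bucket 0.
        buckets.setdefault(TLD_GROUPS.get(tld, 0), []).append(addr)
    chunks = []
    for group in sorted(buckets):
        members = buckets[group]
        # Each bucket is cut on its own, so its last chunk may be partial.
        for i in range(0, len(members), chunksize):
            chunks.append(members[i:i + chunksize])
    return chunks


recips = ["a@example.com", "b@example.net", "c@example.edu", "d@example.de"]
chunks = chunkify_grouped(recips, 500)
print(len(chunks))  # 4: one chunk per bucket, although 4 < 500
```

With four recipients spread over four buckets, the MTA sees four SMTP transactions where one would have been enough.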

I didn't understand what you said about VERPing and outgoing queue
entries - surely any MTA will keep track of recipients on a
per-message basis? As for parallelism, I think the best way to ensure
fast delivery is to make all target destinations known to the MTA as
fast as possible.

Given your 25000 member list, and assuming SMTP_MAX_RCPTS = 500, you
would have at most 54 chunks (and more likely 53 or 52) instead of 50.
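
The chunk counts above can be checked with a little arithmetic. The sketch below assumes a purely hypothetical split of the 25000 members across four buckets; the group sizes are invented for illustration and do not come from a real list.

```python
def count_chunks(group_sizes, chunksize):
    """Number of chunks when every group is cut separately."""
    # Integer ceiling division: a partially filled chunk still counts.
    return sum((n + chunksize - 1) // chunksize for n in group_sizes)


SMTP_MAX_RCPTS = 500

# Flat chunking: all 25000 recipients form a single group.
flat = count_chunks([25000], SMTP_MAX_RCPTS)

# Hypothetical sizes for four buckets, adding up to 25000.
grouped = count_chunks([10100, 6100, 4100, 4700], SMTP_MAX_RCPTS)

print(flat)     # 50
print(grouped)  # 53
```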

In any case, If I were coding this, I would be inclined to not make it
an option, but just to change chunkify so it still grouped, but
continued to fill the last chunk of a group from the next group so
there would be at most one partial chunk.
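
One way to read this suggestion, as a rough sketch: order the recipients by bucket first and only then cut the whole list, so a chunk that a group leaves unfilled is topped up from the next group. The TLD_GROUPS table and the function name are assumptions for illustration, not Mailman code.

```python
# Hypothetical TLD-to-bucket table; the real one in SMTPDirect.py may differ.
TLD_GROUPS = {"com": 1, "net": 2, "org": 2, "edu": 3, "us": 3, "ca": 3}


def chunkify_grouped_fill(recips, chunksize):
    """Keep recipients grouped by bucket, but let chunks span groups."""
    def group_of(addr):
        tld = addr.rsplit(".", 1)[-1].lower() if "." in addr else None
        return TLD_GROUPS.get(tld, 0)

    # sorted() is stable, so recipients keep their order within a group.
    ordered = sorted(recips, key=group_of)
    # Cutting the ordered list as a whole leaves at most one partial chunk.
    return [ordered[i:i + chunksize] for i in range(0, len(ordered), chunksize)]


recips = ["a@x.com", "b@x.de", "c@x.net", "d@x.com", "e@x.org"]
chunks = chunkify_grouped_fill(recips, 2)
print([len(c) for c in chunks])  # [2, 2, 1]
```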

At the moment, I have changed the code to simply return SMTP_MAX_RCPTS
recipients per chunk - or all recipients if there are fewer than that.
Hardcoded, not configurable. The way it is done now, I can't see any
real advantages - especially living outside the U.S. Either improve
the sorting algorithm (all TLDs, don't return partial chunks) or make
it configurable to skip sorting altogether. Or at least that's what I
feel would be an improvement. Have it default to flat chunking. It
saves CPU time and I/O operations, and gives the MTA's queue manager
more time to do its job.
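
Flat chunking as described here could look roughly like the following. This is a sketch, not the actual local patch: the name chunkify_flat is made up, and the chunk size is passed in as a parameter rather than hardcoded.

```python
def chunkify_flat(recips, chunksize):
    """Cut the recipient list into chunks of at most chunksize, unsorted."""
    recips = list(recips)
    # No sorting and no buckets: only the final chunk can be partial, and a
    # list shorter than chunksize comes back as a single chunk.
    return [recips[i:i + chunksize] for i in range(0, len(recips), chunksize)]


recips = ["user%d@example.org" % i for i in range(1200)]
print([len(c) for c in chunkify_flat(recips, 500)])  # [500, 500, 200]
print(len(chunkify_flat(recips[:4], 500)))           # 1
```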



Cheers
Stefan
--
Stefan Förster     http://www.incertum.net/     Public Key: 0xBBE2A9E9
Written on OSX. Who ate my ~/.signature?
