Noel J. Bergman wrote:
Stefano wrote:

testing 2 times 6 multihomed servers and doing this 15 times by
default. With our defaults, that single mail would result in 6*2*15 total
attempts (180 attempts), each one keeping a thread busy for 10 minutes
(1800 minutes, more than a whole thread-day).

That does not match the default delivery schedule.  6IP*2Server*10min is two
hours, and then there is a retry delay before we try again.  After the 4th
retry, the interval would be 3 hours, and the remaining are 6 hours up to 25
attempts.  So the worst case should have the thread tied up for 2 hours
(which I agree is not acceptable), and there should be large blocks of hours
where e-mail delivery for that message is not attempted.
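
The figures traded in this exchange can be checked with a few lines of arithmetic. This is only a sketch: the function name and parameters are hypothetical, and it assumes every connection attempt runs to its full timeout, which is the worst case both posters are describing.

```python
def worst_case_minutes(mx_servers, ips_per_server, timeout_min, attempts=1):
    """Thread-minutes consumed if every connection attempt hits its timeout.

    Hypothetical helper for illustration; not part of James.
    """
    return mx_servers * ips_per_server * timeout_min * attempts


# One delivery pass: 2 MX servers * 6 IPs * 10 minute timeout.
per_pass = worst_case_minutes(2, 6, 10)

# All 15 attempts for the same mail (retry delays are idle time, not counted).
all_attempts = worst_case_minutes(2, 6, 10, attempts=15)

print(per_pass)           # 120 minutes, i.e. 2 hours per pass
print(all_attempts)       # 1800 minutes
print(all_attempts / 60)  # 30.0 hours of thread time
```

Both numbers are consistent: 2 hours is the cost of a single pass, while 1800 minutes is the cumulative thread time over 15 passes.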

Yes, 2 hours per attempt. 15 attempts is 30 hours of thread time for each mail destined for that domain. Yes, it is *only* 2 hours before the other mails are processed, but consider this scenario:

10 mails for the "bad" host and 1 for the good host, in this order.
James tries the first bad mail for 2 hours, then the second for 2 hours, then the third, and so on for 20 hours. Only then does it try the good one and deliver it (20 hours later!).

So, to increase the probability of delivering a mail to a bad host sooner, we delay the delivery to the good host.
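
The scenario above can be modelled as a FIFO queue drained by one worker. This is a minimal sketch under stated assumptions: a single delivery thread, 120 minutes of blocking per "bad" mail, and a negligible delivery time for the good mail. The names are hypothetical and do not come from James.

```python
def start_delays(queue, cost_minutes):
    """Minutes each mail waits before a single worker starts on it."""
    delays = []
    elapsed = 0
    for kind in queue:
        delays.append(elapsed)
        elapsed += cost_minutes[kind]
    return delays


# 10 mails for the "bad" host, then 1 for the good host, in this order.
queue = ["bad"] * 10 + ["good"]
cost = {"bad": 120, "good": 0}  # assumed per-mail blocking time in minutes

delays = start_delays(queue, cost)
good_wait_hours = delays[-1] / 60
print(good_wait_hours)  # 20.0
```

With more delivery threads the wait shrinks proportionally, but the good mail is still stuck behind whatever bad mails were queued ahead of it.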

If this is not working as described, it is a bug.  Not a design flaw.

It is working as described, and IMHO that is not a good thing. I would really prefer better defaults: test all MX records but only 1 IP per MX, use much lower timeouts (30 seconds on connect plus 3 minutes on the session), and run 5 remote delivery threads.

By the way, I also patched RemoteDelivery to make the maximum number of MX servers to test configurable; I use 2 as my default.

A few domains use a lot of different MX hosts, and this would bring us back to the "bad" scenario we described.

I also think that 15 attempts to deliver a single mail is too many as a default.

Maybe we should put 2 different RemoteDelivery configurations in our config: one "best effort" with good performance but lower reliability, and one with worse performance but better reliability.

1800 thread minutes for a single mail as worst case default is not
acceptable to me.

With a 3 minute timeout, the worst case for your scenario should be 36
minutes before we reschedule the e-mail for later delivery.

This would still be too much if you are sending 100000 mails, even if only 1% of those mails are for the "bad" host.
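
To put a rough number on that objection, here is a back-of-the-envelope sketch. It assumes every mail for the "bad" host costs the full 36 minutes on its first pass, and it borrows the 5 delivery threads from the defaults proposed earlier in this message. The helper name is hypothetical.

```python
def first_pass_thread_minutes(total_mails, bad_percent, minutes_per_bad_mail):
    """Thread-minutes spent on the first attempt of every 'bad' mail."""
    bad_mails = total_mails * bad_percent // 100
    return bad_mails * minutes_per_bad_mail


thread_minutes = first_pass_thread_minutes(100_000, 1, 36)
thread_hours = thread_minutes / 60
wall_clock_hours = thread_hours / 5  # assumed: 5 remote delivery threads

print(thread_minutes)    # 36000
print(thread_hours)      # 600.0
print(wall_clock_hours)  # 120.0
```

Under these assumptions the first pass over the bad mails alone keeps all 5 threads busy for about 5 days, before any retry is counted.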

Maybe we should change our accept order policy. We currently sort by last_updated; we could instead sort by state and then last_updated (taking all the messages not in ERROR first, and then the ones in ERROR). This way a first attempt would at least always get a greater priority than retries.
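
The proposed ordering can be sketched as a two-part sort key. This is an illustration only: `SpooledMail`, its fields, and the state strings are hypothetical stand-ins, not the actual spool repository code or schema.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass
class SpooledMail:
    # Hypothetical record; field names mirror the columns mentioned above.
    name: str
    state: str
    last_updated: datetime


def accept_order_key(mail):
    """Sort non-ERROR mails first, then oldest last_updated within each group."""
    return (mail.state == "error", mail.last_updated)


spool = [
    SpooledMail("retry-old", "error", datetime(2006, 1, 1, 9, 0)),
    SpooledMail("fresh-1", "root", datetime(2006, 1, 1, 10, 0)),
    SpooledMail("retry-new", "error", datetime(2006, 1, 1, 10, 30)),
    SpooledMail("fresh-2", "root", datetime(2006, 1, 1, 11, 0)),
]

ordered = sorted(spool, key=accept_order_key)
print([m.name for m in ordered])
# ['fresh-1', 'fresh-2', 'retry-old', 'retry-new']
```

Sorting by last_updated alone would put retry-old first. With the state in front of the key, both fresh mails are accepted before any retry, even though the oldest retry has the earliest timestamp.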

Stefano

