Sam Varshavchik wrote:

Aleksander Adamowski writes:

However, in my case, there's also a problem with LDAP dying (don't ask...) - I run it in an infinite loop, but it needs about 1.5 seconds to come back after a crash. The problem has been reported to the OpenLDAP team (http://www.openldap.org/its/index.cgi?findid=4390), but in my experience OpenLDAP has always had problems under high workloads.

Would it be difficult to modify this patch (and the one against authlib from the "authldap failing randomly" thread) so that Courier sleeps for a configured amount of time before retrying?


Not too difficult at all, but I think that falls into the "custom kludge needed for your specific situation only" category.

Possibly.

But let's discuss it, if you don't mind:

1) Courier supports various mechanisms for authentication and for retrieving account metadata: it can use standard UNIX mechanisms, LDAP, or SQL RDBMSes. The UNIX mechanisms are suited mainly to small-scale installations (a single server); the others are for mid-to-large installations.

2) When the mail system is scaled up to a mid- or large-scale install, it usually becomes more and more distributed: there's usually a central authentication database, replicated over 2 or more machines. The mail system becomes more complex, and the probability of temporary connection problems between its components goes up. Those temporary connection problems are often short-lived, but they are inevitable. Various causes are possible. My case with OpenLDAP crashing is among them, but it is not the only one.

One can imagine several types of those problems, e.g.:

* a change to the access permissions of the OpenLDAP directory requires a restart of slapd (silly, but true :( ), which causes 2-4 seconds of LDAP downtime

* during some recabling work in the server room, an Ethernet cable is plugged into a different port on a switch; the switch needs to learn the new port-to-MAC-address mapping; during this time the connection to the authentication server is interrupted for a couple of seconds; this results in a dropped TCP connection and a loss of connectivity for ~ 7 seconds

* an admin implements a new set of rules for the packet filter, but during the reloading phase that filter implementation causes all TCP connections to be interrupted and packets to be rejected (e.g. packet filters on some BSD systems do this)


It's true that every single problem listed above can be avoided by following proper procedures. But in the real world, with a sufficiently complex infrastructure, one cannot possibly avoid all of them. In each of these cases the mail system would be seen as more robust if it tolerated those connectivity problems, as long as they were short enough (within a configured timeout/number of retries).
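
To make the "configured timeout/number of retries" idea concrete, here is a minimal sketch in Python. It is not the Courier patch and not anything from authlib; the names lookup_with_retry, TemporaryAuthError, max_retries and retry_delay are all made up for illustration, and the 1.5 second default is just the recovery time mentioned above.

```python
import time


class TemporaryAuthError(Exception):
    """Hypothetical error: the authentication backend is unreachable right now."""


def lookup_with_retry(lookup, max_retries=3, retry_delay=1.5, sleep=time.sleep):
    """Call lookup(); on a temporary failure, sleep and try again.

    max_retries is the number of retries after the first attempt, and
    retry_delay is the pause (in seconds) before each retry. Both stand in
    for values an admin would set in a configuration file (an assumption).
    """
    attempt = 0
    while True:
        try:
            return lookup()
        except TemporaryAuthError:
            if attempt >= max_retries:
                # The outage lasted longer than we are willing to wait:
                # pass the temporary error on to the caller as before.
                raise
            attempt += 1
            sleep(retry_delay)
```

With these defaults a backend outage of up to roughly 4.5 seconds (3 retries, 1.5 seconds apart) would go unnoticed by the client; anything longer still ends in the same temporary error as today.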

Currently, for each such problem, courieresmtp generates temporary errors (4xx), and courierimap generates permanent errors. Both are quite disturbing to local users (those who have mailboxes on the server), since for an end user's mail client a temporary SMTP error is effectively permanent (an MUA usually doesn't have a mail queue like MTAs do).

Of course I may be wrong - it's just an opinion for you to consider.

--
Best Regards,
   Aleksander Adamowski
       GG#: 274614
ICQ UIN: 19780575 http://olo.ab.altkom.pl



