Looking further into mupdate this morning (see my earlier message today), I've discovered something else that looks like an issue.
It looks like a good deal of the calling code that knows about mupdate relies on basic logic like:

- do a local database lookup
- if the mailbox we're looking for doesn't exist, call kick_mupdate() to ensure that our local database is fresh, then do the local database lookup again.

It appears that kick_mupdate() was written to exploit the fact that mupdate will not answer slave requests (on the local unix socket) while a sync operation is in progress. I notice one fundamental problem with this, and one area for improvement.

The fundamental problem is that kick_mupdate() returns void. If a mupdate slave is not currently listening on its local AF_UNIX socket (for example, when it disconnects because it has lost contact with the mupdate master), kick_mupdate() will attempt to connect, fail, log an error and return. Most of the calling code I looked at assumes that if kick_mupdate() returns, the local database is current. In this case, that may or may not be true. At the very least, I think kick_mupdate() needs to return a status so that calling code can make a more informed decision about what to do.

The area for improvement is that kick_mupdate() never times out its read() from mupdate's local unix socket. Wouldn't it be a good idea to wrap this read() in an alarm() call and return failure if the alarm fires?

Before I start working on a patch, can someone who is familiar with this code let me know if I'm way off base or not?

Thanks,

Dave
--
Dave McMurtrie, SPE
Email Systems Team Leader
Carnegie Mellon University, Computing Services