Hi,

For a while now I've had stability problems with ASSP, generally one or two restarts a day. When I upgraded to 16270 I had huge problems: mail was delayed or did not get through at all, and ASSP kept shutting down, each time after thousands of "unable to detect any running worker" messages had been spewed into the log files. Yesterday I dropped back to 16256 and at least mail is flowing, but so far one ASSP instance has shut itself down 8 times today.
I have two instances doing this, both on Ubuntu 14.04. Last night I turned debugging on and caught the incident on both servers within half an hour. I've looked through the debug file but can see nothing that indicates an error. Both servers were handling mail from different senders in the few minutes leading up to the fault.

Today I looked back through previous threads on the same issue and saw Thomas ask what the worker status page showed when it happens. I was wondering how on earth I was going to catch it as it happens, before ASSP restarts, when lo and behold, while I was on the web interface I saw the errors flying past in a tail of the maillog. I went to the main page and the dot at the top had turned red. I then went to the worker status page and that was all green.

Up until now I have been running 10 workers, which is possibly overkill. I had just reduced this instance of ASSP to 5 workers as a test. The status of the workers was:

  1, 2, 3, 5 - ThreadGetNewCon with loop age 0s (worker 3 had 1s)
  4 - Maillog
  10000 - MonitorMainThread (0s)
  10001 - schedule waiting (71s)

I went back to the main page and the dot had gone back to green, but the maillog was still filling with the running worker errors. I refreshed the status page and the only changes were:

  4 - "wh:0 - write: - wait: 0.005"
  10001 - the time on schedule waiting went up to 96s

Shortly afterwards ASSP shut down. It is as if the main thread and the workers just stop talking to each other.

I'd love to crack this and give the latest development version a go, because right now I have the annoying issue of an SSL session taking so long that the sending server starts sending the message again (this is smtproutes.com, not gmail, in this case). I've also seen this from Mandrill and got in touch with their support. They explained that it is down to shared spools. One server behind the infrastructure picks up the message and starts delivering it. If the message is still in the spool 10 minutes later, another server picks it up and starts delivering it as well. Whichever completes first removes the file and the other servers terminate.

All the best,
Colin.
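
P.S. To make the shared-spool explanation a bit more concrete, here is a rough Python sketch of how I understand it. This is only my own toy model: the function name, the number of servers and the assumption that each server checks the spool exactly 10 minutes after the previous one are all made up for illustration, and it is not Mandrill's or smtproutes.com's actual code.

```python
# Toy model of the shared-spool behaviour described above.
# Every name and number here is a hypothetical illustration.

RETRY_AFTER = 600  # assumed: seconds before the next server looks at the spool


def simulate(delivery_seconds, servers=3):
    """Return (attempts_started, finish_time) for one spooled message.

    Server n looks at the spool at n * RETRY_AFTER seconds. If the
    message file is still there, it starts its own delivery attempt.
    Every attempt is assumed to take delivery_seconds, so the first
    attempt is the first to finish and it then removes the file.
    """
    start_times = []
    for n in range(servers):
        start = n * RETRY_AFTER
        # Once the first attempt has finished, the file is gone and
        # no further server picks the message up.
        if start_times and start >= start_times[0] + delivery_seconds:
            break
        start_times.append(start)
    finish_time = start_times[0] + delivery_seconds
    return len(start_times), finish_time


if __name__ == "__main__":
    for seconds in (30, 700, 1300):
        attempts, finished = simulate(seconds)
        print(f"delivery takes {seconds}s: {attempts} attempt(s), done at {finished}s")
```

In this model a delivery that finishes inside 10 minutes is only ever attempted once, while anything slower shows up on the receiving side as a second (or third) connection carrying the same message, which matches what I am seeing with the slow SSL sessions.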
------------------------------------------------------------------------------
_______________________________________________
Assp-test mailing list
Assp-test@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/assp-test