Re: [courier-users] Failed filter restarts when restarting courier and filter times out when stopping
>> Why is filterctl involved in service courier restart at all? As far as I can tell, courier's sysvinit script doesn't invoke filterctl and neither does courierfilter. What am I missing?
>
> Yes, that was a misnomer. filterctl enables or disables individual filters; and courierfilter is the one that starts and stops all enabled filters.

Yup; typo on my part. I was merely showing I tried the unreasonable things too (courierfilter start), and they failed too.

What I was seeing: failure for pythonfilter to stop during a ‘service courier restart’, specifically with the timeout alarm, would result in it failing to start on the way back up, which then meant spam and virus emails were getting accepted going forward, until something / someone manually corrected it. That’s a pretty bad failure scenario for us.

-J

___
courier-users mailing list
courier-users@lists.sourceforge.net
Unsubscribe: https://lists.sourceforge.net/lists/listinfo/courier-users
Re: [courier-users] Failed filter restarts when restarting courier and filter times out when stopping
On Sun 11/Jan/2015 22:36:59 +0100 Gordon Messmer wrote:
> On 01/09/2015 08:18 AM, Alessandro Vesely wrote:
>> To kill by pid is going to be difficult for forked filters. I issue a call kill(0, SIGTERM) when the pipe is closed, but I had previously called setsid().
>
> I'll note that the man page for filterctl says:
>
>    When its standard input is closed the mail filter should stop accepting new connections and wait for any existing connections to be closed, prior to exiting.

Issuing that signal after a short sleep is a kind of tense waiting. It should be enough to hurry up a process that seems to be frozen. SIGTERM is not enough to definitely kill a runaway filter, but issuing SIGKILL to its process group would kill the issuing process too. Since courierfilter keeps track of each filter's pid, it has a better opportunity to supervise naughty filters.

> I'm not sure, off the top of my head, why it would be a problem for a filter to continue waiting after the timeout.

One has to be able to tear down the MTA in a reasonable time. The need to run killall introduces superfluous coupling between those scripts and installed filters; for example, it may prevent users from running multiple Courier instances.

Ale
Re: [courier-users] Failed filter restarts when restarting courier and filter times out when stopping
On Sat 10/Jan/2015 00:08:04 +0100 Sam Varshavchik wrote:
> Alessandro Vesely writes:
>>> Currently, the shutdown code just gives up, after a timeout, in this manner. I do agree that an attempt should be made to kill all processes, after a reasonable timeout, so it's something that I need to look at.
>>
>> To kill by pid is going to be difficult for forked filters. I issue a call kill(0, SIGTERM) when the pipe is closed, but I had previously called setsid(). SIGTERM (15) should suffice to quickly exit; SIGKILL (9) is what I'd call overkill. It seems several users issue `killall -9 ...`. It shouldn't be needed, and I'd expect some kind of bug report if runaway children refuse to exit, please.
>
> Well, there are various ways to make sure that child processes get SIGKILLed. The traditional way to do this is with process groups.

Err... I call setsid() because it's what the daemon() function usually does, albeit filters run by courierfilter don't fork. Probably, setpgid() or setpgrp() are better choices. (Let me recall that a session consists of one or more process groups.)

> Probably the best thing to do would be to kill all the processes, but exit with a non-zero exit code.

You're talking about courierfilter behavior, right? Filters should kill all of their child processes or threads when fd0 is closed. However, if a filter fails to do so, it deserves a SIGKILL. Logging filtername is probably useful. If a filter forked children, it may be worth attempting to kill them too. For example, something like:

    pid_t p = pp->p;    /* pid of the filter, as recorded by courierfilter */
    if (getpgid(p) != getpgid(getpid()))
        p = -p;         /* filter has its own group: signal the whole group */
    kill(p, SIGKILL);

Alternatively, a new group can be created by courierfilter right after forking each new filter. That way, the if above is always true unless a filter maliciously sets its group id back to courierfilter's one.

> systemd takes this one step further, and puts all systemd-started processes in a container. On Fedora, a shutdown is going to kill any stuck filter processes, after filterctl stop bails out; so this is a non-issue on Fedora(*).

Hm... filterctl may be invoked by hand, no?

> (*) This is not intended to be an endorsement of systemd.

Thank goodness :-)

Ale
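The process-group approach discussed above can be sketched from the supervisor's side. This is only an illustration in Python, not courierfilter's code: `start_filter`, `stop_filter`, `demo` and the `grace` parameter are hypothetical names, and the five-second default is an arbitrary assumption. The child is started in its own session (so its pid doubles as its process group id), the stop request is the closing of its standard input, and the whole group gets SIGKILL only if the grace period runs out. The non-zero return value mirrors the suggestion to kill everything but still report failure.

```python
import os
import signal
import subprocess
import sys


def start_filter(argv):
    """Start a filter with a pipe on stdin, in its own session and group."""
    return subprocess.Popen(argv, stdin=subprocess.PIPE, start_new_session=True)


def stop_filter(proc, grace=5.0):
    """Close the filter's stdin; SIGKILL its process group after `grace` seconds.

    Returns 0 if the filter exited on its own, 1 if it had to be killed.
    """
    proc.stdin.close()                  # closing the pipe is the stop request
    try:
        proc.wait(timeout=grace)
        return 0
    except subprocess.TimeoutExpired:
        pass
    try:
        # The child leads its own group, so this matches kill(-pid, SIGKILL)
        # and also reaches any children the filter forked into that group.
        os.killpg(proc.pid, signal.SIGKILL)
    except ProcessLookupError:
        pass                            # the group vanished in the meantime
    proc.wait()                         # reap the killed filter
    return 1


def demo():
    """Run one cooperative and one stuck child; return both stop results."""
    polite = start_filter([sys.executable, "-c", "import sys; sys.stdin.read()"])
    stuck = start_filter([sys.executable, "-c", "import time; time.sleep(60)"])
    return stop_filter(polite), stop_filter(stuck, grace=0.5)
```

Because the supervisor itself stays outside the child's group, the SIGKILL cannot hit the process that issues it, which is the difficulty a filter faces when it signals its own group.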
Re: [courier-users] Failed filter restarts when restarting courier and filter times out when stopping
Gordon Messmer writes:
> On 01/09/2015 08:18 AM, Alessandro Vesely wrote:
>> To kill by pid is going to be difficult for forked filters. I issue a call kill(0, SIGTERM) when the pipe is closed, but I had previously called setsid().
>
> I'll note that the man page for filterctl says:
>
>    When its standard input is closed the mail filter should stop accepting new connections and wait for any existing connections to be closed, prior to exiting.
>
> I'm not sure, off the top of my head, why it would be a problem for a filter to continue waiting after the timeout.
>
> For that matter, I'm not sure I understand the original question. It said:
>
>> After start, pythonfilter is not started -- 'filterctl start pythonfilter' fails to bring it up with this:
>>
>> filterctl start pythonfilter
>> ln: creating symbolic link `/etc/courier/filters/active/pythonfilter' to `/usr/lib/courier/libexec/filters/pythonfilter': File exists
>
> Why is filterctl involved in service courier restart at all? As far as I can tell, courier's sysvinit script doesn't invoke filterctl and neither does courierfilter. What am I missing?

Yes, that was a misnomer. filterctl enables or disables individual filters; and courierfilter is the one that starts and stops all enabled filters.
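The "File exists" message quoted above is what you would expect if enabling a filter amounts to creating a symbolic link in the active directory, which is what the `ln` error suggests. The Python sketch below illustrates that enable/disable model under that assumption; `enable_filter` and `disable_filter` are hypothetical helpers, not part of Courier, and they do not start or stop any process.

```python
import os


def enable_filter(active_dir, name, target):
    """Link `target` into `active_dir`; return False if already enabled."""
    link = os.path.join(active_dir, name)
    try:
        os.symlink(target, link)
    except FileExistsError:
        return False        # same condition that makes ln report "File exists"
    return True


def disable_filter(active_dir, name):
    """Remove the link; return False if the filter was not enabled."""
    try:
        os.unlink(os.path.join(active_dir, name))
    except FileNotFoundError:
        return False
    return True
```

Seen this way, a second "start" of an already enabled filter fails simply because the link is still there, regardless of whether the filter process is actually running.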
Re: [courier-users] Failed filter restarts when restarting courier and filter times out when stopping
On 01/09/2015 08:18 AM, Alessandro Vesely wrote:
> To kill by pid is going to be difficult for forked filters. I issue a call kill(0, SIGTERM) when the pipe is closed, but I had previously called setsid().

I'll note that the man page for filterctl says:

   When its standard input is closed the mail filter should stop accepting new connections and wait for any existing connections to be closed, prior to exiting.

I'm not sure, off the top of my head, why it would be a problem for a filter to continue waiting after the timeout.

For that matter, I'm not sure I understand the original question. It said:

> After start, pythonfilter is not started -- 'filterctl start pythonfilter' fails to bring it up with this:
>
> filterctl start pythonfilter
> ln: creating symbolic link `/etc/courier/filters/active/pythonfilter' to `/usr/lib/courier/libexec/filters/pythonfilter': File exists

Why is filterctl involved in service courier restart at all? As far as I can tell, courier's sysvinit script doesn't invoke filterctl and neither does courierfilter. What am I missing?
Re: [courier-users] Failed filter restarts when restarting courier and filter times out when stopping
Thanks, Gordon! I don’t know how I missed that. (Well, I do, it was 14 years ago when I wrote this script. Sigh.)

-J

On Jan 9, 2015, at 6:18 PM, Gordon Messmer gordon.mess...@gmail.com wrote:
> On 01/09/2015 01:06 PM, Jeff Potter wrote:
>> We have to restart courier many times a day to pick up changes to files under the aliases dir and esmtpacceptmailfor
>
> You shouldn't have to restart anything for those two. The man page for courier states, "Unless otherwise specified, you must run courier restart for any changes to these files to take effect." Both man pages for makealiases and makeacceptmailfor state that changes take effect immediately.
>
> If you're changing one of the other files, one which does require restarting courier, you can use "courier restart" rather than "service courier restart" to restart only courierd. Restarting courierfilter and the other processes isn't necessary.
Re: [courier-users] Failed filter restarts when restarting courier and filter times out when stopping
Alessandro Vesely writes:
>> Currently, the shutdown code just gives up, after a timeout, in this manner. I do agree that an attempt should be made to kill all processes, after a reasonable timeout, so it's something that I need to look at.
>
> To kill by pid is going to be difficult for forked filters. I issue a call kill(0, SIGTERM) when the pipe is closed, but I had previously called setsid(). SIGTERM (15) should suffice to quickly exit; SIGKILL (9) is what I'd call overkill. It seems several users issue `killall -9 ...`. It shouldn't be needed, and I'd expect some kind of bug report if runaway children refuse to exit, please.

Well, there are various ways to make sure that child processes get SIGKILLed. The traditional way to do this is with process groups. Probably the best thing to do would be to kill all the processes, but exit with a non-zero exit code.

systemd takes this one step further, and puts all systemd-started processes in a container. On Fedora, a shutdown is going to kill any stuck filter processes, after filterctl stop bails out; so this is a non-issue on Fedora(*).

(*) This is not intended to be an endorsement of systemd.
Re: [courier-users] Failed filter restarts when restarting courier and filter times out when stopping
On 01/09/2015 01:06 PM, Jeff Potter wrote:
> We have to restart courier many times a day to pick up changes to files under the aliases dir and esmtpacceptmailfor

You shouldn't have to restart anything for those two. The man page for courier states, "Unless otherwise specified, you must run courier restart for any changes to these files to take effect." Both man pages for makealiases and makeacceptmailfor state that changes take effect immediately.

If you're changing one of the other files, one which does require restarting courier, you can use "courier restart" rather than "service courier restart" to restart only courierd. Restarting courierfilter and the other processes isn't necessary.
Re: [courier-users] Failed filter restarts when restarting courier and filter times out when stopping
On Thu 08/Jan/2015 23:51:56 +0100 Jeff Potter wrote:
> 4. After start, pythonfilter is not started -- 'filterctl start pythonfilter' fails to bring it up with this:
>
> filterctl start pythonfilter
> ln: creating symbolic link `/etc/courier/filters/active/pythonfilter' to `/usr/lib/courier/libexec/filters/pythonfilter': File exists

filterctl is used once to install a filter. Thereafter, it can be used again to uninstall it. You need to stop it before you can start it again. The difference vs. courierfilter is that the latter makes Courier bounce messages with "432 Mail filters temporarily unavailable".

On Fri 09/Jan/2015 01:25:16 +0100 Sam Varshavchik wrote:
> You're not really missing anything. It's really the filter's responsibility to wind down its business, once it gets the signal to do so.

For my part, I just fixed a race condition bug. It was the opposite of what Jeff complains about; that is, sometimes the filter terminated without being asked to, thus forcing Courier into 432-mode.

http://www.tana.it/sw/avfilter/
http://www.tana.it/sw/zdkimfilter/

> Currently, the shutdown code just gives up, after a timeout, in this manner. I do agree that an attempt should be made to kill all processes, after a reasonable timeout, so it's something that I need to look at.

To kill by pid is going to be difficult for forked filters. I issue a call kill(0, SIGTERM) when the pipe is closed, but I had previously called setsid(). SIGTERM (15) should suffice to quickly exit; SIGKILL (9) is what I'd call overkill. It seems several users issue `killall -9 ...`. It shouldn't be needed, and I'd expect some kind of bug report if runaway children refuse to exit, please.

I suggest amending courierfilter's man page, where it says:

   All mail filters also inherit a pipe on standard input, and must terminate when the pipe is closed. Mail filters must simultaneously listen for new connections on the mail filter socket, and for their standard input to close.

It could say something more pressing, for example like so:

   All mail filters also inherit a pipe on standard input, and must terminate when the pipe is closed, possibly aborting the current message --Courier will issue a temporary 4xx bounce in that case. Mail filters must simultaneously listen for new connections on the mail filter socket, and for their standard input to close.

Ale
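The man page requirement quoted above, listening on the filter socket and on standard input at the same time, can be sketched with a selector loop. This is a minimal Python illustration, not code from pythonfilter or Courier: `make_listener`, `serve_until_stdin_closes` and the `handle` callback are hypothetical names, and the use of a Unix-domain stream socket is an assumption. Connections are handled one at a time, so none is left open when the function returns.

```python
import os
import selectors
import socket


def make_listener(path):
    """Create a listening Unix-domain stream socket at `path`."""
    listener = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
    listener.bind(path)
    listener.listen(8)
    return listener


def serve_until_stdin_closes(listener, stdin_fd, handle):
    """Accept connections on `listener` until `stdin_fd` reports end of file.

    Returns the number of connections handled.
    """
    sel = selectors.DefaultSelector()
    sel.register(listener, selectors.EVENT_READ, "socket")
    sel.register(stdin_fd, selectors.EVENT_READ, "stdin")
    handled = 0
    try:
        while True:
            for key, _events in sel.select():
                if key.data == "stdin":
                    if os.read(stdin_fd, 4096) == b"":
                        return handled      # pipe closed: stop accepting
                else:
                    conn, _addr = listener.accept()
                    with conn:
                        handle(conn)        # process one message synchronously
                    handled += 1
    finally:
        sel.close()
```

A real filter would pass file descriptor 0 as `stdin_fd`. A filter that handles messages in worker threads or child processes would additionally have to wait for, or abort, the work in flight before exiting, which is the case the proposed man page wording addresses.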
Re: [courier-users] Failed filter restarts when restarting courier and filter times out when stopping
Jeff Potter writes:
> daemon 32271 0.0 0.0   3832  328 ? S 17:35 0:00 /usr/lib/courier/sbin/courierfilter start
> daemon 32273 0.0 0.0   3800  512 ? S 17:35 0:00 /usr/sbin/courierlogger courierfilter
> daemon 32274 0.0 0.0 183444 9188 ? S 17:35 0:00 /usr/bin/python /etc/courier/filters/active/pythonfilter
>
> I have to kill -9 the courierfilter process, and then restart courier and stuff flushes out fine. What am I missing? I would think that “service courier stop” should definitely nuke any process that it started up, but I know there’s a boundary between filters and courier.

You're not really missing anything. It's really the filter's responsibility to wind down its business, once it gets the signal to do so.

Currently, the shutdown code just gives up, after a timeout, in this manner. I do agree that an attempt should be made to kill all processes, after a reasonable timeout, so it's something that I need to look at.