Re: [courier-users] Failed filter restarts when restarting courier and filter times out when stopping

2015-01-12 Thread Jeff Potter
>> Why is filterctl involved in service courier restart at all?  As far
>> as I can tell, courier's sysvinit script doesn't invoke filterctl and
>> neither does courierfilter.  What am I missing?
>
> Yes, that was a misnomer. filterctl enables or disables individual filters;
> and courierfilter is the one that starts and stops all enabled filters.

Yup; typo on my part. I was merely showing I tried the unreasonable things too 
(courierfilter start), and they failed too.

What I was seeing: when pythonfilter failed to stop during a ‘service courier
restart’ (specifically, when the timeout alarm fired), it then failed to start
on the way back up. From then on, spam and virus emails were accepted until
something or someone manually corrected it. That’s a pretty bad failure
scenario for us.

-J
--
New Year. New Location. New Benefits. New Data Center in Ashburn, VA.
GigeNET is offering a free month of service with a new server in Ashburn.
Choose from 2 high performing configs, both with 100TB of bandwidth.
Higher redundancy.Lower latency.Increased capacity.Completely compliant.
vanity: www.gigenet.com
___
courier-users mailing list
courier-users@lists.sourceforge.net
Unsubscribe: https://lists.sourceforge.net/lists/listinfo/courier-users


Re: [courier-users] Failed filter restarts when restarting courier and filter times out when stopping

2015-01-12 Thread Alessandro Vesely
On Sun 11/Jan/2015 22:36:59 +0100 Gordon Messmer wrote:
> On 01/09/2015 08:18 AM, Alessandro Vesely wrote:
>> To kill by pid is going to be difficult for forked filters.  I issue a call
>> kill(0, SIGTERM) when the pipe is closed, but I had previously called
>> setsid().
>
> I'll note that the man page for filterctl says:
> "When its standard input is closed the mail filter should stop accepting
> new connections and wait for any existing connections to be closed,
> prior to exiting."

Issuing that signal after a short sleep is a sort of tense waiting.  It
should be enough to hurry up a process that seems to be frozen.

SIGTERM is not enough to definitely kill a runaway filter, but issuing SIGKILL
to its process group would kill the issuing process too.  Since courierfilter
keeps track of each filter's pid, it has a better opportunity to supervise
naughty filters.
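
A minimal sketch of that kind of supervision, assuming the supervisor is the
filter's direct parent and has recorded its pid: send SIGTERM, allow a short
grace period, and only then fall back to SIGKILL. The helper name stop_child
and the 10 ms polling interval are illustrative assumptions; this is not
courierfilter's actual code.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <time.h>
#include <unistd.h>

/*
 * Hypothetical helper: ask the direct child `pid` to exit with SIGTERM,
 * poll for up to `grace_ms` milliseconds, then fall back to SIGKILL.
 * Returns the wait status of the reaped child, or -1 on error.
 */
static int stop_child(pid_t pid, int grace_ms)
{
    int status;
    struct timespec tick = { 0, 10 * 1000 * 1000 };   /* 10 ms */

    assert(pid > 0 && grace_ms >= 0);

    if (kill(pid, SIGTERM) < 0 && errno != ESRCH)
        return -1;

    for (int waited = 0; waited < grace_ms; waited += 10) {
        pid_t r = waitpid(pid, &status, WNOHANG);
        if (r == pid)
            return status;      /* child exited within the grace period */
        if (r < 0)
            return -1;
        nanosleep(&tick, NULL);
    }

    /* Grace period is over: SIGKILL cannot be caught or ignored. */
    if (kill(pid, SIGKILL) < 0 && errno != ESRCH)
        return -1;
    if (waitpid(pid, &status, 0) != pid)
        return -1;
    return status;
}
```

The caller can tell from the returned status (WIFSIGNALED, WTERMSIG) whether
the filter went away on SIGTERM or had to be killed, which is the point where
logging the filter name would be useful.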

> I'm not sure, off the top of my head, why it would be a problem for a
> filter to continue waiting after the timeout.

One has to be able to tear down the MTA in a reasonable time.  The need to run
killall introduces superfluous coupling between those scripts and installed
filters; for example, it may prevent users from running multiple Courier 
instances.

Ale


Re: [courier-users] Failed filter restarts when restarting courier and filter times out when stopping

2015-01-11 Thread Alessandro Vesely
On Sat 10/Jan/2015 00:08:04 +0100 Sam Varshavchik wrote:
> Alessandro Vesely writes:
>
>>> Currently, the shutdown code just gives up, after a timeout, in this
>>> manner. I do agree that an attempt should be made to kill all processes,
>>> after a reasonable timeout, so it's something that I need to look at.
>>
>> To kill by pid is going to be difficult for forked filters.  I issue a call
>> kill(0, SIGTERM) when the pipe is closed, but I had previously called
>> setsid().
>> SIGTERM (15) should suffice to quickly exit; SIGKILL (9) is what I'd call
>> overkill.  It seems several users issue `killall -9 ...`.  It shouldn't be
>> needed, and I'd expect some kind of bug report if runaway children refuse to
>> exit, please.
>
> Well, there are various ways to make sure that child processes get SIGKILLed.
> The traditional way to do this is with process groups.

Err... I call setsid() because it's what the daemon() function usually does,
although filters run by courierfilter don't fork.  Probably, setpgid() or
setpgrp() are better choices.  (Let me recall that a session consists of one or
more process groups.)

> Probably the best thing to do would be to kill all the processes, but exit
> with a non-zero exit code.

You're talking about courierfilter behavior, right?  Filters should kill all of
their child processes or threads when fd0 is closed.  However, if a filter
fails to do so, it deserves a SIGKILL.  Logging filtername is probably useful.

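If a filter forked children, it may be worth attempting to kill them too.  For
example, something like:

   pid_t p = pp->p;                       /* pid recorded for this filter */
   if (getpgid(p) != getpgid(getpid()))   /* filter left courierfilter's group */
      p = -p;                             /* negative pid: the group it leads */
   kill(p, SIGKILL);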

Alternatively, a new group can be created by courierfilter right after forking
each new filter.  That way, the if above is always true unless a filter
maliciously sets its group id back to courierfilter's one.
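
The alternative in the paragraph above can be sketched as follows. This is a
hypothetical illustration under stated assumptions, not courierfilter's code:
spawn_in_own_group, kill_group_and_reap and demo_filter are made-up names, and
demo_filter merely stands in for a filter that forks one worker.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Stand-in for a filter that forks one worker; both just sleep. */
static void demo_filter(void)
{
    if (fork() == 0)
        for (;;) pause();       /* worker inherits the filter's group */
    for (;;) pause();
}

/*
 * Fork a child, make it the leader of a new process group, and run
 * `child_main` in it.  Returns the child's pid, which is also the new
 * process group id, or -1 if fork() fails.
 */
static pid_t spawn_in_own_group(void (*child_main)(void))
{
    assert(child_main != NULL);

    pid_t pid = fork();
    if (pid < 0)
        return -1;

    if (pid == 0) {
        setpgid(0, 0);          /* child: new group, pgid == own pid */
        child_main();
        _exit(0);
    }

    /* The parent sets it too, so the group exists whichever side runs first. */
    setpgid(pid, pid);
    return pid;
}

/*
 * SIGKILL every process still in the group led by `leader`, then reap the
 * leader.  Returns the leader's wait status, or -1 on error.
 */
static int kill_group_and_reap(pid_t leader)
{
    int status;

    assert(leader > 1);
    if (kill(-leader, SIGKILL) < 0)     /* negative pid: whole group */
        return -1;
    if (waitpid(leader, &status, 0) != leader)
        return -1;
    return status;
}
```

Because the group id equals the pid already being tracked, no extra
bookkeeping is needed; a worker that calls setsid() or setpgid() itself would
escape the group, which is the caveat noted above.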

> systemd takes this one step further, and puts all systemd-started processes in
> a container. On Fedora, a shutdown is going to kill any stuck filter
> processes, after filterctl stop bails out; so this is a non-issue on
> Fedora(*).

Hm...  filterctl may be invoked by hand, no?

> (*) This is not intended to be an endorsement of systemd.

Thank goodness :-)

Ale


Re: [courier-users] Failed filter restarts when restarting courier and filter times out when stopping

2015-01-11 Thread Sam Varshavchik

Gordon Messmer writes:

> On 01/09/2015 08:18 AM, Alessandro Vesely wrote:
>> To kill by pid is going to be difficult for forked filters.  I issue a call
>> kill(0, SIGTERM) when the pipe is closed, but I had previously called
>> setsid().
>
> I'll note that the man page for filterctl says:
> "When its standard input is closed the mail filter should stop accepting
> new connections and wait for any existing connections to be closed,
> prior to exiting."
>
> I'm not sure, off the top of my head, why it would be a problem for a
> filter to continue waiting after the timeout.
>
> For that matter, I'm not sure I understand the original question.  It said:
>> After start, pythonfilter is not started; 'filterctl start
>> pythonfilter' fails to bring it up with this:
>>    filterctl start pythonfilter
>>    ln: creating symbolic link `/etc/courier/filters/active/pythonfilter'
>>    to `/usr/lib/courier/libexec/filters/pythonfilter': File exists
>
> Why is filterctl involved in service courier restart at all?  As far
> as I can tell, courier's sysvinit script doesn't invoke filterctl and
> neither does courierfilter.  What am I missing?


Yes, that was a misnomer. filterctl enables or disables individual filters;  
and courierfilter is the one that starts and stops all enabled filters.






Re: [courier-users] Failed filter restarts when restarting courier and filter times out when stopping

2015-01-11 Thread Gordon Messmer
On 01/09/2015 08:18 AM, Alessandro Vesely wrote:
> To kill by pid is going to be difficult for forked filters.  I issue a call
> kill(0, SIGTERM) when the pipe is closed, but I had previously called
> setsid().

I'll note that the man page for filterctl says:
"When its standard input is closed the mail filter should stop accepting
new connections and wait for any existing connections to be closed,
prior to exiting."

I'm not sure, off the top of my head, why it would be a problem for a
filter to continue waiting after the timeout.

For that matter, I'm not sure I understand the original question.  It said:
> After start, pythonfilter is not started; 'filterctl start
> pythonfilter' fails to bring it up with this:
>    filterctl start pythonfilter
>    ln: creating symbolic link `/etc/courier/filters/active/pythonfilter'
>    to `/usr/lib/courier/libexec/filters/pythonfilter': File exists

Why is filterctl involved in service courier restart at all?  As far 
as I can tell, courier's sysvinit script doesn't invoke filterctl and 
neither does courierfilter.  What am I missing?




Re: [courier-users] Failed filter restarts when restarting courier and filter times out when stopping

2015-01-10 Thread Jeff Potter

Thanks, Gordon! I don’t know how I missed that. (Well, I do, it was 14 years 
ago when I wrote this script. Sigh.)

-J

> On Jan 9, 2015, at 6:18 PM, Gordon Messmer gordon.mess...@gmail.com wrote:
>
> On 01/09/2015 01:06 PM, Jeff Potter wrote:
>> We have to restart courier many times a day to pick up changes to
>> files under the aliases dir and esmtpacceptmailfor
>
> You shouldn't have to restart anything for those two.
>
> The man page for courier states, "Unless otherwise specified, you must
> run courier restart for any changes to these files to take effect."
> Both man pages for makealiases and makeacceptmailfor state that changes
> take effect immediately.
>
> If you're changing one of the other files, one which does require
> restarting courier, you can use "courier restart" rather than "service
> courier restart" to restart only courierd.  Restarting courierfilter and
> the other processes isn't necessary.
 


Re: [courier-users] Failed filter restarts when restarting courier and filter times out when stopping

2015-01-09 Thread Sam Varshavchik

Alessandro Vesely writes:

>> Currently, the shutdown code just gives up, after a timeout, in this
>> manner. I do agree that an attempt should be made to kill all processes,
>> after a reasonable timeout, so it's something that I need to look at.
>
> To kill by pid is going to be difficult for forked filters.  I issue a call
> kill(0, SIGTERM) when the pipe is closed, but I had previously called
> setsid().
>
> SIGTERM (15) should suffice to quickly exit; SIGKILL (9) is what I'd call
> overkill.  It seems several users issue `killall -9 ...`.  It shouldn't be
> needed, and I'd expect some kind of bug report if runaway children refuse to
> exit, please.


Well, there are various ways to make sure that child processes get  
SIGKILLed. The traditional way to do this is with process groups.


Probably the best thing to do would be to kill all the processes, but exit  
with a non-zero exit code.


systemd takes this one step further, and puts all systemd-started processes  
in a container. On Fedora, a shutdown is going to kill any stuck filter  
processes, after filterctl stop bails out; so this is a non-issue on  
Fedora(*).



(*) This is not intended to be an endorsement of systemd.







Re: [courier-users] Failed filter restarts when restarting courier and filter times out when stopping

2015-01-09 Thread Gordon Messmer
On 01/09/2015 01:06 PM, Jeff Potter wrote:
> We have to restart courier many times a day to pick up changes to
> files under the aliases dir and esmtpacceptmailfor

You shouldn't have to restart anything for those two.

The man page for courier states, "Unless otherwise specified, you must
run courier restart for any changes to these files to take effect."
Both man pages for makealiases and makeacceptmailfor state that changes
take effect immediately.

If you're changing one of the other files, one which does require
restarting courier, you can use "courier restart" rather than "service
courier restart" to restart only courierd.  Restarting courierfilter and
the other processes isn't necessary.



Re: [courier-users] Failed filter restarts when restarting courier and filter times out when stopping

2015-01-09 Thread Alessandro Vesely
On Thu 08/Jan/2015 23:51:56 +0100 Jeff Potter wrote:
> 4. After start, pythonfilter is not started; 'filterctl start pythonfilter'
> fails to bring it up with this:
>    filterctl start pythonfilter
>    ln: creating symbolic link `/etc/courier/filters/active/pythonfilter'
>    to `/usr/lib/courier/libexec/filters/pythonfilter': File exists

filterctl is used once to install a filter.  Thereafter, it can be used again
to uninstall it.  You need to stop it before you can start it again.  The
difference vs. courierfilter is that the latter makes Courier bounce messages
with "432 Mail filters temporarily unavailable".

On Fri 09/Jan/2015 01:25:16 +0100 Sam Varshavchik wrote:
> You're not really missing anything. It's really the filter's responsibility to
> wind down its business, once it gets the signal to do so.

For my part, I just fixed a race condition bug.  It was the opposite of what
Jeff complains about; that is, sometimes the filter terminated without being
asked to, thus forcing Courier into 432-mode.

http://www.tana.it/sw/avfilter/
http://www.tana.it/sw/zdkimfilter/

> Currently, the shutdown code just gives up, after a timeout, in this manner. I
> do agree that an attempt should be made to kill all processes, after a
> reasonable timeout, so it's something that I need to look at.

To kill by pid is going to be difficult for forked filters.  I issue a call
kill(0, SIGTERM) when the pipe is closed, but I had previously called setsid().
SIGTERM (15) should suffice to quickly exit; SIGKILL (9) is what I'd call
overkill.  It seems several users issue `killall -9 ...`.  It shouldn't be
needed, and I'd expect some kind of bug report if runaway children refuse to
exit, please.

I suggest amending courierfilter's man page, where it says:

   All mail filters also inherit a pipe on standard input, and must terminate
   when the pipe is closed.  Mail filters must simultaneously listen for new
   connections on the mail filter socket, and for their standard input to
   close.

It could say something more pressing, for example like so:

   All mail filters also inherit a pipe on standard input, and must terminate
   when the pipe is closed, possibly aborting the current message -- Courier
   will issue a temporary 4xx bounce in that case.  Mail filters must
   simultaneously listen for new connections on the mail filter socket, and
   for their standard input to close.
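
One way a filter might meet the "simultaneously listen" requirement is to
poll() both descriptors at once. The sketch below is only an illustration
under assumptions: wait_for_event and the filter_event values are
hypothetical names, ctl_fd would be standard input in a real filter, and
listen_fd its already-listening filter socket.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <poll.h>
#include <unistd.h>

enum filter_event { EV_ERROR = -1, EV_SHUTDOWN = 0, EV_CONNECTION = 1 };

/*
 * Hypothetical helper: block until either the supervisor closes the pipe
 * on `ctl_fd` or a client is waiting on `listen_fd`.  Shutdown takes
 * priority when both are ready at once.
 */
static enum filter_event wait_for_event(int ctl_fd, int listen_fd)
{
    struct pollfd fds[2] = {
        { .fd = ctl_fd,    .events = POLLIN },
        { .fd = listen_fd, .events = POLLIN },
    };

    assert(ctl_fd >= 0 && listen_fd >= 0);

    for (;;) {
        if (poll(fds, 2, -1) < 0) {
            if (errno == EINTR)
                continue;               /* interrupted by a signal: retry */
            return EV_ERROR;
        }
        if (fds[0].revents & (POLLERR | POLLNVAL))
            return EV_ERROR;
        if (fds[0].revents & (POLLIN | POLLHUP)) {
            char byte;
            ssize_t n = read(ctl_fd, &byte, 1);
            if (n == 0)
                return EV_SHUTDOWN;     /* EOF: the write end was closed */
            if (n < 0 && errno != EINTR)
                return EV_ERROR;
            continue;                   /* stray byte on the pipe: ignore */
        }
        if (fds[1].revents & POLLIN)
            return EV_CONNECTION;       /* caller should accept() now */
        if (fds[1].revents & (POLLERR | POLLHUP | POLLNVAL))
            return EV_ERROR;
    }
}
```

A filter's main loop would call this before each accept() and begin winding
down as soon as it returns EV_SHUTDOWN; whether it then aborts the message in
progress or finishes it first is the policy question discussed above.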

Ale


Re: [courier-users] Failed filter restarts when restarting courier and filter times out when stopping

2015-01-08 Thread Sam Varshavchik

Jeff Potter writes:

> daemon   32271  0.0  0.0   3832   328 ?        S    17:35   0:00 /usr/lib/courier/sbin/courierfilter start
> daemon   32273  0.0  0.0   3800   512 ?        S    17:35   0:00 /usr/sbin/courierlogger courierfilter
> daemon   32274  0.0  0.0 183444  9188 ?        S    17:35   0:00 /usr/bin/python /etc/courier/filters/active/pythonfilter
>
> I have to kill -9 the courierfilter process, and then restart courier and
> stuff flushes out fine.
>
> What am I missing? I would think that “service courier stop” should
> definitely nuke any process that it started up, but I know there’s a
> boundary between filters and courier.


You're not really missing anything. It's really the filter's responsibility  
to wind down its business, once it gets the signal to do so.


Currently, the shutdown code just gives up, after a timeout, in this manner.  
I do agree that an attempt should be made to kill all processes, after a  
reasonable timeout, so it's something that I need to look at.



