Michael Carmack writes:


Now, try attaching strace to the root perlfilter after you start everything, but before you run courierfilter stop.

Therein lies the problem: Whenever I strace it, whether it's before or after the courierfilter stop, it never fails to stop. So I cannot catch anything misbehaving via strace. I have an strace output from perlfilter (http://www.karmak.org/2007/strace.txt), but since it died properly in that instance, I'm not sure there's going to be anything useful in there.

However, I just found out something very important: The problem is *not* a 64-bit build -- it appears to be SMP, and apparently it's been there for quite some time, at least back to Courier 0.53.2.

All this time I've been assuming the problem had something to do with this new 64-bit server, since all of my other servers are 32-bit. But this morning I went and ran some tests on a couple of older machines, and without fail the 32-bit SMP machine exhibited the same problem. I just never noticed it before because the only time courier is ever shut down is when the entire server is going down.

Here are the specs and results of all the machines I tested on:

Server #1:
   Single-core, 32-bit
   Kernel 2.6.12.6
   Courier 0.53.2
   ** No problem stopping courierfilter **

Server #2:
   Dual-core (SMP), 32-bit (identical to #1 except dual-core proc)
   Kernel 2.6.16.27
   Courier 0.53.2
   ** YES -- problem stopping courierfilter **

Server #3:
   Single-core, 32-bit
   Kernel 2.6.15.7
   Courier 0.53.2
   ** No problem stopping courierfilter **

Server #4:
   Quad-core (SMP), 64-bit
   Kernel 2.6.20.7
   Courier 0.55.1
   ** YES -- problem stopping courierfilter **


Now, between Server #1 and Server #2, the kernel version was bumped up from 2.6.12 to 2.6.16, but aside from that all the software they run is identical. So the signs are pointing to the problem being with SMP.

Sorry for the misleading diagnosis in the beginning. Does this new information shed light on the problem?

Well, the problem is fairly clear; what's not clear is why it's happening.

Let's try replacing a few lines of simple, clear code with a bunch of complicated stuff that does the same thing, but in a roundabout manner and see what happens. Apply the following patch to see if it makes any difference.


Index: courier/filters/perlfilter/perlfilter.c
===================================================================
RCS file: /cvsroot/courier/courier/courier/courier/filters/perlfilter/perlfilter.c,v
retrieving revision 1.6
diff -U3 -r1.6 perlfilter.c
--- courier/filters/perlfilter/perlfilter.c     21 Feb 2002 00:37:01 -0000      1.6
+++ courier/filters/perlfilter/perlfilter.c     28 Apr 2007 17:09:24 -0000
@@ -9,6 +9,7 @@
 #include       <unistd.h>
 #endif
 #include       <sys/types.h>
+#include       <sys/time.h>
 #include       <sys/stat.h>
 #include       "libfilter/libfilter.h"
 #include       "filtersocketdir.h"
@@ -207,10 +208,30 @@
        signal(SIGCHLD, reap_children);
        lf_init_completed(listen_sock);
 
-       while (read(0, &buffer, 1) != 0)
+
+
+       for (;;)
        {
-               ;
+               fd_set fd0;
+               FD_ZERO(&fd0);
+               FD_SET(0, &fd0);
+
+               if (select(1, &fd0, 0, 0, 0) < 0)
+               {
+                       perror("select");
+                       sleep(5);
+                       continue;
+               }
+
+               if (FD_ISSET(0, &fd0))
+               {
+                       char    buf[16];
+
+                       if (read(0, buf, sizeof(buf)) <= 0)
+                               break;
+               }
        }
+
        wait_restore();
 
        /* Wait for all child processes to terminate */


_______________________________________________
courier-users mailing list
[email protected]
Unsubscribe: https://lists.sourceforge.net/lists/listinfo/courier-users