Hi.

I am getting some issues at incoming smtp connections for some weeks.

esmtpd MAXDAEMONS is set to 350 and I at peak times I used to readh at max 150 
simultaneous courieresmtpd process running.
But now I am reaching the top limit many times a day, then my servers stop 
acceping new connections.

After collecting some data on this problem it seems is related to the raising 
number of infected machines running smtp Zombies. This Zombies keeps trying to 
send spam, and, even if I deny all try, they never close the smtp connection. 
Courieresmtpd process keeps running for hours. At some poing submit process 
kills himself at the hard 1800 seconds alarm timeout. But even with the submit 
at defunct state courieresmtpd process keeps running.

My only way to keep the low number of courieresmtpd process is to block the 
originating IP at firewall or at a smtpaccess DENY at couriertcpd. If the send 
is blocked at some other point (BLACKLIST, BOTH, SPF, smtpaccess BLOCK, RELAY 
DENY or USER UNKNOWN) the submit process had already started and the 
courieresmtpd process stay locked for hours. I am loosing this game. I already 
blocked more 500 /24 nets but the number of locked connections keeps raising 
until I restart esmtpd.

Zombies is what I suspect, but seems smtp incomming connections are getting 
stucked for other reasons too.
Other strange issue that start happening is lots of "writev: Broken pipe" 
errors. May be its related to the Zombies, but I have seen this error happening 
on normal smtp connections too.

Is this happening to any other?

Anyone have an idea of how to get this problem fixed. Seems I need some 
configuration on courieresmtpd process to a faster timeout, or a limit of 
errors each smtp connection can have, or a time limit where a smtp conection 
shouls start the DATA transfer.

Below is a lot of debugging information I got. Sorry for the long post.

Marcus Pereira


***************************
Showing lots of submit process in defunct state:
# ps f -o pid,ppid,etime,cmd -C courieresmtpd,submit | grep -B1 defunct
18207 12590       00:09 /usr/sbin/courieresmtpd
18587 18207       00:01  \_ [submit] <defunct>
--
 8586 12590       06:43 /usr/sbin/courieresmtpd
17753  8586       00:23  \_ [submit] <defunct>
--
32301 12590       30:38 /usr/sbin/courieresmtpd
32324 32301       30:38  \_ [submit] <defunct>
--
17572 12590       39:25 /usr/sbin/courieresmtpd
17700 17572       39:21  \_ [submit] <defunct>
17383 12590       39:32 /usr/sbin/courieresmtpd
17397 17383       39:31  \_ [submit] <defunct>
15410 12590       40:29 /usr/sbin/courieresmtpd
15497 15410       40:26  \_ [submit] <defunct>
13700 12590       41:26 /usr/sbin/courieresmtpd
13746 13700       41:24  \_ [submit] <defunct>
13438 12590       41:33 /usr/sbin/courieresmtpd
13449 13438       41:33  \_ [submit] <defunct>
10004 12590       43:27 /usr/sbin/courieresmtpd
10013 10004       43:27  \_ [submit] <defunct>
 8705 12590       44:17 /usr/sbin/courieresmtpd
 8887  8705       44:11  \_ [submit] <defunct>
 7493 12590       45:10 /usr/sbin/courieresmtpd
 7716  7493       45:00  \_ [submit] <defunct>
 7172 12590       45:27 /usr/sbin/courieresmtpd
 7195  7172       45:26  \_ [submit] <defunct>
 6222 12590       46:02 /usr/sbin/courieresmtpd
 6304  6222       45:58  \_ [submit] <defunct>
 5028 12590       46:51 /usr/sbin/courieresmtpd
 5032  5028       46:51  \_ [submit] <defunct>
 5025 12590       46:51 /usr/sbin/courieresmtpd
 5031  5025       46:51  \_ [submit] <defunct>
 4328 12590       47:15 /usr/sbin/courieresmtpd
 4355  4328       47:14  \_ [submit] <defunct>
 2213 12590       48:42 /usr/sbin/courieresmtpd
 2269  2213       48:40  \_ [submit] <defunct>
 2045 12590       48:46 /usr/sbin/courieresmtpd
 2178  2045       48:44  \_ [submit] <defunct>
 1804 12590       48:55 /usr/sbin/courieresmtpd
 1856  1804       48:53  \_ [submit] <defunct>
 1726 12590       48:59 /usr/sbin/courieresmtpd
 1740  1726       48:58  \_ [submit] <defunct>
  980 12590       49:22 /usr/sbin/courieresmtpd
 2634   980       48:25  \_ [submit] <defunct>
  581 12590       49:34 /usr/sbin/courieresmtpd
  593   581       49:33  \_ [submit] <defunct>
31665 12590       50:29 /usr/sbin/courieresmtpd
31725 31665       50:28  \_ [submit] <defunct>
29477 12590       51:47 /usr/sbin/courieresmtpd
16372 29477       01:04  \_ [submit] <defunct>
28991 12590       52:06 /usr/sbin/courieresmtpd
29053 28991       52:05  \_ [submit] <defunct>
28727 12590       52:18 /usr/sbin/courieresmtpd
28740 28727       52:17  \_ [submit] <defunct>
28483 12590       52:29 /usr/sbin/courieresmtpd
28518 28483       52:27  \_ [submit] <defunct>
28121 12590       52:44 /usr/sbin/courieresmtpd
31412 28121       50:41  \_ [submit] <defunct>
27343 12590       53:11 /usr/sbin/courieresmtpd
27397 27343       53:09  \_ [submit] <defunct>
27338 12590       53:11 /usr/sbin/courieresmtpd
27347 27338       53:11  \_ [submit] <defunct>
25839 12590       54:14 /usr/sbin/courieresmtpd
25880 25839       54:13  \_ [submit] <defunct>
24281 12590       55:13 /usr/sbin/courieresmtpd
24392 24281       55:11  \_ [submit] <defunct>
23157 12590       55:54 /usr/sbin/courieresmtpd
23210 23157       55:52  \_ [submit] <defunct>
21497 12590       56:58 /usr/sbin/courieresmtpd
21502 21497       56:57  \_ [submit] <defunct>
20528 12590       57:40 /usr/sbin/courieresmtpd
20564 20528       57:38  \_ [submit] <defunct>
20255 12590       57:50 /usr/sbin/courieresmtpd
20256 20255       57:50  \_ [submit] <defunct>
19662 12590       58:16 /usr/sbin/courieresmtpd
19685 19662       58:15  \_ [submit] <defunct>
18499 12590       58:56 /usr/sbin/courieresmtpd
18562 18499       58:55  \_ [submit] <defunct>
16709 12590    01:00:03 /usr/sbin/courieresmtpd
16770 16709    01:00:01  \_ [submit] <defunct>
16433 12590    01:00:16 /usr/sbin/courieresmtpd
16447 16433    01:00:16  \_ [submit] <defunct>
15893 12590    01:00:45 /usr/sbin/courieresmtpd
15927 15893    01:00:44  \_ [submit] <defunct>
15785 12590    01:00:49 /usr/sbin/courieresmtpd
15845 15785    01:00:47  \_ [submit] <defunct>
15079 12590    01:01:11 /usr/sbin/courieresmtpd
15150 15079    01:01:09  \_ [submit] <defunct>
14685 12590    01:01:28 /usr/sbin/courieresmtpd
14720 14685    01:01:27  \_ [submit] <defunct>
13389 12590    01:02:21 /usr/sbin/courieresmtpd
13426 13389    01:02:20  \_ [submit] <defunct>
12599 12590    01:03:00 /usr/sbin/courieresmtpd
 9451 12599       43:50  \_ [submit] <defunct>
###############################################################

Getting, for example, strace for courieresmtpd process 20528
# tail -n 27 strace.20528
10:52:21 rt_sigprocmask(SIG_SETMASK, [], NULL, 8) = 0
10:52:21 nanosleep({128, 0}, {128, 0})              = 0
10:54:29 sendto(3, "<19>Apr 17 10:54:29 courieresmtpd: error,relay=::f"..., 
189, MSG_NOSIGNAL, NULL, 0)                    = 189
10:54:29 write(1, "511 http://www.spamhaus.org/query/bl?ip=190.42.107";..., 55) 
= -1 EPIPE (Broken pipe)
10:54:29 --- SIGPIPE (Broken pipe) @ 0 (0) ---
10:54:29 write(5, "[email protected]\t\t\n", 36) = 36
10:54:29 read(6, "511 http://www.spamhaus.org/query/bl?ip=190.42.107";..., 4096) 
= 54
10:54:29 rt_sigprocmask(SIG_BLOCK, [CHLD], [], 8) = 0
10:54:29 rt_sigaction(SIGCHLD, NULL, {SIG_DFL, [], 0}, 8) = 0
10:54:29 rt_sigprocmask(SIG_SETMASK, [], NULL, 8) = 0
10:54:29 nanosleep({128, 0}, {128, 0}) = 0
10:56:37 sendto(3, "<19>Apr 17 10:56:37 courieresmtpd: error,relay=::f"..., 
189, MSG_NOSIGNAL, NULL, 0) = 189
10:56:37 write(1, "511 http://www.spamhaus.org/query/bl?ip=190.42.107";..., 55) 
= -1 EPIPE (Broken pipe)
10:56:37 --- SIGPIPE (Broken pipe) @ 0 (0) ---
10:56:37 write(5, "[email protected]\t\t\n", 35) = 35
10:56:37 read(6, "511 http://www.spamhaus.org/query/bl?ip=190.42.107";..., 4096) 
= 54
10:56:37 rt_sigprocmask(SIG_BLOCK, [CHLD], [], 8) = 0
10:56:37 rt_sigaction(SIGCHLD, NULL, {SIG_DFL, [], 0}, 8) = 0
10:56:37 rt_sigprocmask(SIG_SETMASK, [], NULL, 8) = 0
10:56:37 nanosleep({128, 0}, {128, 0}) = 0
10:58:45 sendto(3, "<19>Apr 17 10:58:45 courieresmtpd: error,relay=::f"..., 
188, MSG_NOSIGNAL, NULL, 0) = 188
10:58:45 write(1, "511 http://www.spamhaus.org/query/bl?ip=190.42.107";..., 55) 
= -1 EPIPE (Broken pipe)
10:58:45 --- SIGPIPE (Broken pipe) @ 0 (0) ---
10:58:45 sendto(3, "<22>Apr 17 10:58:45 courieresmtpd: error,relay=::f"..., 
107, MSG_NOSIGNAL, NULL, 0) = 107
10:58:45 select(2, NULL, [1], NULL, {4800, 0})    = ? ERESTARTNOHAND (To be 
restarted)
11:18:10 --- SIGCHLD (Child exited) @ 0 (0) ---
11:18:10 select(2, NULL, [1], NULL, {3634, 984000}
#########################

And for the child submit process pid 20564
# tail -n 27 strace.20564
10:54:29 lseek(6, 0, SEEK_SET) = 0
10:54:29 read(6, "rootfs / rootfs rw 0 0\nnone /sys sysfs rw,nosuid,n"..., 
1024) = 874
10:54:29 stat("/", {st_mode=S_IFDIR|0755, st_size=4096, ...}) = 0
10:54:29 close(6) = 0
10:54:29 munmap(0x7f368cbd3000, 4096) = 0
10:54:29 write(1, "511 http://www.spamhaus.org/query/bl?ip=190.42.107";..., 54) 
= 54
10:54:29 poll([{fd=0, events=POLLIN|POLLERR|POLLHUP}], 1, -1)                   
     = 1 ([{fd=0, revents=POLLIN}])
10:56:37 read(0, "[email protected]\t\t\n", 8192) = 35
10:56:37 statfs(".", {f_type="EXT2_SUPER_MAGIC", f_bsize=4096, 
f_blocks=34151844, f_bfree=32936550, f_bavail=31201734, f_files=8675328, 
f_ffree=8559136, f_fsid={162176456, -721074552}, f_namelen=255, f_frsize=4096}) 
= 0
10:56:37 stat(".", {st_mode=S_IFDIR|0770, st_size=4096, ...}) = 0
10:56:37 open("/proc/mounts", O_RDONLY) = 6
10:56:37 fstat(6, {st_mode=S_IFREG|0444, st_size=0, ...}) = 0
10:56:37 mmap(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 
0) = 0x7f368cbd3000
10:56:37 read(6, "rootfs / rootfs rw 0 0\nnone /sys sysfs rw,nosuid,n"..., 
1024) = 874
10:56:37 read(6, "", 1024) = 0
10:56:37 lseek(6, 0, SEEK_SET)                      = 0
10:56:37 read(6, "rootfs / rootfs rw 0 0\nnone /sys sysfs rw,nosuid,n"..., 
1024) = 874
10:56:37 stat("/", {st_mode=S_IFDIR|0755, st_size=4096, ...}) = 0
10:56:37 close(6) = 0
10:56:37 munmap(0x7f368cbd3000, 4096) = 0
10:56:37 write(1, "511 http://www.spamhaus.org/query/bl?ip=190.42.107";..., 54) 
= 54
10:56:37 poll([{fd=0, events=POLLIN|POLLERR|POLLHUP}], 1, -1) = ? 
ERESTART_RESTARTBLOCK (To be restarted)
11:18:10 --- SIGALRM (Alarm clock) @ 0 (0) ---
11:18:10 rt_sigaction(SIGINT, {SIG_DFL, [INT], SA_RESTORER|SA_RESTART, 
0x7f368b8ac190}, {0x411620, [INT], SA_RESTORER|SA_RESTART, 0x7f368b8ac190}, 8) 
= 0
11:18:10 getpid()                       = 20564
11:18:10 kill(20564, SIGKILL)           = 0
11:18:10 +++ killed by SIGKILL +++

###############################
mail.log for this IP:
# grep 190.42.107 /var/log/mail.log
Apr 17 10:48:08 cuma courieresmtpd: started,ip=[::ffff:190.42.107.41]
Apr 17 10:48:20 cuma courieresmtpd: 
error,relay=::ffff:190.42.107.41,from=<[email protected]>,to=<[email protected]>:
 511 http://www.spamhaus.org/query/bl?ip=190.42.107.41
Apr 17 10:48:36 cuma courieresmtpd: 
error,relay=::ffff:190.42.107.41,from=<[email protected]>,to=<[email protected]>:
 511 http://www.spamhaus.org/query/bl?ip=190.42.107.41
Apr 17 10:49:08 cuma courieresmtpd: 
error,relay=::ffff:190.42.107.41,from=<[email protected]>,to=<[email protected]>:
 511 http://www.spamhaus.org/query/bl?ip=190.42.107.41
Apr 17 10:50:13 cuma courieresmtpd: 
error,relay=::ffff:190.42.107.41,from=<[email protected]>,to=<[email protected]>:
 511 http://www.spamhaus.org/query/bl?ip=190.42.107.41
Apr 17 10:52:21 cuma courieresmtpd: 
error,relay=::ffff:190.42.107.41,from=<[email protected]>,to=<[email protected]>:
 511 http://www.spamhaus.org/query/bl?ip=190.42.107.41
Apr 17 10:54:29 cuma courieresmtpd: 
error,relay=::ffff:190.42.107.41,from=<[email protected]>,to=<[email protected]>:
 511 http://www.spamhaus.org/query/bl?ip=190.42.107.41
Apr 17 10:56:37 cuma courieresmtpd: 
error,relay=::ffff:190.42.107.41,from=<[email protected]>,to=<[email protected]>:
 511 http://www.spamhaus.org/query/bl?ip=190.42.107.41
Apr 17 10:58:45 cuma courieresmtpd: 
error,relay=::ffff:190.42.107.41,from=<[email protected]>,to=<[email protected]>:
 511 http://www.spamhaus.org/query/bl?ip=190.42.107.41
Apr 17 10:58:45 cuma courieresmtpd: error,relay=::ffff:190.42.107.41,msg="502 
ESMTP command error",cmd: DATA

##################################
Broken pipe logs. I am getting thousands of this message per hour:
.
.
.
Apr 17 11:04:05 cuma courieresmtpd: 
error,relay=::ffff:194.0.252.241,msg="writev: Broken pipe",cmd: QUIT
Apr 17 11:04:18 cuma courieresmtpd: 
error,relay=::ffff:201.20.24.36,msg="writev: Broken pipe",cmd: QUIT
Apr 17 11:05:13 cuma courieresmtpd: 
error,relay=::ffff:201.17.146.71,msg="writev: Broken pipe",cmd: RSET
Apr 17 11:05:13 cuma courieresmtpd: 
error,relay=::ffff:81.184.51.141,msg="writev: Broken pipe",cmd: DATA
Apr 17 11:05:32 cuma courieresmtpd: 
error,relay=::ffff:201.17.146.71,msg="writev: Broken pipe",cmd: RSET
Apr 17 11:06:10 cuma courieresmtpd: 
error,relay=::ffff:201.17.146.71,msg="writev: Broken pipe",cmd: RSET
Apr 17 11:06:49 cuma courieresmtpd: 
error,relay=::ffff:208.43.194.2,msg="writev: Broken pipe",cmd: QUIT
Apr 17 11:07:07 cuma courieresmtpd: 
error,relay=::ffff:64.118.95.132,msg="writev: Broken pipe",cmd: QUIT
Apr 17 11:07:49 cuma courieresmtpd: 
error,relay=::ffff:194.0.252.241,msg="writev: Broken pipe",cmd: QUIT
Apr 17 11:11:04 cuma courieresmtpd: 
error,relay=::ffff:200.162.176.8,msg="writev: Broken pipe",cmd: QUIT
Apr 17 11:11:30 cuma courieresmtpd: 
error,relay=::ffff:174.133.25.2,msg="writev: Broken pipe",cmd: QUIT
Apr 17 11:11:35 cuma courieresmtpd: 
error,relay=::ffff:194.0.252.241,msg="writev: Broken pipe",cmd: QUIT
Apr 17 11:12:28 cuma courieresmtpd: 
error,relay=::ffff:72.232.211.2,msg="writev: Broken pipe",cmd: QUIT
Apr 17 11:12:35 cuma courieresmtpd: 
error,relay=::ffff:72.232.211.2,msg="writev: Broken pipe",cmd: QUIT
Apr 17 11:13:09 cuma courieresmtpd: 
error,relay=::ffff:81.215.80.190,msg="writev: Broken pipe",cmd: QUIT
.
.
.

------------------------------------------------------------------------------
Stay on top of everything new and different, both inside and 
around Java (TM) technology - register by April 22, and save
$200 on the JavaOne (SM) conference, June 2-5, 2009, San Francisco.
300 plus technical and hands-on sessions. Register today. 
Use priority code J9JMT32. http://p.sf.net/sfu/p
_______________________________________________
courier-users mailing list
[email protected]
Unsubscribe: https://lists.sourceforge.net/lists/listinfo/courier-users

Reply via email to