Bug Tracker item #2881599, was opened at 2009-10-19 10:40 Message generated for change (Comment added) made by gaasmann You can respond by visiting: https://sourceforge.net/tracker/?func=detail&atid=1126467&aid=2881599&group_id=250683
Please note that this message will contain a full copy of the comment thread, including the initial issue submission, for this request, not just the latest update. Category: daemon Group: v3.8.0 Status: Open Resolution: None Priority: 5 Private: No Submitted By: Nicolas Haller (gaasmann) Assigned to: Stevan Bajic (sbajic) Summary: failure with Dpsam and Postgresql Initial Comment: HI, I have a problem with DSPAM 3.8.0 and I don't know if it's resolved in 3.9.0 (I don't see any clue in ChangeLog). I have a process chain to analyse spam: Internet----SMTP/25----Postfix----LMTP/4242----DSPAM----SMTP/10024----amavisd-new----SMTP/10025----Postfix----transport/deliver Dspam works with a PgSQL DB. When Pg fails(connection refused, DB can't answer to a request, ...) Dspam reject all future email. It send an error "530 5.2.0 Message is empty. Aborting." in the LMTP transaction. I must restart DSPAM daemon to accept new mail correctly. If you need more infos, tell me. Thanks ---------------------------------------------------------------------- Comment By: Nicolas Haller (gaasmann) Date: 2009-10-23 19:27 Message: Hallo Stevan, I put your patch on a less busy mail server (because we are Friday) this morning. So after 10 hours, memory allocated by dspam (VSZ) increase from 11632 to 23408. I don't see odd or new lines on dspam log and dspam debug log. For your previous questions: - I use gcc 4.2.1 - My ./configure flags look like with ./configure --sysconfdir=/usr/local/etc --with-logdir=/var/log/dspam --with-dspam-home=/var/db/dspam --with-dspam-home-owner=root --with-dspam-home-group=mail --with-dspam-home-mode=0770 --with-dspam-owner=root --with-dspam-group=mail --enable-syslog --enable-debug --enable-daemon --with-pgsql-includes=/usr/local/include --with-pgsql-libraries=/usr/local/lib --with-storage-driver=pgsql_drv --enable-virtual-users --with-dspam-mode=4510 --prefix=/usr/local --mandir=/usr/local/man --infodir=/usr/local/info/ --build=i386-portbld-freebsd7.2 - compiler flags looks like this: -DHAVE_CONFIG_H -DLOGDIR=\"/var/log/dspam\" -DCONFIG_DEFAULT=\"/usr/local/etc/dspam.conf\" -D_REENTRANT -D_POSIX_PTHREAD_SEMANTICS -D_GNU_SOURCE -I. -I. -I../../src -I/usr/local/include -g -O2 -Wall -Wmissing-prototypes -Wmissing-declarations - and ldflags -pthread -L/usr/local/lib ../.libs/libdspam.so -lpq -lm -pthread -Wl,-rpath -Wl,/usr/local/lib Thanks ---------------------------------------------------------------------- Comment By: Stevan Bajic (sbajic) Date: 2009-10-22 19:34 Message: Hallo Nicolas, I quickly went over the PostgreSQL driver for DSPAM and I am not 100% confident that the driver does not have memory leaks. I attached a patch into this bug report that should close most (I see two additional places where a memory leak could happen but I have to test that first) of them. Could you try that patch and report back if you have any issues after you have patched the PostgreSQL driver? Kind Regards from Switzerland Stevan Bajic ---------------------------------------------------------------------- Comment By: Stevan Bajic (sbajic) Date: 2009-10-22 16:33 Message: Hallo Nicolas, > Also, you must know that the box which use dspam analyses > between 200000 and 250000 mails per day :-) > That should not affect DSPAM at all and definitely should not start DSPAM with almost 7MB rss after a fresh start. What compiler are you using? What compiler flags are you using? What DSPAM configuration flags are you using? Could you post your dspam.conf here? What PostgreSQL version? Maybe I am comparing apples with oranges here? I never used a *BSD build of DSPAM and maybe the *BSD build behaves differently then a Linux build? How is the memory usage comparable to 3.8.0? Did it used there as well that much memory or had 3.8.0 a lower footprint? btw: That "_pgsql_drv_getpwnam" thing is just to cache username/groupname and should not use more then a bunch of bytes in total. Kind Regards from Switzerland Stevan Bajic ---------------------------------------------------------------------- Comment By: Nicolas Haller (gaasmann) Date: 2009-10-22 16:20 Message: Hallo Stevan, I have restarted dspam daemon on my box and this is the result: Before restart: USER PID %CPU %MEM VSZ RSS TT STAT STARTED TIME COMMAND root 70820 0.0 9.8 218104 205612 p2- S Tue06PM 61:20.52 /usr/local/bin/dspam --daemon After restart: USER PID %CPU %MEM VSZ RSS TT STAT STARTED TIME COMMAND root 13080 0.0 0.3 13688 6796 p0 S 4:10PM 0:00.65 /usr/local/bin/dspam --daemon Is there some cache mechanism? I see some message like "_pgsql_drv_getpwnam returning cached name vscan." in debug output. Also, you must know that the box which use dspam analyses between 200000 and 250000 mails per day :-) thanks ---------------------------------------------------------------------- Comment By: Stevan Bajic (sbajic) Date: 2009-10-22 15:13 Message: Hallo Nicolas, Good to hear that your "Message is empty" error is gone. And good to hear that DSPAM 3.9.0 is faster. :) Regarding the memory usage: How have you measured that DSPAM is taking that much memory? I would be very much surprised to see DSPAM using that much memory. If I check on my PostgreSQL installation then DSPAM is using very low amount of memory: nemesis ~ # ps -lyC dspam,postgres -U dspam,postgres -u dspam,postgres --sort=rss S UID PID PPID C PRI NI RSS SZ WCHAN TTY TIME CMD S 70 5107 5064 0 80 0 240 130283 poll_s ? 00:08:10 postmaster S 0 5185 1 0 80 0 276 1159 poll_s ? 00:00:10 dspam S 70 5109 5064 0 80 0 348 2500 poll_s ? 00:02:23 postmaster S 70 5198 5064 0 80 0 348 130427 unix_s ? 00:00:00 postmaster S 70 5199 5064 0 80 0 348 130427 unix_s ? 00:00:00 postmaster S 70 5106 5064 0 80 0 404 130283 poll_s ? 00:12:01 postmaster S 70 5108 5064 0 80 0 500 130319 poll_s ? 00:01:08 postmaster S 70 5064 1 0 80 0 620 130222 poll_s ? 00:00:37 postmaster S 70 5193 5064 0 80 0 632 130491 unix_s ? 00:00:00 postmaster nemesis ~ # Doing the same on a system with MySQL: theia ~ # ps -lyC dspam,mysqld -U dspam,mysql -u dspam,mysql --sort=rss S UID PID PPID C PRI NI RSS SZ WCHAN TTY TIME CMD S 0 12785 1 0 80 0 2644 9857 - ? 00:01:12 dspam S 60 6190 1 0 80 0 353496 278772 - ? 00:20:43 mysqld theia ~ # And another system with MySQL: nyx ~ # ps -lyC dspam,mysqld -U dspam,mysql -u dspam,mysql --sort=rss S UID PID PPID C PRI NI RSS SZ WCHAN TTY TIME CMD S 0 13945 1 0 80 0 3360 9958 - ? 00:01:09 dspam S 60 28174 1 0 80 0 447804 279105 - ? 00:19:18 mysqld nyx ~ # 205MB for DSPAM is way, way to much. Can you check again and post an output like above showing that indeed DSPAM is using all that memory? btw: I have not taken my mound to full when promising you that 3.9.0 will be faster and that probably your "Message is empty" error will be gone. I am happy that we as a community can push DSPAM to be faster and more stable. btw2: I will still leave this bug report open for at least another week or two. I really want to be sure that the "Message is empty" error is gone. Kind Regards from Switzerland Stevan Bajic ---------------------------------------------------------------------- Comment By: Nicolas Haller (gaasmann) Date: 2009-10-22 14:22 Message: Hallo Stevan, My problems with error -5 and "Message is empty" message seem gone. That's great. Also it seems Dspam git is faster than dspam 3.8.0. However, I notice the memory footprint increases with time. Dspam daemon allocates 123MB 2 day ago and now it takes 205MB. Is it normal or it's revealing of a memory leak somewhere? Thanks ---------------------------------------------------------------------- Comment By: Stevan Bajic (sbajic) Date: 2009-10-20 18:54 Message: Hallo Nicolas I will leave this case here open and wait for your feedback. Don't hesitate to post any issue you are facing with 3.9.0. Regardless of the problem area. If you can then post impression you have regarding stability and speed of 3.9.0 into the mailing list. I think you will see a speed difference in 3.9.0 in relation to PostgreSQL. Stability should as well be better then with 3.8.0. Please don't forget to update the database schema as mentioned in the UPGRADING document found in the root directory of the source tree (or here -> http://dspam.git.sourceforge.net/git/gitweb.cgi?p=dspam/dspam;a=blob_plain;f=UPGRADING;hb=HEAD <-). Thanks for giving 3.9.0 a try. Kind Regards from Switzerland Stevan Bajic ---------------------------------------------------------------------- Comment By: Nicolas Haller (gaasmann) Date: 2009-10-20 18:29 Message: Hallo Stevan I deploy the git version. I have compiled it with FreeBSD configure's flags. All is right for the moment. I will watch it's behavior when db fail. Thanks, ---------------------------------------------------------------------- Comment By: Stevan Bajic (sbajic) Date: 2009-10-20 00:05 Message: Hallo Nicolas Wait a minute. DSPAM 3.8.0. Right? Please try 3.9.0 because in commit dc209e5ef35bd128c921c3e305defe1263513a0a we have fixed a issue with wrong return codes from the function send_socket() and that function is returning errors in your case. That is the reason you see OK and then in the next line a error (-5) but in reality the data send to the socked was okay. Could you try with 3.9.0 GIT and see if the error still is there? Kind Regards from Switzerland Stevan Bajic ---------------------------------------------------------------------- Comment By: Nicolas Haller (gaasmann) Date: 2009-10-19 16:59 Message: Hallo, You're right about reconnecting. Is it possible dspam sends query without be sure the connection is ok? About delivering, I understand what you mean but the chain works correctly before and after DB problems. In fact, I don't really understand why Dspam seems to send "520 Message is empty" when PostgreSQL fails. When I restart DSPAM, all is working well: there is no problem with "delivery target". Also, I don't understand what is the log line "Received error in response to HELO" or "Oct 19 09:50:21 enigma dspam[62346]: Got error -5 in response to DATA: 250 2.1.5 Recipient <n...@noc.nerim.net> OK^M" Here it seems the "delivery target" said OK, what is the error -5 dspam is talking about? I don't know how to help you more but tell me if you need anything. Thanks ---------------------------------------------------------------------- Comment By: Stevan Bajic (sbajic) Date: 2009-10-19 14:35 Message: Hallo Nicolas @Reconnecting: You wrote that if you restart DSPAM then everything works again. So this means that somehow DSPAM can not connect to PostgreSQL and if we would RECONNECT the stalled PostgreSQL connections then everything should work again. Right? Regarding this here: > I know only a little about profiles but I have only one DB server. In > fact, I prefer DSPAM forward message on failure instead of rejecting it. > The errors "Delivery failed completely" you are seeing in your log are in fact errors where DSPAM is failing to deliver the message via SMTP or LMTP. I don't know how we can forward a message when DSPAM has trouble delivering a message via LMTP or SMTP. So your request to FORWARD a message in case of PostgreSQL connection errors are okay but DSPAM can not forward a message when the target refuses to accept a message. Do you understand what I mean? The errors about failed delivery are related to your dspam.conf entries starting with "Delivery". Assuming you would have the following in your dspam.conf: DeliveryHost 192.168.0.10 DeliveryPort 10026 DeliveryProto SMTP Then the SMTPD listening on 192.168.0.10 at port 10026 is the one that has an issue. Not DSPAM. And since that service refuses to accept messages, DSPAM can not deliver. Let alone forward. Kind Regards from Switzerland Stevan Bajic ---------------------------------------------------------------------- Comment By: Nicolas Haller (gaasmann) Date: 2009-10-19 12:16 Message: Hi Stevan, I don't understand what you mean about autoconnection. Dspam is configured as a daemon with a pool of Pg connection (PgSQLConnectionCache 20). The PgSQL is on a different host and they are using TCP/IP. The database can fail on different manner: - Server is down, connection are refused - The server is here but it can't execute a query. (62346: server closed the connection unexpectedly. This probably means the server terminated abnormally before or while processing the request.) I know only a little about profiles but I have only one DB server. In fact, I prefer DSPAM forward message on failure instead of rejecting it. Thanks for this quick answer. I don't have debug output but DSPAM said this on failure: ct 19 09:50:20 enigma dspam[62346]: Received error in response to HELO: Oct 19 09:50:20 enigma dspam[62346]: Delivery failed completely Oct 19 09:50:21 enigma dspam[62346]: Got error -5 in response to DATA: 250 2.1.5 Recipient <n...@noc.nerim.net> OK^M Oct 19 09:50:21 enigma dspam[62346]: Delivery failed completely Oct 19 09:50:21 enigma dspam[62346]: Got error -5 in response to RCPT TO: 250 2.1.0 Sender <na...@enigma.nerim.net> OK^M Oct 19 09:50:21 enigma dspam[62346]: Delivery failed completely Oct 19 09:50:21 enigma dspam[62346]: Got error -5 in response to MAIL FROM: Oct 19 09:50:21 enigma dspam[62346]: Delivery failed completely Oct 19 09:51:11 enigma dspam[62346]: Got error -5 in response to RCPT TO: 250 2.1.0 Sender <brutall...@advanceup.com> OK^M Oct 19 09:51:11 enigma dspam[62346]: Delivery failed completely Oct 19 09:51:11 enigma dspam[62346]: Got error -5 in response to message data: 354 End data with <CR><LF>.<CR><LF>^M Oct 19 09:51:11 enigma dspam[62346]: Delivery failed completely ---------------------------------------------------------------------- Comment By: Stevan Bajic (sbajic) Date: 2009-10-19 11:39 Message: Hallo Nicolas As far as I know, PostgreSQL does not offer an autoconnect feature as MySQL does. So the only way to to fix that issue in the PostgreSQL driver would be to shut down all (or just the ones not being able to connect) the connections from DSPAM to PostgreSQL and reopen them. Is it possible for you to post some more debugging information or error logs indicating where the error inside DSPAM is happening? How is your setup? Does PostgreSQL run on the same system as DSPAM? How do you connect to PostgreSQL? Using TCP/IP or do you use sockets? Do you know if your PostgreSQL instance has gone away while DSPAM was running? Have you tried to use connection profiles with DSPAM? Kind Regards from Switzerland Stevan Bajic ---------------------------------------------------------------------- You can respond by visiting: https://sourceforge.net/tracker/?func=detail&atid=1126467&aid=2881599&group_id=250683 ------------------------------------------------------------------------------ Come build with us! The BlackBerry(R) Developer Conference in SF, CA is the only developer event you need to attend this year. Jumpstart your developing skills, take BlackBerry mobile applications to market and stay ahead of the curve. Join us from November 9 - 12, 2009. Register now! http://p.sf.net/sfu/devconference _______________________________________________ Dspam-devel mailing list Dspam-devel@lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/dspam-devel