Greetings:
I don't know if anyone in here is encountering this problem (yet), but
it has been affecting me for the past few weeks - ever since I upgraded
my MySQL server to 5.0.19. It took quite a bit of digging, but I believe
I have found the problem.

To describe the problem: when you run vpopmail in MySQL mode, with
courier-authdaemond and MySQL v5.0 or later, you will find that for the
first 8 hours everything works just fine, but after 8 hours nobody will
be able to authenticate to the email server and you will see "MySQL
server has gone away" errors in the maillog.

The cause of the problem is that in MySQL 5.0 (and probably some 4.1
releases), MySQL implements a new timeout for connections, a timeout
that ignores traffic (28800 seconds, i.e. 8 hours, by default). This
timeout shuts down the connection's socket from the MySQL side. The
clients (vchkpw and friends) do not know about this timeout and socket
termination, so they continue on in ignorant bliss until they try to
send to the socket and find that it is no longer valid - literally
"the server has gone away".

The fix is simply to destroy the internal flags and file handles related
to that socket, build a new connection, and try the query again.
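
As a rough illustration of that drop-rebuild-retry idea, here is a
minimal, self-contained sketch in C. Everything in it is hypothetical:
fake_query, fake_close, fake_open and query_with_one_retry are stand-ins
invented for this example, not vpopmail or MySQL functions, and the
"connection" is just a flag. It only shows the control flow: try once,
and on failure tear the connection down, reopen it, and try exactly one
more time.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical state standing in for a real database connection. */
static int server_reachable = 1;  /* can a new connection be opened?    */
static int connection_alive = 0;  /* 0 = the server dropped our socket  */
static int reconnect_count  = 0;  /* how many times we had to reconnect */

/* Stand-in for a query call: returns 0 on success, nonzero on error. */
static int fake_query(const char *sql)
{
    (void)sql;
    return connection_alive ? 0 : 1;
}

/* Stand-in for tearing down the stale handle and its flags. */
static void fake_close(void)
{
    connection_alive = 0;
}

/* Stand-in for opening a fresh connection: 0 on success, -1 on failure. */
static int fake_open(void)
{
    if (!server_reachable)
        return -1;
    connection_alive = 1;
    reconnect_count++;
    return 0;
}

/* Run a query; if it fails, rebuild the connection once and retry.
 * Returns 0 on success, -1 if the rebuild or the second attempt fails. */
static int query_with_one_retry(const char *sql)
{
    assert(sql != NULL);

    if (fake_query(sql) == 0)
        return 0;                 /* first attempt worked */

    fprintf(stderr, "query failed, rebuilding connection\n");
    fake_close();
    if (fake_open() != 0)
        return -1;                /* could not reconnect at all */

    if (fake_query(sql) != 0)
        return -1;                /* not a stale-connection problem */

    return 0;
}
```

Retrying only once is deliberate in this sketch: if the second attempt
also fails, the cause is presumably something other than a stale
connection, and looping would just hide it.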

The included patch (inline and attached) implements this fix. Please
note that there doesn't appear to be any way at this time to disable the
timeout feature in MySQL.

Please feel free to comment, tear apart, beat up, or otherwise rip to
shreds my fix!

--
Ron Gage
(LPIC1 MCP A+ Net+)
Westland, Michigan
--- vmysql.c~ 2006-05-29 10:17:20.000000000 -0400
+++ vmysql.c 2006-05-29 10:17:20.000000000 -0400
@@ -465,7 +465,31 @@
   );
   if (mysql_query(&mysql_read,SqlBufRead)) {
     fprintf(stderr, "vmysql: sql error[3]: %s\n", mysql_error(&mysql_read));
-    return(NULL);
+    /* Ron Gage - May 29, 2006 - With newer versions of MySQL, there is such a thing
+       as a connection timeout regardless of activity. By default under MySQL 5, this
+       timeout is 28800 seconds (8 hours). If your vpopmail system runs fine for the
+       first 8 hours, then stops authenticating, this timeout is your problem (especially
+       under authdaemond).
+
+       What this code does is when an error is encountered, it first tries to drop and
+       rebuild a connection to the SQL server and tries again. If this second attempt
+       fails, then something other than the connection timeout is the problem. This fix
+       may need to be implemented in other places but in my setup (Slackware 10.2, netqmail,
+       vpopmail, courier-authdaemond, courier-imapd and a few others), this is always where
+       the auth attempt died with a "SQL server has gone away" error.
+    */
+
+    fprintf(stderr, "Attempting to rebuild connection to SQL server\n");
+    vclose();
+    verrori = 0;
+    if ( (err=vauth_open_read()) != 0 ) {
+      verrori = err;
+      return(NULL);
+    }
+    if (mysql_query(&mysql_read, SqlBufRead)) {
+      fprintf (stderr, "vmysql: connection rebuild failed: %s\n", mysql_error(&mysql_read));
+      return(NULL);
+    }
   }
   if (!(res_read = mysql_store_result(&mysql_read))) {
<vmysql.diff>