Hi,

Context: We are storing Kannel's DLRs in a MySQL database. This database is also used by some other applications, and we set wait_timeout=10 (http://dev.mysql.com/doc/refman/5.1/en/server-system-variables.html#sysvar_wait_timeout). Since this is a global setting, I can't change it for the Kannel DLR database only. While I could create a separate MySQL installation for the DLRs, there are some benefits for me in keeping them together :)
Here's the log of the error happening:

2009-07-27 06:26:31 [3489] [3] ERROR: MYSQL: database check failed!
2009-07-27 06:26:31 [3489] [3] ERROR: MYSQL: Lost connection to MySQL server during query
2009-07-27 06:26:31 [3489] [3] INFO: MYSQL: Connected to server at db-XXXX01.
2009-07-27 06:26:31 [3489] [3] INFO: MYSQL: server version 5.0.51a-15~bpo40+1-log, client version 4.1.11.

First, the point I want to confirm: I see that there is a check method defined as mysql_ping in dbpool_mysql.c. From what I found, it is only triggered when something happens on the MySQL connections. (Am I correct in this assumption?) So in my case, if nothing happens for 10s, MySQL disconnects us and I get this error logged the next time something uses the MySQL connection.

All in all this isn't much of an issue by itself. However, we had a small incident where our database maxed out its connections, and while we expected some things to fail, we didn't expect Kannel to die on us because of it:

2009-07-27 14:40:09 [3489] [3] ERROR: MYSQL: database check failed!
2009-07-27 14:40:09 [3489] [3] ERROR: MYSQL: Lost connection to MySQL server during query
2009-07-27 14:40:09 [3489] [3] ERROR: MYSQL: can not connect to database!
2009-07-27 14:40:09 [3489] [3] ERROR: MYSQL: Too many connections
2009-07-27 14:40:10 [3489] [29] PANIC: DBPOOL: Deadlock detected!!!
2009-07-27 14:40:10 [3489] [29] PANIC: /usr/sbin/bearerbox(gw_panic+0xcc) [0x80cc73c]
2009-07-27 14:40:10 [3489] [29] PANIC: /usr/sbin/bearerbox(dbpool_conn_consume+0xec) [0x80bf71c]
2009-07-27 14:40:10 [3489] [29] PANIC: /usr/sbin/bearerbox [0x805e59e]
2009-07-27 14:40:10 [3489] [29] PANIC: /usr/sbin/bearerbox [0x805eaf4]
2009-07-27 14:40:10 [3489] [29] PANIC: /usr/sbin/bearerbox(dlr_add+0x33b) [0x805d95b]
<....>

Since not being able to store the DLR in MySQL crashed Kannel, I believe some effort should be put into making Kannel keep the connection alive correctly.
I see 2 possible solutions:

1 - Have a timer that executes the mysql_check every n seconds (configurable, of course).
2 - Send a SET wait_timeout on connection to override the value from MySQL. However, in this case I don't know what we should set it to, since the mysql_ping is based on activity... (Maybe 8h, since it's the default MySQL value?)

Solution 1 seems the best way to handle it, but solution 2 may be easier to implement. (Comments on those?)

I may try one of those changes based on the feedback received, so feel free to let me know your thoughts :)

-- 
Math aka ROunofF [email protected]
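
For reference, a rough C sketch of the two options above. This is not actual Kannel code: struct conn_state, keepalive_due() and build_wait_timeout_stmt() are hypothetical names made up for illustration, and the real dbpool code would still have to call mysql_ping() or send the statement itself. The first helper decides whether an idle connection is due for a ping (option 1); the second only builds the SET SESSION wait_timeout statement that option 2 would send right after connecting.

```c
#include <assert.h>
#include <stdio.h>
#include <time.h>

/* Hypothetical per-connection bookkeeping; not a real Kannel struct. */
struct conn_state {
    time_t last_used;   /* last time a query or ping went out */
};

/* Option 1: returns 1 when the connection has been idle for at least
 * ping_interval seconds, i.e. a periodic timer should ping it now.
 * ping_interval would be the configurable value and must stay below
 * the server's wait_timeout to be of any use. */
int keepalive_due(const struct conn_state *c, time_t now, long ping_interval)
{
    assert(c != NULL && ping_interval > 0);
    return difftime(now, c->last_used) >= (double)ping_interval;
}

/* Option 2: builds the statement that would be sent once per new
 * connection to override the server-wide wait_timeout for this session.
 * Returns 0 on success, -1 if the buffer is too small. */
int build_wait_timeout_stmt(char *buf, size_t size, long seconds)
{
    int n;

    assert(buf != NULL && seconds > 0);
    n = snprintf(buf, size, "SET SESSION wait_timeout = %ld", seconds);
    return (n > 0 && (size_t)n < size) ? 0 : -1;
}
```

With wait_timeout=10 on the server, a ping interval of around 5 seconds would leave some margin for option 1. For option 2, 28800 seconds corresponds to the 8h MySQL default mentioned above.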
