Hi,
Context: We are storing Kannel's DLRs in a MySQL database. This database is
used by some other applications as well, and we set wait_timeout=10 (
http://dev.mysql.com/doc/refman/5.1/en/server-system-variables.html#sysvar_wait_timeout).
Since this is a "global" setting, I can't change it for the Kannel DLR
database only. While I could create a separate MySQL installation for the
DLRs, there are some benefits for me in keeping them together :)

Here's the log of the error happening:
2009-07-27 06:26:31 [3489] [3] ERROR: MYSQL: database check failed!
2009-07-27 06:26:31 [3489] [3] ERROR: MYSQL: Lost connection to MySQL server during query
2009-07-27 06:26:31 [3489] [3] INFO: MYSQL: Connected to server at db-XXXX01.
2009-07-27 06:26:31 [3489] [3] INFO: MYSQL: server version 5.0.51a-15~bpo40+1-log, client version 4.1.11.

First, the point I want to confirm:

I see that there is a check method implemented with mysql_ping in
dbpool_mysql.c. From what I found, it is only triggered when something
happens on the MySQL connections. (Am I correct in this assumption?) So in my
case, if nothing happens for 10 seconds, MySQL disconnects us and this error
is logged the next time something uses the MySQL connection.
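
To make the timing concrete, here is a minimal sketch of the situation as I understand it. The struct and function names are hypothetical and are not taken from Kannel's source; the sketch only models the rule "idle for wait_timeout seconds or more means the server has dropped us".

```c
#include <stdbool.h>
#include <time.h>

/* Hypothetical model of a pooled connection; not Kannel's actual struct. */
struct pooled_conn {
    time_t last_activity;   /* last time a query or ping was sent */
};

/* True when the connection has been idle for at least wait_timeout
 * seconds, i.e. the server is assumed to have closed it already. */
bool server_closed_idle_conn(const struct pooled_conn *c,
                             time_t now, long wait_timeout)
{
    return difftime(now, c->last_activity) >= (double)wait_timeout;
}
```

With wait_timeout=10, a connection last used at t=1000 is still fine at t=1005 but is already gone at t=1011, so a check that only runs on use will always find it dead after a quiet period.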

All in all this isn't much of an issue. However, we had a small incident
where our database maxed out its connections, and while we expected some
things to fail, we didn't expect Kannel to die on us because of it:
2009-07-27 14:40:09 [3489] [3] ERROR: MYSQL: database check failed!
2009-07-27 14:40:09 [3489] [3] ERROR: MYSQL: Lost connection to MySQL server during query
2009-07-27 14:40:09 [3489] [3] ERROR: MYSQL: can not connect to database!
2009-07-27 14:40:09 [3489] [3] ERROR: MYSQL: Too many connections
2009-07-27 14:40:10 [3489] [29] PANIC: DBPOOL: Deadlock detected!!!
2009-07-27 14:40:10 [3489] [29] PANIC: /usr/sbin/bearerbox(gw_panic+0xcc) [0x80cc73c]
2009-07-27 14:40:10 [3489] [29] PANIC: /usr/sbin/bearerbox(dbpool_conn_consume+0xec) [0x80bf71c]
2009-07-27 14:40:10 [3489] [29] PANIC: /usr/sbin/bearerbox [0x805e59e]
2009-07-27 14:40:10 [3489] [29] PANIC: /usr/sbin/bearerbox [0x805eaf4]
2009-07-27 14:40:10 [3489] [29] PANIC: /usr/sbin/bearerbox(dlr_add+0x33b) [0x805d95b]
<....>

Since not being able to store the DLR in MySQL crashed Kannel, I believe
some effort should go into letting Kannel keep the connection alive
properly. I see 2 possible solutions:
1 - Have a timer that runs the mysql_check every n seconds (configurable, of
course)
2 - Send a SET wait_timeout on connection to override the value from MySQL;
however, in this case I don't know what we should set it to, since the
mysql_ping is based on activity... (Maybe 8h, since it's the default MySQL
value?)
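
A rough sketch of what solution 1 could look like, assuming the pool keeps a last-activity timestamp per connection. Everything here (pool_entry, keepalive_sweep, stub_ping_ok) is a hypothetical name for illustration and not Kannel's actual API; the callback stands in for a real check such as a wrapper around mysql_ping(), which returns 0 when the connection is alive.

```c
#include <stddef.h>
#include <time.h>

/* Hypothetical pool entry; names are illustrative, not Kannel's. */
struct pool_entry {
    time_t last_activity;   /* last query or successful ping */
    int    in_use;          /* non-zero while a thread holds the connection */
};

/* Performs the actual check; returns 0 when the connection is alive. */
typedef int (*ping_fn)(struct pool_entry *entry);

/* Stand-in for a real check (e.g. a mysql_ping() wrapper); always succeeds. */
int stub_ping_ok(struct pool_entry *entry)
{
    (void)entry;
    return 0;
}

/* Ping every free connection that has been idle for at least `interval`
 * seconds. Returns the number of pings attempted. */
size_t keepalive_sweep(struct pool_entry *pool, size_t n,
                       time_t now, long interval, ping_fn ping)
{
    size_t attempted = 0;
    for (size_t i = 0; i < n; i++) {
        if (pool[i].in_use)
            continue;                       /* owner is using it right now */
        if (difftime(now, pool[i].last_activity) < (double)interval)
            continue;                       /* recently active, nothing to do */
        if (ping(&pool[i]) == 0)
            pool[i].last_activity = now;    /* idle clock restarts on success */
        attempted++;
    }
    return attempted;
}
```

A timer thread would call keepalive_sweep() periodically with an interval safely below wait_timeout (for example 5 seconds against our wait_timeout=10), so that no free connection ever reaches the server-side limit.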

Solution 1 seems the best way to handle it; however, solution 2 may be
easier to implement. (Comments on those?)
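
For solution 2, the only new piece is the statement sent right after connecting. Below is a small sketch that builds it; build_wait_timeout_stmt is a hypothetical helper, not something that exists in Kannel. The resulting string would then be passed to mysql_query() on the fresh connection.

```c
#include <stdio.h>

/* Write "SET SESSION wait_timeout = <seconds>" into buf.
 * Returns 0 on success, -1 if seconds is not positive or buf is too small. */
int build_wait_timeout_stmt(char *buf, size_t buflen, long seconds)
{
    if (seconds <= 0)
        return -1;
    int n = snprintf(buf, buflen, "SET SESSION wait_timeout = %ld", seconds);
    return (n < 0 || (size_t)n >= buflen) ? -1 : 0;
}
```

With the 8h idea from above, the value would be 28800 seconds, giving "SET SESSION wait_timeout = 28800". Because it is a session-level SET, it only affects Kannel's own connections and leaves the global value of 10 untouched for the other applications.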

I may try one of those changes based on the feedback received, so feel free
to let me know your thoughts :)
-- 
Math
aka ROunofF
