Daniel/Henning,
The root cause of the crash is in the sql_connect function in
sqlops/sql_api.c. I have pasted the function below for reference while
reviewing the notes that follow it:

int sql_connect(int mode)
{
    sql_con_t *sc;
    sc = _sql_con_root;
    while(sc)
    {
        if (db_bind_mod(&sc->db_url, &sc->dbf))
        {
            LM_DBG("database module not found for [%.*s]\n",
                    sc->name.len, sc->name.s);
            return -1;
        }
        if (!DB_CAPABILITY(sc->dbf, DB_CAP_RAW_QUERY))
        {
            LM_ERR("database module does not have DB_CAP_ALL [%.*s]\n",
                    sc->name.len, sc->name.s);
            return -1;
        }
        sc->dbh = sc->dbf.init(&sc->db_url);
        if (sc->dbh==NULL)
        {
            if(mode) {
                LM_ERR("failed to connect to the database [%.*s]\n",
                        sc->name.len, sc->name.s);
                return -1;
            } else {
                LM_INFO("failed to connect to the database [%.*s] - trying next\n",
                        sc->name.len, sc->name.s);
            }
        }
        sc = sc->next;
    }
    return 0;
}

Notice the if(mode) clause. It looks like its two branches need to be
swapped. That is, if mode is set, log the failure and continue trying to
connect to the other database instances. If mode is not set, return failure
(-1) immediately.
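
To make the proposed swap concrete, here is a minimal, self-contained model of it. This is only a sketch and not a patch against sqlops: con_t, fake_init and connect_all are hypothetical stand-ins for sql_con_t, sc->dbf.init and sql_connect, and the reachable flag is an assumption used to simulate a database that is down.

```c
#include <stddef.h>
#include <stdio.h>

/* Hypothetical stand-in for sql_con_t: only the fields this sketch needs. */
typedef struct con {
    const char *name;
    int reachable;      /* assumption: simulates whether the database is up */
    int attempted;      /* set once a connection attempt has been made */
    void *dbh;          /* NULL until connected */
    struct con *next;
} con_t;

/* Hypothetical stand-in for sc->dbf.init(): returns NULL on failure. */
static void *fake_init(con_t *c)
{
    c->attempted = 1;
    return c->reachable ? (void *)c : NULL;
}

/* Swapped branches as proposed above: when mode is set, log the failure
 * and move on to the next instance; when mode is 0, stop at the first
 * failure and return -1. */
static int connect_all(con_t *root, int mode)
{
    con_t *sc;

    for (sc = root; sc != NULL; sc = sc->next) {
        sc->dbh = fake_init(sc);
        if (sc->dbh == NULL) {
            if (mode) {
                printf("failed to connect to [%s] - trying next\n", sc->name);
            } else {
                printf("failed to connect to [%s]\n", sc->name);
                return -1;
            }
        }
    }
    return 0;
}
```

With a list whose first entry is unreachable, connect_all(root, 1) still visits every later entry, while connect_all(root, 0) stops at the first failure and leaves the rest untouched.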

The setup for the crash occurs when sql_connect encounters a database
instance it cannot connect to and returns an error while more database
instances remain in the sql_con_t linked list.

If an SQL query is later submitted to one of those remaining instances (the
ones we never tried to connect to because of the premature return),
sql_reconnect gets called, because the connection structure for that instance
has already been initialized. Unfortunately, since sql_connect never made an
attempt to connect to that instance, its sc->dbf member is still null.
Basically, this piece of code in sql_connect never gets executed for the
remaining database instances in the linked list:

    if (db_bind_mod(&sc->db_url, &sc->dbf))

sc->dbf remains null, and accessing it in sql_reconnect causes the
segmentation fault. This is clearly seen in the gdb output.
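
The failure mode described above can be reproduced in a small stand-alone model. Again, this is a sketch under assumptions and not the sqlops code: dbfuncs_t, bind_mod, connect_early_return and reconnect_guarded are hypothetical names standing in for the function table in sc->dbf, db_bind_mod, the current sql_connect behaviour and a defensively written reconnect. The NULL check in reconnect_guarded only illustrates where the crash would otherwise happen.

```c
#include <stddef.h>

/* Hypothetical stand-in for the function table stored in sc->dbf. */
typedef struct dbfuncs {
    void *(*init)(const char *url);
} dbfuncs_t;

/* Hypothetical stand-in for sql_con_t. */
typedef struct con {
    const char *url;
    int reachable;      /* assumption: simulates whether the database is up */
    dbfuncs_t dbf;      /* stays zeroed until bind_mod() has run */
    void *dbh;
    struct con *next;
} con_t;

static int dummy_handle;

static void *ok_init(const char *url)   { (void)url; return &dummy_handle; }
static void *fail_init(const char *url) { (void)url; return NULL; }

/* Hypothetical stand-in for db_bind_mod(): fills in the function table. */
static int bind_mod(con_t *c)
{
    c->dbf.init = c->reachable ? ok_init : fail_init;
    return 0;
}

/* Models the early return: the loop stops at the first failed connection,
 * so every later entry keeps a zeroed function table. */
static int connect_early_return(con_t *root)
{
    con_t *sc;

    for (sc = root; sc != NULL; sc = sc->next) {
        if (bind_mod(sc))
            return -1;
        sc->dbh = sc->dbf.init(sc->url);
        if (sc->dbh == NULL)
            return -1;
    }
    return 0;
}

/* Calling sc->dbf.init without this check on an unbound entry would
 * dereference a NULL function pointer. */
static int reconnect_guarded(con_t *sc)
{
    if (sc == NULL || sc->dbf.init == NULL)
        return -1;
    sc->dbh = sc->dbf.init(sc->url);
    return sc->dbh != NULL ? 0 : -1;
}
```

In this model, after connect_early_return fails on the first entry, the second entry still has dbf.init == NULL, which is the state that an unguarded reconnect would crash on.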

I have tested the code with the logic in the if(mode) statement reversed, and
everything works well.

If you agree with my analysis, please let me know how we should proceed here.
Either I can make the change or you can. I am fine with either.

Thanks,
Karthik
--
You are receiving this because you are subscribed to this thread.
Reply to this email directly or view it on GitHub:
https://github.com/kamailio/kamailio/issues/1821#issuecomment-479989062