I have had three OpenSIPS server crashes in the last week, all due to
segmentation faults. I was not able to capture core dumps, but I am configuring
that now so I can catch the next crash.
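
For anyone wanting to capture the same data, a minimal sketch of the core-dump
side of the configuration follows. It assumes a Linux host with the default
kernel core_pattern. disable_core_dump is the OpenSIPS core parameter involved
(its default is already "no"); the /var/cores directory mentioned afterwards is
a hypothetical path used only for illustration.

```
# opensips.cfg, global parameters section
# Keep core dumps enabled; setting this to "yes" would force the core limit to 0.
disable_core_dump=no
```

With the default core_pattern the core file lands in the working directory of
the crashing process, so the other half of the setup is usually to start
OpenSIPS with -w /var/cores (hypothetical path, writable by the user OpenSIPS
runs as) and to run "ulimit -c unlimited" in the init script before launching it.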

My logs leading up to the crash are full of errors from fix_route_dialog()
complaining about invalid URIs for sequential requests:

    Jul 26 19:34:02 [220] ERROR:dialog:fix_route_dialog: Failed to parse SIP uri
    Jul 26 19:34:02 [220] ERROR:core:parse_uri: bad uri, state 0 parsed: <ip:1> (4) / <ip:10.18.8.18:5060;ftag=gK0448f137;lr;r2=on>> (44)
    Jul 26 19:11:19 [218] ERROR:dialog:fix_route_dialog: Failed to parse SIP uri
    Jul 26 19:11:19 [218] ERROR:core:parse_uri: bad uri, state 0 parsed: <b0i2> (4) / <b0i2yjor;transport=udp<sip:10.18.8.17:5060;ftag=7207ce89;lr;r2=on> (65)
    Jul 26 17:43:19 [220] ERROR:dialog:fix_route_dialog: Failed to parse SIP uri
    Jul 26 17:43:19 [220] ERROR:core:parse_uri: bad uri, state 0 parsed: <ervi> (4) / <ervice_id6fdbc70f-2438-4726-807c-0d081df4d87> (44)

Often the “URI” shown in the error message is not a URI at all but the contents
of internal OpenSIPS variables, as in the last error above. When the text does
come from the SIP message, I have verified that the message itself is correctly
formatted. This leads me to believe that memory corruption is occurring.

This all started when I updated my load-balancer servers to use Record-Routing,
specifically the “double_rr” mechanism used when multiple interfaces exist. The
Record-Routing is done on different servers, which have not crashed. Only the
servers receiving the Record-Routed messages are experiencing the errors.

Here is the part of my script that processes sequential requests. I am using
the topology_hiding() functionality of the Dialog module. Are validate_dialog()
and fix_route_dialog() still valid in a topology_hiding scenario?

    if (t_check_trans())
        setflag(SEQ_REQUEST);

    if (has_totag())
    {
        loose_route();

        if (match_dialog())
        {
            if (!validate_dialog())
                fix_route_dialog();

            if (is_method("BYE"))
                setflag(ACC_FLAG);

            setflag(SEQ_REQUEST);
        }
        else if (!isflagset(SEQ_REQUEST))
        {
            if (!is_method("ACK")) {
                route(rlog, LV_ERROR, "check_sequential", "Sequential request not matched");
                route(reply_error, "481", "Call Does Not Exist");
            }
            return(EXIT);
        }
    }

I will attempt to get core dumps of future crashes.
Thanks,
Ben Newlin