-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviewboard.asterisk.org/r/3624/
-----------------------------------------------------------
(Updated June 25, 2014, 1:39 p.m.)
Review request for Asterisk Developers.
Changes
-------
Pulled in Mark's findings from r3659.
Bugs: ASTERISK-23917
https://issues.asterisk.org/jira/browse/ASTERISK-23917
Repository: Asterisk
Description
-------
*Note:* This issue was originally seen when sending large volumes of data to a
connected ARI client over a websocket, but it theoretically could occur with
any version of Asterisk that uses a websocket. This patch is for Asterisk 11
and only impacts chan_sip - a patch for Asterisk 12 will be on
https://reviewboard.asterisk.org/r/3659.
When a client takes a long time to process information received from Asterisk,
a write operation using fwrite may fail to write all information. This causes
the underlying file stream to be in an unknown state, such that the socket must
be disconnected. Unfortunately, there are two problems with this in Asterisk's
existing websocket code:
1. Periodically, during the read loop, Asterisk must write to the connected
websocket to respond to pings. As such, Asterisk maintains a reference to the
session during the loop. When ast_http_websocket_write fails, it may cause the
session to decrement its ref count, but this in and of itself does not break
the read loop. The read loop's own write, likewise, does not break the loop
if it fails. This leaves the socket in a 'stuck' state, preventing the client
from reconnecting to the server.
2. More importantly, the fwrite in ast_http_websocket_write fails with a large
volume of data when the client takes a while to process the information. When
it does fail, it has written only a portion of the bytes. Debugging showed
that this failure is similar to ASTERISK-12767. Switching this over to
ast_careful_fwrite with a long enough timeout solved the problem.
Diffs (updated)
-----
/branches/11/res/res_http_websocket.c 417210
/branches/11/include/asterisk/http_websocket.h 417210
/branches/11/configs/sip.conf.sample 417210
/branches/11/channels/sip/include/sip.h 417210
/branches/11/channels/chan_sip.c 417210
/branches/11/UPGRADE.txt 417210
Diff: https://reviewboard.asterisk.org/r/3624/diff/
Testing
-------
Prior to the patch (using Asterisk 12), sending information to a connected ARI
client for a large number of channels would periodically cause a disconnect.
Once disconnected, the client could not re-connect.
With the patch, the disconnects stopped. When the write timeout was set to a
very low value, the disconnects occurred again, and the client was able to
reconnect (as the previous socket was completely closed).
Thanks,
Matt Jordan