Patch applied. Current revision is 76.

-----Original Message-----
From: devel [mailto:[email protected]] On Behalf Of Alexander Malysh
Sent: maandag 11 november 2013 18:25
To: [email protected]
Cc: [email protected] org
Subject: Re: [PATCH] opensmppbox: ESME disconnection fix (plain text)

Hi,

similar issue was fixed in bearerbox itself, ++1

Alex

Am 09.11.2013 um 06:26 schrieb [email protected]:

> Greetings.
> 
> There is an issue of sporadic disconnections of some ESMEs by 
> OPENSMPPBOX particularly when operated under high load and a high 
> latency ESME to OPENSMPPBOX network link. Please find the problem 
> explained in detail below. There is a suggested patch (against Revision:
> 75 of OPENSMPPBOX) attached. Kindly review and commit accordingly.
> 
> (A) The problem explained:
> 
> It is tricky to reproduce the bug. However one verified possibility of 
> occurrence is when the following conditions are met:
> 
> 1) The OPENSMPPBOX's client (ESME) sends submit_sm's with a high 
> submission rate. (for example, with a high 'max-pending-submits' 
> setting when using bearerbox as the ESME)
> 2) The high submission rate causes the transmission-layer between the 
> ESME and OPENSMPPBOX to often split a submit_sm PDU across 2 different 
> TCP segments.
> 3) OPENSMPPBOX calls a 'read_pdu' in the 'smpp_to_bearerbox' thread 
> between the arrival of these 2 segments.
> 4) The entire PDU-data is not available and 'smpp_pdu_read_data' call 
> returns NULL. The 'smpp_to_bearerbox' thread does not retain the value 
> of the PDU-length in these situations (although it is strictly 
> required to; refer to comments for 'read_pdu' in the source code)
> 5) The successive 'read_pdu' call forgets about the previously 
> unsuccessful 'smpp_pdu_read_data' call and now attempts to read the 
> 2nd four octets of the original PDU (via the 'smpp_pdu_read_len' call) 
> assuming this to be the 'command_length' field of a newly arrived PDU.
> (Actually what is now read is the "command_id" field of the original
> PDU)
> 6) This attempt to read a wrong value as the PDU-length, causes 
> 'smpp_pdu_read_len' to return -1 if the value read is decoded as less 
> than 'MIN_SMPP_PDU_LEN'. (This will always be the case for submit_sm 
> PDU's since the 'command_id' field is 0x00000004 which decodes as 4 
> and is less than 16)
> 7) Subsequently 'read_pdu' returns -1 and the 'smpp_to_bearerbox' 
> thread shuts down the ESME connection and terminates.
> 
> 
> (B) SUGGESTED PATCH (against Revision: 75 of OPENSMPPBOX):
> 
> --- opensmppbox.c (revision 75)
> +++ opensmppbox.c (working copy)
> @@ -1862,8 +1862,8 @@
>     long len;
> 
>     box->last_pdu_received = time(NULL);
> +    len = 0;
>     while (smppbox_status == SMPP_RUNNING && box->alive) {
> -               len = 0;
>                switch (read_pdu(box, conn, &len, &pdu)) {
>                case -1:
>                        error(0, "Invalid SMPP PDU received.");
> 
> 
> 
> (C) Illustration of a reconnection:
> 
> (C.1) Relevant excerpts from the OPENSMPPBOX log-file with log-level =
> 0:
> 
> 2013-11-08 23:17:58 [20141] [9] DEBUG:   type_name: submit_sm_resp
> 2013-11-08 23:17:58 [20141] [9] DEBUG:   command_id: 2147483652 =
> 0x80000004
> 2013-11-08 23:17:58 [20141] [9] DEBUG:   command_status: 0 = 0x00000000
> 2013-11-08 23:17:58 [20141] [9] DEBUG:   sequence_number: 38101 =
> 0x000094d5
> 2013-11-08 23:17:58 [20141] [9] DEBUG:   message_id:
> 2013-11-08 23:17:58 [20141] [9] DEBUG:    Octet string at
> 0x7fcd0c016200:
> 2013-11-08 23:17:58 [20141] [9] DEBUG:      len:  36
> 2013-11-08 23:17:58 [20141] [9] DEBUG:      size: 37
> 2013-11-08 23:17:58 [20141] [9] DEBUG:      immutable: 0
> 2013-11-08 23:17:58 [20141] [9] DEBUG:      data: 38 36 65 36 38 35 31
> 63 2d 66 64 34 30 2d 34 63   86e6851c-fd40-4c
> 2013-11-08 23:17:58 [20141] [9] DEBUG:      data: 38 62 2d 39 61 39 37
> 2d 31 62 31 39 63 39 61 30   8b-9a97-1b19c9a0
> 2013-11-08 23:17:58 [20141] [9] DEBUG:      data: 38 38 66 30           
>                           88f0
> 2013-11-08 23:17:58 [20141] [9] DEBUG:    Octet string dump ends.
> 2013-11-08 23:17:58 [20141] [9] DEBUG: SMPP PDU dump ends.
> 2013-11-08 23:17:59 [20141] [10] ERROR: SMPP: PDU length was too small 
> (4, minimum is 16).
> 2013-11-08 23:17:59 [20141] [10] ERROR: opensmppbox[user]: Server sent 
> garbage, ignored.
> 2013-11-08 23:17:59 [20141] [10] ERROR: Invalid SMPP PDU received.
> 2013-11-08 23:17:59 [20141] [10] DEBUG: Thread 10
> (opensmppbox.c:smpp_to_bearerbox) terminates.
> 2013-11-08 23:17:59 [20141] [9] DEBUG: Thread 9 
> (opensmppbox.c:function) terminates.
> 2013-11-08 23:18:10 [20141] [0] DEBUG: Started thread 11
> (opensmppbox.c:function)
> 2013-11-08 23:18:10 [20141] [11] DEBUG: Thread 11
> (opensmppbox.c:function) maps to pid 20141.
> 2013-11-08 23:18:10 [20141] [11] INFO: Client connected from 
> <XXX.XXX.XXX.XXX>
> 
> Note:
> (C.1.1) The last submit_sm_resp is being sent for the 'sequence_number:
> 38101'
> (C.1.2) Reading the PDU with sequence_number: 38102 is what causes the 
> ERROR and the subsequent termination of the 'smpp_to_bearerbox' 
> 'Thread 10' (Refer to notes under section (C.2) below)
> (C.1.3) The ESME gets disconnected and initiates a new connection to 
> OPENSMPPBOX.
> 
> 
> (C.2) Relevant excerpts from the captured packets analyzed:
> 
> The Reassembled submit_sm PDU with 'sequence_number: 38102'
> 
> 0000  00 00 00 38 00 00 00 04  00 00 00 00 00 00 94 d6   ...8....
> ........
> 0010  00 05 00 58 58 2d 4b 41  4e 4e 45 4c 00 00 01 31   ...XX-KA
> NNEL...1
> 0020  32 33 34 35 36 37 38 39  30 00 03 00 00 00 00 01   23456789
> 0.......
> 0030  00 00 00 04 54 45 53 54                            ....TEST       
> 
> 
> Note:
> (C.2.1) The 56 byte submit_sm PDU with 'sequence_number: 38102' 
> arrives in 2 parts - 40 bytes as the trailing part of the 1st 
> segment's payload and 16 bytes in the leading part of the second segment's
payload.
> (C.2.2) The delay in arrival of the 2nd segment makes PDU-data 
> unavailable to be read in its entirety. The original 'command_length'
> field 0x00000038 is discarded.
> (C.2.3) The 2nd four octets 0x00000004 (which is in fact the original 
> 'command_id' field) is now read in as the PDU-length causing the error.
> 
> 
> P.S: This message is being posted to the KANNEL developers' mailing 
> list directly because the registration process at redmine.kannel.org 
> failed to sent a confirmation email.
> 
> Thanks and Best Regards
> 
> Juno Lopez
> Director - R&D
> 
> APRICOT
> Trivandrum, India
> http://apricot.net.in
> 
> <patch_wrt_75.txt>




Reply via email to