Hi,

Thanks for the very good bug report. Please try the attached patch, which should fix:
1) the typo; 2) the case where the DLR is not found in the storage.

Just apply it to the smpp-tlv branch.
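
For reference, here is a minimal standalone sketch of the ordering the patch introduces. It is not Kannel code: DemoMsg, demo_handle_dlr, demo_set_values, demo_process_dlr and demo_msg_destroy are hypothetical stand-ins for Msg, handle_dlr() and meta_data_set_values(), and a plain C string replaces the Octstr. It only illustrates the two points above: work on dlrmsg rather than msg, and touch it only after the NULL check.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for Kannel's Msg; only the one field used here. */
typedef struct {
    char *meta_data;   /* plain C string instead of an Octstr */
} DemoMsg;

/* Stand-in for handle_dlr(): returns NULL when no DLR entry is stored. */
static DemoMsg *demo_handle_dlr(int found_in_storage)
{
    if (!found_in_storage)
        return NULL;
    return calloc(1, sizeof(DemoMsg));   /* meta_data starts out NULL */
}

/* Stand-in for meta_data_set_values(): appends a tag to the buffer. */
static void demo_set_values(char **meta_data, const char *tag)
{
    size_t old_len = strlen(*meta_data);
    char *grown = realloc(*meta_data, old_len + strlen(tag) + 1);

    assert(grown != NULL);
    strcpy(grown + old_len, tag);
    *meta_data = grown;
}

/*
 * Patched ordering: every access goes through dlrmsg (not msg), and only
 * after the NULL check. Returns 1 if a DLR would be passed to the upper
 * layer, 0 if there was none.
 */
static int demo_process_dlr(DemoMsg *dlrmsg)
{
    if (dlrmsg != NULL) {
        if (dlrmsg->meta_data == NULL) {
            /* empty string, in the spirit of octstr_create("") */
            dlrmsg->meta_data = calloc(1, 1);
            assert(dlrmsg->meta_data != NULL);
        }
        demo_set_values(&dlrmsg->meta_data, "smpp");
        return 1;
    }
    return 0;
}

/* Frees a DemoMsg and its buffer; accepts NULL. */
static void demo_msg_destroy(DemoMsg *m)
{
    if (m != NULL) {
        free(m->meta_data);
        free(m);
    }
}
```

With this ordering, demo_process_dlr(demo_handle_dlr(0)) returns 0 without dereferencing anything, whereas the pre-patch ordering would read through a NULL pointer in that case.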

Giulio Harding wrote:

> Ok, after a bit more perusing of the code, I think I found the
> problem (line 1460 of gw/smsc/smsc_smpp.c):
> 
> ...
>              /* got a deliver ack (DLR)?
>                   * NOTE: following SMPP v3.4. spec. we are interested
>                   *       only on bits 2-5 (some SMSC's send 0x44, and it's
>                   *       spec. conforme)
>                   */
>              if (pdu->u.deliver_sm.esm_class & (0x04|0x08)) {
> 
>                  debug("bb.sms.smpp",0,"SMPP[%s] handle_pdu, got DLR",
>                        octstr_get_cstr(smpp->conn->id));
> 
>                  dlrmsg = handle_dlr(smpp, pdu->u.deliver_sm.source_addr, pdu->u.deliver_sm.short_message, pdu->u.deliver_sm.message_payload,
>                                      pdu->u.deliver_sm.receipted_message_id, pdu->u.deliver_sm.message_state);
>                  if (dlrmsg->sms.meta_data == NULL)
>                      dlrmsg->sms.meta_data = octstr_create("");
>                  meta_data_set_values(msg->sms.meta_data, pdu->u.deliver_sm.tlv, "smpp"); /* <--- should be 'dlrmsg->sms.meta_data' */
>                  resp = smpp_pdu_create(deliver_sm_resp,
>                              pdu->u.deliver_sm.sequence_number);
>                  if (dlrmsg != NULL)
>                      reason = bb_smscconn_receive(smpp->conn, dlrmsg);
>                  else
>                      reason = SMSCCONN_SUCCESS;
>                  resp->u.deliver_sm_resp.command_status = smscconn_failure_reason_to_smpp_status(reason);
>              } else {/* MO-SMS */
> ...
> 
> 
> I've changed that line (replaced 'msg' with 'dlrmsg') and rebuilt
> Kannel - it seems to be working fine so far, but I'll need some more
> traffic to know for sure (it's late here, so traffic has dropped off
> a bit)
> 
> Can someone confirm this problem + fix?
> 
> Thanks,
> 
> On 21/01/2008, at 11:57 PM, Giulio Harding wrote:
> 
>> I've been testing Alex's TLV patch with the meta-data branch, with
>> initial success (able to read and set mblox TLVs for US-specific
>> bind, as per Kyriacos's mblox TLV config - thanks for that!). It's
>> been performing perfectly on our test server, with a small number
>> of test binds, carrying test traffic.
>>
>> However, when I decided to try deploying that build to production,
>> bearerbox would crash shortly after startup (after varying delay,
>> sometimes 1 second, sometimes 10 or so). It would segfault, with no
>> indication in bearer.log (just a 'Connection closed by the
>> bearerbox' in smsbox.log) - the only indication was in
>> /var/log/messages, for example:
>>
>> Jan 21 23:46:27 smsgw2 kernel: bearerbox[10180]: segfault at 0000000000000118 rip 000000000044c229 rsp 000000005ea30060 error 4
>>
>> I initially thought it might be a 32-bit/64-bit thing (test server
>> is 32-bit, production is 64-bit), but I couldn't reproduce the
>> problem on another 64-bit machine, and recompiling with gcc4 didn't
>> help. (gcc 3.4 apparently can produce incorrect instructions on 64-
>> bit machines in some rare cases??) Turning on debug logging didn't
>> show anything useful either.
>>
>> I recompiled kannel with --with-defaults=debug, and tried attaching
>> gdb to bearerbox to see if I could get anything useful when it
>> segfaulted - here's the output from gdb:
>>
>> ...
>> 2008-01-21 23:59:28 [18588] [49] DEBUG: SMPP[optusfrmt]: Got PDU:
>> 2008-01-21 23:59:28 [18588] [49] DEBUG: SMPP PDU 0x6bcff60 dump:
>> 2008-01-21 23:59:28 [18588] [49] DEBUG:   type_name: deliver_sm
>> 2008-01-21 23:59:28 [18588] [49] DEBUG:   command_id: 5 = 0x00000005
>> 2008-01-21 23:59:28 [18588] [49] DEBUG:   command_status: 0 = 0x00000000
>> 2008-01-21 23:59:28 [18588] [49] DEBUG:   sequence_number: 1 = 0x00000001
>> 2008-01-21 23:59:28 [18588] [49] DEBUG:   service_type: "NOREP"
>> 2008-01-21 23:59:28 [18588] [49] DEBUG:   source_addr_ton: 1 = 0x00000001
>> 2008-01-21 23:59:28 [18588] [49] DEBUG:   source_addr_npi: 1 = 0x00000001
>> 2008-01-21 23:59:28 [18588] [49] DEBUG:   source_addr: "XXXX"
>> 2008-01-21 23:59:28 [18588] [49] DEBUG:   dest_addr_ton: 2 = 0x00000002
>> 2008-01-21 23:59:28 [18588] [49] DEBUG:   dest_addr_npi: 8 = 0x00000008
>> 2008-01-21 23:59:28 [18588] [49] DEBUG:   destination_addr: "19774777"
>> 2008-01-21 23:59:28 [18588] [49] DEBUG:   esm_class: 4 = 0x00000004
>> 2008-01-21 23:59:28 [18588] [49] DEBUG:   protocol_id: 0 = 0x00000000
>> 2008-01-21 23:59:28 [18588] [49] DEBUG:   priority_flag: 0 = 0x00000000
>> 2008-01-21 23:59:28 [18588] [49] DEBUG:   schedule_delivery_time: NULL
>> 2008-01-21 23:59:28 [18588] [49] DEBUG:   validity_period: NULL
>> 2008-01-21 23:59:28 [18588] [49] DEBUG:   registered_delivery: 0 = 0x00000000
>> 2008-01-21 23:59:28 [18588] [49] DEBUG:   replace_if_present_flag: 0 = 0x00000000
>> 2008-01-21 23:59:28 [18588] [49] DEBUG:   data_coding: 0 = 0x00000000
>> 2008-01-21 23:59:28 [18588] [49] DEBUG:   sm_default_msg_id: 0 = 0x00000000
>> 2008-01-21 23:59:28 [18588] [49] DEBUG:   sm_length: 122 = 0x0000007a
>> 2008-01-21 23:59:28 [18588] [49] DEBUG:   short_message:
>> 2008-01-21 23:59:28 [18588] [49] DEBUG:    Octet string at 0x6be5f30:
>> 2008-01-21 23:59:28 [18588] [49] DEBUG:      len:  122
>> 2008-01-21 23:59:28 [18588] [49] DEBUG:      size: 123
>> 2008-01-21 23:59:28 [18588] [49] DEBUG:      immutable: 0
>> 2008-01-21 23:59:28 [18588] [49] DEBUG:      data: 69 64 3a 31 34 32 37 31 35 39 33 37 36 20 73 75   id:1427159376 su
>> 2008-01-21 23:59:28 [18588] [49] DEBUG:      data: 62 3a 30 30 31 20 64 6c 76 72 64 3a 30 30 31 20   b:001 dlvrd:001
>> 2008-01-21 23:59:28 [18588] [49] DEBUG:      data: 73 75 62 6d 69 74 20 64 61 74 65 3a 30 38 30 31   submit date:0801
>> 2008-01-21 23:59:28 [18588] [49] DEBUG:      data: 32 31 32 33 35 39 20 64 6f 6e 65 20 64 61 74 65   212359 done date
>> 2008-01-21 23:59:28 [18588] [49] DEBUG:      data: 3a 30 38 30 31 32 31 32 33 35 39 20 73 74 61 74   :0801212359 stat
>> 2008-01-21 23:59:28 [18588] [49] DEBUG:      data: 3a 44 45 4c 49 56 52 44 20 65 72 72 3a 30 30 30   :DELIVRD err:000
>> 2008-01-21 23:59:28 [18588] [49] DEBUG:      data: 20 74 65 78 74 3a 43 68 20 37 3a 20 54 68 6e 78    text:Ch 7: Thnx
>> 2008-01-21 23:59:28 [18588] [49] DEBUG:      data: 20 34 20 65 6e 74 65 72 69 6e                      4 enterin
>> 2008-01-21 23:59:28 [18588] [49] DEBUG:    Octet string dump ends.
>> 2008-01-21 23:59:28 [18588] [49] DEBUG: SMPP PDU dump ends.
>> 2008-01-21 23:59:28 [18588] [49] DEBUG: SMPP[optusfrmt] handle_pdu, got DLR
>> 2008-01-21 23:59:28 [18588] [49] DEBUG: DLR[pgsql]: Looking for DLR smsc=optusfrmt, ts=1427159376, dst=XXXX, type=1
>> 2008-01-21 23:59:28 [18588] [49] DEBUG: sql: SELECT mask, service, url, source, destination, boxc FROM dlr WHERE smsc='optusfrmt' AND ts='1427159376' LIMIT 1;
>> 2008-01-21 23:59:28 [18588] [49] DEBUG: Found entry, col1=31, col2=apg, col3=http://apg:8888/dlr/kannel?i=44921898&t=%T&c=%d&m=%A, col4=19774777, col5=XXXX col6=
>> 2008-01-21 23:59:28 [18588] [49] DEBUG: DLR[pgsql]: created DLR message for URL <http://apg:8888/dlr/kannel?i=44921898&t=%T&c=%d&m=%A>
>> 2008-01-21 23:59:28 [18588] [49] DEBUG: removing DLR from database
>> 2008-01-21 23:59:28 [18588] [49] DEBUG: sql: DELETE FROM dlr WHERE smsc='optusfrmt' AND ts='1427159376';
>>
>> Program received signal SIGSEGV, Segmentation fault.
>> [Switching to Thread 1577253216 (LWP 18639)]
>> 0x000000000044ebc1 in handle_pdu (smpp=0x2a95c0e3d0, conn=0x6b4f8c0,
>>     pdu=0x6bcff60, pending_submits=0x5e02f0f0) at gw/smsc/smsc_smpp.c:1460
>> 1460                    meta_data_set_values(msg->sms.meta_data, pdu->u.deliver_sm.tlv, "smpp");
>>
>>
>> So, it seems the segfault is triggered by TLV handling for a
>> certain kind of DLR? (We had DLRs coming in on our test binds, and
>> that didn't cause a problem). I'm a complete GDB noob, so if
>> there's anything else I can do to provide more information, please
>> let me know.
>>
>> Any ideas why that meta_data_set_values function call would die with
>> that DLR? Any assistance would be greatly appreciated!
>>
>> FYI, Kannel details are:
>>
>> Kannel bearerbox version `cvs-20071018'. Build `Jan 21 2008
>> 23:52:11', compiler `3.4.6 20060404 (Red Hat 3.4.6-9)'. System
>> Linux, release 2.6.9-55.0.2.ELsmp, version #1 SMP Tue Jun 26
>> 14:14:47 EDT 2007, machine x86_64. Hostname
>> smsgw2.appgw.mnetcorporation.com, IP 10.110.123.31. Libxml version
>> 2.6.16. Using checking malloc.
>>
>> Thanks,
>>
>> --
>> Giulio Harding
>> Systems Administrator
>>
>> m.Net Corporation
>> Level 2, 8 Leigh Street
>> Adelaide SA 5000, Australia
>>
>> Tel: +61 8 8210 2041
>> Fax: +61 8 8211 9620
>> Mobile: 0432 876 733
>> Yahoo: giulio.harding
>> MSN: [EMAIL PROTECTED]
>>
>> http://www.mnetcorporation.com
>>
>>
>>
> 
> --
> Giulio Harding
> Systems Administrator
> 
> m.Net Corporation
> Level 2, 8 Leigh Street
> Adelaide SA 5000, Australia
> 
> Tel: +61 8 8210 2041
> Fax: +61 8 8211 9620
> Mobile: 0432 876 733
> Yahoo: giulio.harding
> MSN: [EMAIL PROTECTED]
> 
> http://www.mnetcorporation.com

-- 
Thanks,
Alex
=== gw/smsc/smsc_smpp.c
==================================================================
--- gw/smsc/smsc_smpp.c	(revision 312)
+++ gw/smsc/smsc_smpp.c	(local)
@@ -1364,11 +1364,10 @@
                       octstr_get_cstr(smpp->conn->id));
                 dlrmsg = handle_dlr(smpp, pdu->u.data_sm.source_addr, NULL, pdu->u.data_sm.message_payload,
                                     pdu->u.data_sm.receipted_message_id, pdu->u.data_sm.message_state);
-                if (dlrmsg->sms.meta_data == NULL)
-                    dlrmsg->sms.meta_data = octstr_create("");
-                meta_data_set_values(msg->sms.meta_data, pdu->u.data_sm.tlv, "smpp");
-
                 if (dlrmsg != NULL) {
+                    if (dlrmsg->sms.meta_data == NULL)
+                        dlrmsg->sms.meta_data = octstr_create("");
+                    meta_data_set_values(dlrmsg->sms.meta_data, pdu->u.data_sm.tlv, "smpp");
                     /* passing DLR to upper layer */
                     reason = bb_smscconn_receive(smpp->conn, dlrmsg);
                 } else {
