Thanks, looks good so far! I'll report back in a few hours after traffic has picked up again, and let you know how the patched-patched Kannel is faring :)

Cheers,

On 22/01/2008, at 2:03 AM, Alexander Malysh wrote:

Hi,

you are right. Thanks!

Please try the attached patch, which combines the fixes for both cases.
It applies on top of a clean smpp-tlv branch.

Giulio Harding wrote:

Also, it looks like there are two instances of the same typo, one at
line 1396, the other at line 1460.

Should the second instance, at line 1460, look something like this?

...
                 if (dlrmsg != NULL) {
                     if (dlrmsg->sms.meta_data == NULL)
                         dlrmsg->sms.meta_data = octstr_create("");
                     meta_data_set_values(dlrmsg->sms.meta_data,
                                          pdu->u.deliver_sm.tlv, "smpp");
                 }
...
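
A stripped-down illustration of why the NULL check has to wrap both statements. Everything named here (fake_msg, fake_handle_dlr, attach_meta) is a stand-in invented for this sketch and is not Kannel's real API; the only thing taken from the thread is the ordering: test the DLR message for NULL before touching its meta_data.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for Kannel's Msg: only the one field we need. */
struct fake_msg {
    char *meta_data;
};

/* Hypothetical stand-in for handle_dlr(): returns NULL when no matching
 * DLR entry exists in the storage, otherwise a zeroed message. */
struct fake_msg *fake_handle_dlr(int found)
{
    if (!found)
        return NULL;
    return calloc(1, sizeof(struct fake_msg));
}

/* Attach TLV text to the DLR message.
 * Returns 1 on success, 0 if there is no message or allocation failed. */
int attach_meta(struct fake_msg *dlrmsg, const char *tlv_text)
{
    char *copy;

    if (dlrmsg == NULL)
        return 0;                 /* no DLR found: nothing to annotate */

    copy = malloc(strlen(tlv_text) + 1);
    if (copy == NULL)
        return 0;
    strcpy(copy, tlv_text);

    free(dlrmsg->meta_data);      /* free(NULL) is a no-op */
    dlrmsg->meta_data = copy;
    return 1;
}
```

Calling attach_meta(fake_handle_dlr(0), "x") returns 0 instead of dereferencing a NULL pointer, which is the "DLR not found" case discussed in this thread.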


On 22/01/2008, at 1:32 AM, Kyriacos Sakkas wrote:

Hi,
    There is some ambiguity in the patch. The new lines are inserted
lower than where the old lines were removed, specifically after an IF
statement that would negate this code block:
                 if (dlrmsg != NULL) {
+                    if (dlrmsg->sms.meta_data == NULL)

This looks like an error in the diff output.

Kyriacos

Alexander Malysh wrote:
Hi,

thanks for the very good bug report. Please try the attached patch,
which should fix:
1) the typo; 2) the case where the DLR is not found in the storage.

Just apply it to the smpp-tlv branch.

Giulio Harding wrote:


Ok, after a bit more perusing of the code, I think I found the
problem (line 1460 of gw/smsc/smsc_smpp.c):

...
             /* got a deliver ack (DLR)?
              * NOTE: following SMPP v3.4. spec. we are interested
              *       only on bits 2-5 (some SMSC's send 0x44, and it's
              *       spec. conforme)
              */
             if (pdu->u.deliver_sm.esm_class & (0x04|0x08)) {

                 debug("bb.sms.smpp",0,"SMPP[%s] handle_pdu, got DLR",
                       octstr_get_cstr(smpp->conn->id));

                 dlrmsg = handle_dlr(smpp, pdu->u.deliver_sm.source_addr,
                                     pdu->u.deliver_sm.short_message,
                                     pdu->u.deliver_sm.message_payload,
                                     pdu->u.deliver_sm.receipted_message_id,
                                     pdu->u.deliver_sm.message_state);
                 if (dlrmsg->sms.meta_data == NULL)
                     dlrmsg->sms.meta_data = octstr_create("");
                 meta_data_set_values(msg->sms.meta_data,
                                      pdu->u.deliver_sm.tlv, "smpp"); /* <--- should be 'dlrmsg->sms.meta_data' */
                 resp = smpp_pdu_create(deliver_sm_resp,
                             pdu->u.deliver_sm.sequence_number);
                 if (dlrmsg != NULL)
                     reason = bb_smscconn_receive(smpp->conn, dlrmsg);
                 else
                     reason = SMSCCONN_SUCCESS;
                 resp->u.deliver_sm_resp.command_status =
                     smscconn_failure_reason_to_smpp_status(reason);
             } else {/* MO-SMS */
...
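
As a side note on the esm_class test quoted above: a small sketch of the same mask check in isolation. The macro names are made up for readability; the bit values follow the SMPP v3.4 esm_class layout (0x04 for an SMSC delivery receipt, 0x08 for an SME delivery acknowledgement, 0x40 for the UDHI indicator). It shows why 0x44, mentioned in the code comment, still counts as a DLR.

```c
/* Illustrative names only; the values are the SMPP v3.4 esm_class bits. */
#define ESM_SMSC_DELIVERY_RECEIPT 0x04
#define ESM_SME_DELIVERY_ACK      0x08
#define ESM_UDHI                  0x40

/* Same test as in the quoted handle_pdu() code: any of the two
 * "delivery receipt / acknowledgement" bits marks the PDU as a DLR,
 * regardless of what is set in the upper bits. */
int is_dlr(unsigned esm_class)
{
    return (esm_class & (ESM_SMSC_DELIVERY_RECEIPT | ESM_SME_DELIVERY_ACK)) != 0;
}
```

With this, is_dlr(0x04) and is_dlr(0x44) are both true, while is_dlr(0x00) and is_dlr(0x40) are false.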


I've changed that line (replaced 'msg' with 'dlrmsg') and rebuilt
Kannel - it seems to be working fine so far, but I'll need some more
traffic to know for sure (it's late here, so traffic has dropped off
a bit).

Can someone confirm this problem + fix?

Thanks,

On 21/01/2008, at 11:57 PM, Giulio Harding wrote:


I've been testing Alex's TLV patch with the meta-data branch, with
initial success (able to read and set mblox TLVs for US-specific
bind, as per Kyriacos's mblox TLV config - thanks for that!). It's
been performing perfectly on our test server, with a small number
of test binds, carrying test traffic.

However, when I decided to try deploying that build to production,
bearerbox would crash shortly after startup (after varying delay,
sometimes 1 second, sometimes 10 or so). It would segfault, with no
indication in bearer.log (just a 'Connection closed by the
bearerbox' in smsbox.log) - the only indication was in
/var/log/messages, for example:

Jan 21 23:46:27 smsgw2 kernel: bearerbox[10180]: segfault at 0000000000000118 rip 000000000044c229 rsp 000000005ea30060 error 4
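
For anyone reading such a kernel line: the trailing "error 4" is the x86 page-fault error code, and its low three bits can be decoded as in the sketch below. The function names and the one-page (4096 byte) threshold are assumptions made for this illustration. A fault address as small as 0x118 is consistent with reading a struct field through a NULL pointer, though the kernel line alone does not prove that.

```c
#include <stdio.h>

/* Low bits of the x86 page-fault error code. */
#define PF_PROT  0x01  /* 0: page not present, 1: protection violation */
#define PF_WRITE 0x02  /* 0: read access,      1: write access         */
#define PF_USER  0x04  /* 0: kernel mode,      1: user mode            */

/* Write a short human-readable description of the error code into out. */
void describe_fault(unsigned err, char *out, size_t n)
{
    snprintf(out, n, "%s-mode %s, %s",
             (err & PF_USER)  ? "user"  : "kernel",
             (err & PF_WRITE) ? "write" : "read",
             (err & PF_PROT)  ? "protection violation" : "page not present");
}

/* Heuristic (an assumption, not a rule): an address inside the first
 * page usually means "NULL pointer plus a small field offset". */
int looks_like_null_deref(unsigned long addr)
{
    return addr < 4096UL;
}
```

For the line above, describe_fault(4, ...) yields "user-mode read, page not present", and looks_like_null_deref(0x118) is true.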

I initially thought it might be a 32-bit/64-bit thing (test server
is 32-bit, production is 64-bit), but I couldn't reproduce the
problem on another 64-bit machine, and recompiling with gcc4 didn't
help. (gcc 3.4 apparently can produce incorrect instructions on
64-bit machines in some rare cases?) Turning on debug logging didn't
show anything useful either.

I recompiled kannel with --with-defaults=debug, and tried attaching
gdb to bearerbox to see if I could get anything useful when it
segfaulted - here's the output from gdb:

...
2008-01-21 23:59:28 [18588] [49] DEBUG: SMPP[optusfrmt]: Got PDU:
2008-01-21 23:59:28 [18588] [49] DEBUG: SMPP PDU 0x6bcff60 dump:
2008-01-21 23:59:28 [18588] [49] DEBUG:   type_name: deliver_sm
2008-01-21 23:59:28 [18588] [49] DEBUG:   command_id: 5 = 0x00000005
2008-01-21 23:59:28 [18588] [49] DEBUG:   command_status: 0 = 0x00000000
2008-01-21 23:59:28 [18588] [49] DEBUG:   sequence_number: 1 = 0x00000001
2008-01-21 23:59:28 [18588] [49] DEBUG:   service_type: "NOREP"
2008-01-21 23:59:28 [18588] [49] DEBUG:   source_addr_ton: 1 = 0x00000001
2008-01-21 23:59:28 [18588] [49] DEBUG:   source_addr_npi: 1 = 0x00000001
2008-01-21 23:59:28 [18588] [49] DEBUG:   source_addr: "XXXX"
2008-01-21 23:59:28 [18588] [49] DEBUG:   dest_addr_ton: 2 = 0x00000002
2008-01-21 23:59:28 [18588] [49] DEBUG:   dest_addr_npi: 8 = 0x00000008
2008-01-21 23:59:28 [18588] [49] DEBUG:   destination_addr: "19774777"
2008-01-21 23:59:28 [18588] [49] DEBUG:   esm_class: 4 = 0x00000004
2008-01-21 23:59:28 [18588] [49] DEBUG:   protocol_id: 0 = 0x00000000
2008-01-21 23:59:28 [18588] [49] DEBUG:   priority_flag: 0 = 0x00000000
2008-01-21 23:59:28 [18588] [49] DEBUG:   schedule_delivery_time: NULL
2008-01-21 23:59:28 [18588] [49] DEBUG:   validity_period: NULL
2008-01-21 23:59:28 [18588] [49] DEBUG:   registered_delivery: 0 = 0x00000000
2008-01-21 23:59:28 [18588] [49] DEBUG:   replace_if_present_flag: 0 = 0x00000000
2008-01-21 23:59:28 [18588] [49] DEBUG:   data_coding: 0 = 0x00000000
2008-01-21 23:59:28 [18588] [49] DEBUG:   sm_default_msg_id: 0 = 0x00000000
2008-01-21 23:59:28 [18588] [49] DEBUG:   sm_length: 122 = 0x0000007a
2008-01-21 23:59:28 [18588] [49] DEBUG:   short_message:
2008-01-21 23:59:28 [18588] [49] DEBUG:    Octet string at 0x6be5f30:
2008-01-21 23:59:28 [18588] [49] DEBUG:      len:  122
2008-01-21 23:59:28 [18588] [49] DEBUG:      size: 123
2008-01-21 23:59:28 [18588] [49] DEBUG:      immutable: 0
2008-01-21 23:59:28 [18588] [49] DEBUG:      data: 69 64 3a 31 34 32 37 31 35 39 33 37 36 20 73 75   id:1427159376 su
2008-01-21 23:59:28 [18588] [49] DEBUG:      data: 62 3a 30 30 31 20 64 6c 76 72 64 3a 30 30 31 20   b:001 dlvrd:001
2008-01-21 23:59:28 [18588] [49] DEBUG:      data: 73 75 62 6d 69 74 20 64 61 74 65 3a 30 38 30 31   submit date:0801
2008-01-21 23:59:28 [18588] [49] DEBUG:      data: 32 31 32 33 35 39 20 64 6f 6e 65 20 64 61 74 65   212359 done date
2008-01-21 23:59:28 [18588] [49] DEBUG:      data: 3a 30 38 30 31 32 31 32 33 35 39 20 73 74 61 74   :0801212359 stat
2008-01-21 23:59:28 [18588] [49] DEBUG:      data: 3a 44 45 4c 49 56 52 44 20 65 72 72 3a 30 30 30   :DELIVRD err:000
2008-01-21 23:59:28 [18588] [49] DEBUG:      data: 20 74 65 78 74 3a 43 68 20 37 3a 20 54 68 6e 78    text:Ch 7: Thnx
2008-01-21 23:59:28 [18588] [49] DEBUG:      data: 20 34 20 65 6e 74 65 72 69 6e                      4 enterin
2008-01-21 23:59:28 [18588] [49] DEBUG:    Octet string dump ends.
2008-01-21 23:59:28 [18588] [49] DEBUG: SMPP PDU dump ends.
2008-01-21 23:59:28 [18588] [49] DEBUG: SMPP[optusfrmt] handle_pdu, got DLR
2008-01-21 23:59:28 [18588] [49] DEBUG: DLR[pgsql]: Looking for DLR smsc=optusfrmt, ts=1427159376, dst=XXXX, type=1
2008-01-21 23:59:28 [18588] [49] DEBUG: sql: SELECT mask, service, url, source, destination, boxc FROM dlr WHERE smsc='optusfrmt' AND ts='1427159376' LIMIT 1;
2008-01-21 23:59:28 [18588] [49] DEBUG: Found entry, col1=31, col2=apg, col3=http://apg:8888/dlr/kannel?i=44921898&t=%T&c=%d&m=%A, col4=19774777, col5=XXXX col6=
2008-01-21 23:59:28 [18588] [49] DEBUG: DLR[pgsql]: created DLR message for URL <http://apg:8888/dlr/kannel?i=44921898&t=%T&c=%d&m=%A>
2008-01-21 23:59:28 [18588] [49] DEBUG: removing DLR from database
2008-01-21 23:59:28 [18588] [49] DEBUG: sql: DELETE FROM dlr WHERE smsc='optusfrmt' AND ts='1427159376';

Program received signal SIGSEGV, Segmentation fault.
[Switching to Thread 1577253216 (LWP 18639)]
0x000000000044ebc1 in handle_pdu (smpp=0x2a95c0e3d0, conn=0x6b4f8c0,
    pdu=0x6bcff60, pending_submits=0x5e02f0f0) at gw/smsc/smsc_smpp.c:1460
1460                    meta_data_set_values(msg->sms.meta_data, pdu->u.deliver_sm.tlv, "smpp");
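
A note on the dump above: the short_message is a delivery receipt in the usual "id:... stat:... err:..." text form, and its id value (1427159376) is the same number used as ts in the DLR lookup. Below is a naive sketch of pulling single fields out of such a receipt. The helper name receipt_field is invented here, and this is not how Kannel itself parses receipts; it simply finds the first occurrence of a key and copies up to the next space.

```c
#include <string.h>

/* Copy the value following `key` (e.g. "stat:") into out, stopping at the
 * next space or end of string. Returns 1 if the key was found, else 0.
 * Values longer than n-1 bytes are truncated. */
int receipt_field(const char *receipt, const char *key, char *out, size_t n)
{
    const char *p = strstr(receipt, key);
    size_t len;

    if (p == NULL || n == 0)
        return 0;

    p += strlen(key);           /* skip past "key:" to the value */
    len = strcspn(p, " ");      /* value runs until the next space */
    if (len >= n)
        len = n - 1;
    memcpy(out, p, len);
    out[len] = '\0';
    return 1;
}
```

Applied to the receipt text in the dump, the key "id:" gives "1427159376" and "stat:" gives "DELIVRD". Keys that contain a space themselves, such as "done date:", would need to be passed in full.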


So, it seems the segfault is triggered by TLV handling for a
certain kind of DLR? (We had DLRs coming in on our test binds, and
that didn't cause a problem). I'm a complete GDB noob, so if
there's anything else I can do to provide more information, please
let me know.

Any ideas why that meta_data_set_values function call would die with
that DLR? Any assistance would be greatly appreciated!

FYI, Kannel details are:

Kannel bearerbox version `cvs-20071018'. Build `Jan 21 2008
23:52:11', compiler `3.4.6 20060404 (Red Hat 3.4.6-9)'. System
Linux, release 2.6.9-55.0.2.ELsmp, version #1 SMP Tue Jun 26
14:14:47 EDT 2007, machine x86_64. Hostname
smsgw2.appgw.mnetcorporation.com, IP 10.110.123.31. Libxml version
2.6.16. Using checking malloc.

Thanks,

--
Giulio Harding
Systems Administrator

m.Net Corporation
Level 2, 8 Leigh Street
Adelaide SA 5000, Australia

Tel: +61 8 8210 2041
Fax: +61 8 8211 9620
Mobile: 0432 876 733
Yahoo: giulio.harding
MSN: [EMAIL PROTECTED]

http://www.mnetcorporation.com









--
Kyriacos Sakkas
Development Team
Netsmart
Tel: + 357 22 452565
Fax: + 357 22 452566
Email: [EMAIL PROTECTED]
http://www.netsmart.com.cy

Taking Business to a New Level!





--
Thanks,
Alex<smpp-tlv-drl-fix.diff>



