### Description

Extracting the SIP message payload through Python3 KEMI fails with an exception if the payload contains 8-bit ISO 8859-1 encoded characters.

According to the Kamailio docs, KEMI functions return a string (which for Python is UTF-8), an integer, or null. Does this mean that 8-bit/binary payloads cannot be handled at all? That would conflict with the [SIP spec](https://www.rfc-editor.org/rfc/rfc5621.html#section-3.2), which states that SIP is 8-bit safe:

> SIP messages can carry binary message bodies such as legacy signalling objects [[RFC3204](https://www.rfc-editor.org/rfc/rfc3204)]. SIP proxy servers are 8-bit safe. That is, they are able to handle binary bodies. Therefore, there is no need to use encodings such as base64 to transport binary bodies in SIP messages. Consequently, UAs SHOULD use the binary transfer encoding [[RFC4289](https://www.rfc-editor.org/rfc/rfc4289)] for all payloads in SIP, including binary payloads. The only case where a UA MAY use a different encoding is when transferring application data between applications that only handle a different encoding (e.g., base64).

### Troubleshooting

#### Reproduction

Extract the SIP message payload, e.g. with `KSR.pv.get("$msg(body)")`, when the body contains bytes that are not valid UTF-8, such as an ISO 8859-1 encoded 'é' (byte 0xE9).

#### Debugging Data

#### Log Messages

(Proprietary specifics redacted)

```
Sep 14 22:40:09 localhost kamailio[772]: 2(65) CRITICAL: {1 2 MESSAGE Oy9m3MMLFwBKK_EQC-WS1g..} <core> [core/kemi.c:136]: sr_kemi_core_crit(): Exception during routing: Traceback (most recent call last):
Sep 14 22:40:09 localhost kamailio[772]: File "xxx.py", line xxx, in xxx
Sep 14 22:40:09 localhost kamailio[772]: body = KSR.pv.get("$msg(body)")
Sep 14 22:40:09 localhost kamailio[772]: UnicodeDecodeError: 'utf-8' codec can't decode byte 0xe9 in position 0: invalid continuation byte
```

#### SIP Traffic

The full SIP message contains proprietary information. For demonstration, the SIP message body in the log above is a single character 'é'. An otherwise identical message with body 'e' decodes successfully.
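
The difference between the two bodies can be reproduced outside Kamailio. The sketch below is plain Python and makes no claim about how the KEMI module converts values internally; it only illustrates that the single byte 0xE9 (ISO 8859-1 'é') is not valid UTF-8, while the ASCII byte for 'e' is. The helper name `try_utf8` is made up for this example.

```python
# Standalone illustration; does not use Kamailio or the KSR module.
latin1_body = "\u00e9".encode("iso-8859-1")  # 'é' as a single byte, 0xE9
ascii_body = "e".encode("iso-8859-1")        # 'e' as a single byte, 0x65


def try_utf8(raw):
    """Return the decoded text, or the UnicodeDecodeError if decoding fails."""
    try:
        return raw.decode("utf-8")
    except UnicodeDecodeError as exc:
        return exc


ascii_result = try_utf8(ascii_body)    # decodes normally
latin1_result = try_utf8(latin1_body)  # a UnicodeDecodeError instance

# Decoding with the encoding the sender actually used recovers the text.
recovered = latin1_body.decode("iso-8859-1")
```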

### Possible Solutions
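
One possible workaround, offered as an untested sketch rather than a confirmed fix: have Kamailio base64-encode the body before it crosses into Python, so that the value handed to the interpreter is pure ASCII, and then decode it back to `bytes` in the script. This assumes that `KSR.pv.get()` accepts a pseudo-variable with the `{s.encode.base64}` transformation applied to `$rb`, which has not been verified against 5.6.2. The helper name `body_bytes_from_base64` is hypothetical.

```python
import base64


def body_bytes_from_base64(b64_text):
    """Turn a base64 string (ASCII-safe) back into the raw body bytes.

    Returns b"" for None or an empty string, e.g. when there is no body.
    """
    if not b64_text:
        return b""
    return base64.b64decode(b64_text)


# Hypothetical use inside a KEMI routing function (assumption, not verified):
#   raw = body_bytes_from_base64(KSR.pv.get("$(rb{s.encode.base64})"))
#   text = raw.decode("iso-8859-1")  # only if the body is known to be Latin-1
```

Working with `bytes` in the script leaves the choice of text encoding to the application, which matches the 8-bit-safe expectation quoted in the description.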

### Additional Information

* **Kamailio Version** - output of `kamailio -v`:

```
version: kamailio 5.6.2 (x86_64/linux) 
flags: USE_TCP, USE_TLS, USE_SCTP, TLS_HOOKS, USE_RAW_SOCKS, DISABLE_NAGLE, 
USE_MCAST, DNS_IP_HACK, SHM_MMAP, PKG_MALLOC, Q_MALLOC, F_MALLOC, TLSF_MALLOC, 
DBG_SR_MEMORY, USE_FUTEX, FAST_LOCK-ADAPTIVE_WAIT, USE_DNS_CACHE, 
USE_DNS_FAILOVER, USE_NAPTR, USE_DST_BLOCKLIST, HAVE_RESOLV_RES, 
TLS_PTHREAD_MUTEX_SHARED
ADAPTIVE_WAIT_LOOPS 1024, MAX_RECV_BUFFER_SIZE 262144, MAX_URI_SIZE 1024, 
BUF_SIZE 65535, DEFAULT PKG_SIZE 8MB
poll method support: poll, epoll_lt, epoll_et, sigio_rt, select.
id: unknown 
compiled with gcc 9.4.0
```

* **Operating System**:

```
NAME="Ubuntu"
VERSION="20.04.6 LTS (Focal Fossa)"
ID=ubuntu
ID_LIKE=debian
PRETTY_NAME="Ubuntu 20.04.6 LTS"
VERSION_ID="20.04"
HOME_URL="https://www.ubuntu.com/"
SUPPORT_URL="https://help.ubuntu.com/"
BUG_REPORT_URL="https://bugs.launchpad.net/ubuntu/"
PRIVACY_POLICY_URL="https://www.ubuntu.com/legal/terms-and-policies/privacy-policy"
VERSION_CODENAME=focal
UBUNTU_CODENAME=focal

Linux 31e2ff2f5338 5.15.0-83-generic #92-Ubuntu SMP Mon Aug 14 09:30:42 UTC 2023 x86_64 x86_64 x86_64 GNU/Linux
```

-- 
Reply to this email directly or view it on GitHub:
https://github.com/kamailio/kamailio/issues/3574