Package: www.debian.org
Followup-For: Bug #959474

Hi,

After a bit of investigation of the Perl source code (5.31.11, downloaded from upstream), I found that it handles whitespace oddly when `feature unicode_strings` is turned on. I am not a Perl person and I haven't executed the source code yet, so my interpretation might be wrong.

When `unicode_strings` is on, `in_uni_8_bit` should be true internally, and in three places (pp.c:6040, pp.c:6076 and pp.c:6114) `isSPACE_L1` is called to check whether the character being examined is whitespace, which includes checking whether it is 0x85 or 0xA0 (handy.h:1611). In the case of the character 包 (U+5305), the last byte of its 3-byte UTF-8 encoding (E5 8C 85) is 0x85, hence the problem.

-- System Information:
Debian Release: bullseye/sid
  APT prefers unstable
  APT policy: (500, 'unstable')
Architecture: amd64 (x86_64)

Kernel: Linux 5.6.0-1-amd64 (SMP w/8 CPU cores)
Locale: LANG=en_US.UTF-8, LC_CTYPE=en_US.UTF-8 (charmap=UTF-8), LANGUAGE=en_US.UTF-8 (charmap=UTF-8)
Shell: /bin/sh linked to /usr/bin/dash
Init: systemd (via /run/systemd/system)
LSM: AppArmor: enabled
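
P.S. To double-check the byte values mentioned in the analysis, here is a small Python sketch. It is not Perl's code: `LATIN1_SPACE_BYTES`, `utf8_bytes` and `bytes_seen_as_space` are hypothetical names made up for this illustration, and the set only approximates a byte-wise Latin-1 whitespace test (ASCII whitespace plus the 0x85 and 0xA0 values named above). It only shows that a check applied to individual bytes of UTF-8 data would flag the last byte of 包.

```python
# Hypothetical stand-in for a byte-wise Latin-1 whitespace test.
# This is NOT Perl's isSPACE_L1; it is an assumption for illustration:
# ASCII whitespace plus 0x85 (NEL) and 0xA0 (NBSP).
LATIN1_SPACE_BYTES = {0x09, 0x0A, 0x0B, 0x0C, 0x0D, 0x20, 0x85, 0xA0}


def utf8_bytes(ch):
    """Return the UTF-8 encoding of a string as a list of byte values."""
    return list(ch.encode("utf-8"))


def bytes_seen_as_space(ch):
    """Return the encoded bytes that the byte-wise test would call whitespace."""
    return [b for b in utf8_bytes(ch) if b in LATIN1_SPACE_BYTES]


print([hex(b) for b in utf8_bytes("包")])           # ['0xe5', '0x8c', '0x85']
print([hex(b) for b in bytes_seen_as_space("包")])  # ['0x85']
```

If the string were examined as whole code points rather than bytes, U+5305 would not match any of these values, which is consistent with the suspicion that the data is being treated as 8-bit characters at that point.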