https://gcc.gnu.org/g:0e634b961123280c17c1c651b1d8b1567b9b523c
commit r17-966-g0e634b961123280c17c1c651b1d8b1567b9b523c
Author: Marc Poulhiès <[email protected]>
Date:   Fri Mar 27 16:29:16 2026 +0100

    ada: Fix bug when reading multibyte utf-8 character

    Every byte of a multibyte UTF-8 character has its MSB set, which is
    the sign bit for a signed value.

    On Linux (and others), the getc_immediate_common C function uses
    read() when the character is read from a terminal. It stored the byte
    in a plain "char", which can be either signed or unsigned (target
    dependent).

    On targets where char is signed, reading a multibyte UTF-8 character
    produces a negative value. For example:

    € = 0xE2 0x82 0xAC

    The first byte is 0xE2, which is -30 for a signed char. The value is
    then stored in a signed int, still as -30 (0xFFFF_FFE2), and the
    caller fails a range check because 0xFFFF_FFE2 is outside the range
    of a Character (0..255).

    Changing the variable to an unsigned char avoids the conversion to a
    negative value.

    gcc/ada/ChangeLog:

            * sysdep.c (getc_immediate_common): Read character as unsigned value.

Diff:
---
 gcc/ada/sysdep.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/gcc/ada/sysdep.c b/gcc/ada/sysdep.c
index 2185086aeb3e..8f64e7e8484d 100644
--- a/gcc/ada/sysdep.c
+++ b/gcc/ada/sysdep.c
@@ -394,7 +394,7 @@ getc_immediate_common (FILE *stream,
     || defined (__Lynx__) || defined (__FreeBSD__) || defined (__OpenBSD__) \
     || defined (__GLIBC__) || defined (__APPLE__) || defined (__DragonFly__) \
     || defined (__QNX__)
-  char c;
+  unsigned char c;
   int nread;
   int good_one = 0;
   int eof_ch = 4; /* Ctrl-D */
@@ -512,7 +512,7 @@ getc_immediate_common (FILE *stream,
   struct fd_set readFds;
   /* Timeout before select returns if nothing can be read. */
   struct timeval timeOut;
-  char c;
+  unsigned char c;
   int fd = fileno (stream);
   int nread;
   int option;
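
Illustration (not part of the commit): the sign extension described in the
commit message can be reproduced outside of sysdep.c with a small standalone
sketch. The helper names below are hypothetical and do not exist in GCC. The
sketch uses an explicit "signed char" to mimic targets where plain char is
signed, and it assumes a two's complement target with a 32-bit int, where the
out-of-range conversion wraps as it does with GCC.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical helper: widen a raw byte through a signed char, which
   mimics the old "char c;" on targets where plain char is signed.  */
int
widen_via_signed_char (unsigned char raw)
{
  /* Assumption: the conversion wraps (two's complement), so 0xE2
     becomes -30.  */
  signed char c = (signed char) raw;
  return c; /* Sign-extended when promoted to int.  */
}

/* Hypothetical helper: widen a raw byte through an unsigned char,
   which mimics the fixed "unsigned char c;".  */
int
widen_via_unsigned_char (unsigned char raw)
{
  unsigned char c = raw;
  int widened = c; /* Zero-extended when promoted to int.  */
  /* Same bounds as the Character range check quoted in the commit.  */
  assert (widened >= 0 && widened <= 255);
  return widened;
}

/* Hypothetical demo: print both widenings for the UTF-8 bytes of the
   euro sign from the commit message.  */
void
print_euro_bytes (void)
{
  static const unsigned char euro[] = { 0xE2, 0x82, 0xAC };
  size_t i;

  for (i = 0; i < sizeof euro; i++)
    printf ("0x%02X: via signed char = %d, via unsigned char = %d\n",
            euro[i],
            widen_via_signed_char (euro[i]),
            widen_via_unsigned_char (euro[i]));
}
```

Calling print_euro_bytes () from a main function shows a negative value in the
signed column for all three bytes, while the unsigned column stays within
0..255, which is the range the Ada caller checks against.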
