https://bugs.kde.org/show_bug.cgi?id=471483
Bug ID: 471483
Summary: Problems with C1 control codes (U+0080 through U+009F)
Classification: Applications
Product: konsole
Version: 22.12.3
Platform: Debian stable
OS: Linux
Status: REPORTED
Severity: normal
Priority: NOR
Component: emulation
Assignee: konsole-de...@kde.org
Reporter: f.heckenb...@fh-soft.de
Target Milestone: ---

Konsole recently (apparently between versions 20 and 22) added support for 8-bit C1 control codes (U+0080 through U+009F). While formally correct, in practice this seems to cause more problems than it solves:

On the one hand, I don't know of any application that actually outputs these characters. Wikipedia (https://en.wikipedia.org/wiki/C0_and_C1_control_codes) seems to agree: "the 8-bit forms of these codes are almost never used. CSI, DCS and OSC are used to control text terminals and terminal emulators, but almost always by using their 7-bit escape code representations."

On the other hand, they can actively cause problems (which contributed to their not being used much). In the past, there were issues in environments that were not 8-bit clean; these days the issues are rather with UTF-8. To quote Wikipedia again, "the UTF-8 encodings of their corresponding codepoints are two bytes long like their escape code forms (for instance, CSI at U+009B is encoded as the bytes 0xC2, 0x9B in UTF-8), so there is no advantage to using them rather than the equivalent two-byte escape sequence. When these codes appear in modern documents, web pages, e-mail messages, etc., they are usually intended to be printing characters at that position in a proprietary encoding such as Windows-1252 or Mac OS Roman that use the C1 codes to provide additional graphic characters."

... or, I'd like to add, mojibake. E.g. the German letter "ß" is U+00DF with UTF-8 encoding 0xC3 0x9F.
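The byte values quoted above can be checked with a few lines of Python. This is only an illustrative sketch using the standard library; the variable names are made up for the example and are not part of the report.

```python
# Byte-level check of the encodings mentioned in the report.

CSI = "\u009b"      # 8-bit C1 form of Control Sequence Introducer
ESC_CSI = "\x1b["   # equivalent 7-bit escape sequence (ESC [)

csi_utf8 = CSI.encode("utf-8")
esc_bytes = ESC_CSI.encode("utf-8")
print(csi_utf8.hex(" "), len(csi_utf8))     # c2 9b 2
print(esc_bytes.hex(" "), len(esc_bytes))   # 1b 5b 2

# "ß" (U+00DF) encoded as UTF-8, then misread as ISO-8859-1
sharp_s_utf8 = "\u00df".encode("utf-8")
misread = sharp_s_utf8.decode("iso-8859-1")
print(sharp_s_utf8.hex(" "))                   # c3 9f
print([f"U+{ord(c):04X}" for c in misread])    # ['U+00C3', 'U+009F']
```

The second byte, read as ISO-8859-1, is U+009F, which is the C1 code APC (Application Program Command).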
I had a long-running program (with UTF-8 output) in a Konsole window accidentally set to ISO-8859-1, and from the first occurrence of that letter, Konsole waited for the end of the supposed APC sequence, which never came, so it swallowed all further output, probably including some important messages from the program.

Sure, mojibake is not nice in general, but for languages with few non-ASCII characters, such as German, it is quite tolerable. Swallowing all output makes matters much worse.

So I'd suggest adding at least an option to disable the handling of these codes.

STEPS TO REPRODUCE
1. Set the encoding to ISO-8859-1 in a Konsole window.
2. Run the following in that window (this should be independent of shell and locale settings, though a UTF-8 locale must be installed):

   LC_ALL=C.UTF-8 /usr/bin/printf 'Gro\u00df\n'; echo Good

OBSERVED RESULT
GroÃ
(Output cut off and window "dead", or possibly revived by control characters in the shell prompt.)

EXPECTED RESULT
GroÃ?
Good
%
(Mojibake in first line, but second line correct.)
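To make the difference between the observed and the expected result concrete, here is a toy model in Python. It is not Konsole's parser; the function `render` and its `handle_c1` switch are hypothetical, the latter standing in for the option suggested in this report. It also recognises only the 8-bit string terminator, whereas a real terminal would accept the 7-bit form (ESC \) too.

```python
# Toy model only: NOT Konsole's parser. It mimics the behaviour described
# in the report: after APC (U+009F), everything is consumed until a string
# terminator arrives.

APC = "\u009f"   # C1 Application Program Command
ST = "\u009c"    # C1 String Terminator (8-bit form only, a simplification)

def render(data: bytes, handle_c1: bool) -> str:
    """Decode data as ISO-8859-1 and return what the toy terminal shows."""
    shown = []
    in_string = False              # True while waiting for ST after APC
    for ch in data.decode("iso-8859-1"):
        if in_string:
            if ch == ST:
                in_string = False
            continue               # swallowed, never displayed
        if "\u0080" <= ch <= "\u009f":
            if handle_c1 and ch == APC:
                in_string = True
            elif not handle_c1:
                shown.append("?")  # placeholder for the ignored control code
            # other C1 codes are simply dropped when handle_c1 is True
            continue
        shown.append(ch)
    return "".join(shown)

# Bytes written by the reproduction command: UTF-8 "Groß\n", then "Good\n"
output = "Gro\u00df\n".encode("utf-8") + b"Good\n"

print(repr(render(output, handle_c1=True)))    # 'GroÃ'
print(repr(render(output, handle_c1=False)))   # 'GroÃ?\nGood\n'
```

With `handle_c1=True` the model never sees a terminator, so "Good" is lost; with `handle_c1=False` only the first line is garbled.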