Hermann Peifer <[EMAIL PROTECTED]> wrote:
> printf \uHHHH is expected to print Unicode chars. This works fine in
> most cases, but some legal code points are reported as errors: values
> in the ASCII range and C1 control chars, and values between
> U+D800..U+DFFF
>
> I would say that this behaviour is rather a bug than a feature.

Thanks for the report. This is not an arbitrary restriction, though;
it is conformance to the standard (C99, ISO/IEC 10646) syntax for
"universal character names":

  http://www.open-std.org/jtc1/sc22/wg14/www/docs/n717.htm

Here's part of printf.c, with a comment that probably came from
a version of N717:

      /* A universal character name shall not specify a character short
         identifier in the range 00000000 through 00000020, 0000007F through
         0000009F, or 0000D800 through 0000DFFF inclusive. A universal
         character name shall not designate a character in the required
         character set.  */
      if ((uni_value <= 0x9f
           && uni_value != 0x24 && uni_value != 0x40 && uni_value != 0x60)
          || (uni_value >= 0xd800 && uni_value <= 0xdfff))
        error (EXIT_FAILURE, 0, _("invalid universal character name \\%c%0*x"),
               esc_char, (esc_char == 'u' ? 4 : 8), uni_value);

> /usr/bin/printf: invalid universal character name \u0000
> /usr/bin/printf: invalid universal character name \u0001
...

I can understand that you'd find the restriction surprising,
but I wouldn't call it a bug.

_______________________________________________
Bug-coreutils mailing list
[email protected]
http://lists.gnu.org/mailman/listinfo/bug-coreutils
