I thought I might ping this short patch please, just in case it may
make sense to include in GCC 10 along with the other UTF-8-related
fixes to diagnostics. Thanks!



On Thu, Dec 12, 2019 at 6:21 PM Lewis Hyatt <lhy...@gmail.com> wrote:
> Hello-
> In the original discussion of implementing UTF-8 identifiers
> ( https://gcc.gnu.org/bugzilla/show_bug.cgi?id=67224#c26 ), I pointed out
> that colorization would corrupt the appearance of certain diagnostics. For
> example, this code, with -std=c99:
> ----------
> int ٩x;
> ----------
> Produces:
> t2.cpp:1:5: error: extended character ٩ is not valid at the start of an 
> identifier
>     1 | int ٩x;
>       |     ^
> The diagnostic location contains only the first byte of the character, so
> when colorization is enabled, the ANSI escapes are inserted in the middle
> of the UTF-8 sequence and produce corrupted output on the terminal.
> I feel like there are two separate issues here:
> #1. diagnostic_show_locus() should be sure it will not corrupt output in
> this way, regardless of what ranges it is given to work with.
> #2. libcpp should probably generate a range that includes the whole UTF-8
> character. Actually in other ways the range seems not ideal, for example
> if an invalid character appears in the middle of the identifier, the
> diagnostic still points to the first byte of the identifier.
> The attached patch fixes #1. It's essentially a one-line change, plus a
> new selftest. Would you please have a look at it sometime? bootstrap
> and testsuite were done on linux x86-64.
> Other questions that I have:
> - I am not quite clear when a selftest is preferred vs a dejagnu test. In
>   this case I stuck with the selftest because color diagnostics don't seem
>   to work well with dg-error etc, and it didn't seem worth creating a new
>   plugin-based test like g++.dg/plugin just for this. (I also considered
>   using the existing g++.dg plugin, but it seems this test should run for
>   gcc as well.)
> - I wasn't sure if I should create a PR for an issue such as this, if
>   there is already a patch readily available. And if I did create a PR,
>   not sure if it's preferred to post the patch to gcc-patches, or as an
>   attachment to the PR.
> - Does it seem worth me looking into #2? I think the patch to address #1 is
>   appropriate in any case, because it handles generically all potential
>   cases where this may arise, but still perhaps the ranges coming out of
>   libcpp could be improved?
> Thanks...
> -Lewis

Reply via email to