On Sat, 6 Jan 2024 at 16:57, Lewis Hyatt <lhy...@gmail.com> wrote:
>
> On Sat, Jan 6, 2024 at 11:40 AM Jonathan Wakely <jwak...@redhat.com> wrote:
> >
> > Here's a V2 patch which addresses the two things I mentioned: the new
> > Python script now generates a complete file that can just be included by
> > <bits/unicode.h>, and the full Unicode 15.1.0 grapheme cluster break
> > rules are supported (I think ... more testing needed for some of the
> > complex rules).
> >
> > -- >8 --
>
> Thanks, by the way, for fixing the typo in gen_wcwidth.py.
> One thing I wanted to point out, the file contrib/unicode/README
> contains a list of steps to follow in order to update to a new Unicode
> version. There are 10 or so steps to generate everything libcpp and
> diagnostics care about. Do you think it's worth adding something for
> the new libstdc++ parts there too?

Ah, thanks for pointing that out. Yes, I should add to that.

> I guess it may not be desirable to
> update them always at the same time though.

We might not always want to do so e.g. if there's some new grapheme
cluster rule that requires code changes in libstdc++ but doesn't need
changes in the compiler. But we should at least _try_ to update them in
tandem, and see if it works.

Ideally the std::lib will use the same Unicode version as the compiler,
otherwise you might be able to refer to a new code point in literals
using its \N{FOO BAR} name, but the std::lib would not know the
properties of that new code point. That wouldn't be a disaster, but
certainly a suboptimal user experience.
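
To make the version-skew concern concrete, here is a rough analogy in
Python rather than libstdc++ code. It assumes only the standard
unicodedata module; the helper name library_knows is made up for
illustration. A code point that is newer than the library's tables
shows up as unassigned (general category "Cn"), even if some other tool
already accepts its name:

```python
import unicodedata

# Unicode version of the tables bundled with this Python build.
# A compiler or other tool may be built against a newer version.
LIBRARY_UNICODE_VERSION = unicodedata.unidata_version


def library_knows(ch: str) -> bool:
    """Return True if the bundled tables have properties for `ch`.

    Hypothetical helper for illustration only: code points the tables
    do not know about are reported with general category "Cn".
    """
    return unicodedata.category(ch) != "Cn"
```

With matching versions, library_knows() returns True for every code
point the other tool can name; with skewed versions it can return False
for a character that was nevertheless accepted in a literal.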