On Sun, Feb 18, 2018 at 11:19 AM, Martin Buchholz <marti...@google.com> wrote:

>
> - how many digits to consume after the escape?  How much do we trust
> Unicode to never ever grow beyond 5 hex digits?
>

Oops, I already got it wrong - it's already at 6 hex digits because there are 17 planes, not 16.  MAX_CODE_POINT is U+10FFFF.

Yes, we need a variable-width syntax like regex

\x{h...h}

And Java regex also supports

\N{name}    The character with Unicode character name 'name'

so we could do the same for the Java language.  Although it would be a little weird to have every Unicode update make some previously invalid source files valid.

We could also say "It's 2018 and UTF-8 has won" and simply use UTF-8 in source files directly.  No Unicode escapes needed.
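
For illustration, here is a rough sketch of how the two regex escapes mentioned above behave today, and why 5 hex digits are not enough. It assumes Java 9 or later (where \N{name} is available in java.util.regex); the class name CodePointEscapes and its helper methods are made up for this example and are not part of any proposal.

```java
import java.util.regex.Pattern;

// Hypothetical demo class; the names here are illustrative only.
class CodePointEscapes {
    // Number of Unicode planes: each plane holds 0x10000 code points.
    static int planeCount() {
        return (Character.MAX_CODE_POINT + 1) / 0x10000;
    }

    // Number of hex digits needed to write the largest code point.
    static int maxHexDigits() {
        return Integer.toHexString(Character.MAX_CODE_POINT).length();
    }

    // True if the single code point cp matches the given regex.
    static boolean matches(String regex, int cp) {
        return Pattern.matches(regex, new String(Character.toChars(cp)));
    }

    public static void main(String[] args) {
        System.out.println("planes: " + planeCount());
        System.out.println("hex digits: " + maxHexDigits());
        // Variable-width hex escape, works for supplementary code points.
        System.out.println(matches("\\x{1F600}", 0x1F600));
        // Named escape (assumes Java 9+).
        System.out.println(matches("\\N{GRINNING FACE}", 0x1F600));
    }
}
```

Note that the doubled backslashes are only there because the patterns are written as Java string literals; the regex engine itself sees \x{1F600} and \N{GRINNING FACE}.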