On 2023-07-22, Exim Bugzilla via Exim-dev <exim-dev@lists.exim.org> wrote:
> https://bugs.exim.org/show_bug.cgi?id=2998
>
> --- Comment #1 from Jeremy Harris <jgh146...@wizmail.org> ---
> The patch looks simple, but I can't pretend to understand that bit of
> RFC 2279. It seems to be talking about UCS-2 rather than UTF-8.
> Is a better description possible?
Interestingly, that RFC seems to use UCS-2 interchangeably with UTF-16.

There was an excellent discussion of WTF-8 (like UTF-8 but with surrogates)
somewhere on the internet (I thought Wikipedia, but I can't find it now).

https://unicodebook.readthedocs.io/unicode_encodings.html
section 7.5. UTF-16 surrogate pairs

This bug is mainly motivated by PostgreSQL only accepting well-formed UTF-8,
so UTF-8 that encodes uFE01 is rejected and leads to misbehaviour.

-- 
Jasen.
🇺🇦 Слава Україні
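To make the well-formedness point concrete, here is a rough Python sketch (not the patch from the bug, and not Exim code). It shows a surrogate code point pushed through the 3-byte UTF-8 pattern and how a strict decoder refuses it. The helper names `is_well_formed_utf8` and `has_encoded_surrogate` are made up for illustration, and Python's strict decoder merely stands in for whatever validation the database does.

```python
def is_well_formed_utf8(data: bytes) -> bool:
    """Return True if a strict UTF-8 decoder accepts the bytes."""
    try:
        data.decode("utf-8")  # strict mode rejects encoded surrogates
    except UnicodeDecodeError:
        return False
    return True


def has_encoded_surrogate(data: bytes) -> bool:
    """Scan for U+D800..U+DFFF written with the 3-byte UTF-8 pattern.

    Such a sequence is always ED, then A0..BF, then 80..BF. In
    well-formed UTF-8 the byte ED may only be followed by 80..9F.
    """
    for i in range(len(data) - 2):
        if (data[i] == 0xED
                and 0xA0 <= data[i + 1] <= 0xBF
                and 0x80 <= data[i + 2] <= 0xBF):
            return True
    return False


# A lone high surrogate (U+D83D) forced into UTF-8-style bytes.
# "surrogatepass" is the standard Python error handler that permits this.
wtf8_sample = "\ud83d".encode("utf-8", "surrogatepass")

print(wtf8_sample.hex(" "))                # the three offending bytes
print(is_well_formed_utf8(wtf8_sample))    # strict decoder's verdict
print(has_encoded_surrogate(wtf8_sample))  # byte-level scan's verdict
```

The byte scan is only meaningful on input that is otherwise valid UTF-8. It is a quick check for this one class of problem, not a full validator.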