On 2023-07-22, Exim Bugzilla via Exim-dev <exim-dev@lists.exim.org> wrote:
> https://bugs.exim.org/show_bug.cgi?id=2998
>
> --- Comment #1 from Jeremy Harris <jgh146...@wizmail.org> ---
> The patch looks simple, but I can't pretend to understand that bit of
> RFC 2279.  It seems to be talking about UCS-2 rather than UTF-8.
> Is a better description possible?

Interestingly, that RFC seems to use UCS-2 interchangeably with UTF-16.


There was an excellent discussion of WTF-8 (like UTF-8, but permitting
encoded surrogates) somewhere on the internet (I thought Wikipedia, but
I can't find it now).


https://unicodebook.readthedocs.io/unicode_encodings.html
section 7.5. UTF-16 surrogate pairs
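
As a rough illustration of what that section describes, here is a minimal
Python sketch of how UTF-16 splits a code point above U+FFFF into a
surrogate pair. The function names are hypothetical and U+1F600 is just an
arbitrary example code point; none of this is taken from the Exim patch.

```python
from __future__ import annotations


def is_surrogate(cp: int) -> bool:
    """True if cp lies in the range reserved for UTF-16 surrogates."""
    return 0xD800 <= cp <= 0xDFFF


def to_surrogate_pair(cp: int) -> tuple[int, int]:
    """Split a supplementary-plane code point into (high, low) surrogates."""
    if not 0x10000 <= cp <= 0x10FFFF:
        raise ValueError("only code points above U+FFFF need a surrogate pair")
    v = cp - 0x10000                 # 20 bits remain after the offset
    high = 0xD800 + (v >> 10)        # top 10 bits
    low = 0xDC00 + (v & 0x3FF)       # bottom 10 bits
    return high, low


if __name__ == "__main__":
    high, low = to_surrogate_pair(0x1F600)
    print(f"U+1F600 -> {high:04X} {low:04X}")
```

The surrogate values only have meaning as a pair inside UTF-16; on their
own they are not characters, which is why well-formed UTF-8 never encodes
them.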

This bug is mainly motivated by PostgreSQL only accepting well-formed
UTF-8, so UTF-8 that encodes uFE01 is rejected and leads to
misbehaviour.
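
To show the kind of rejection I mean, here is a small Python sketch. It
uses Python's strict UTF-8 decoder as a stand-in for a well-formedness
check; that is an assumption for illustration, not PostgreSQL's actual
validation code. The helper name is hypothetical, and U+D801 is simply an
arbitrary surrogate code point picked for the demo.

```python
def is_well_formed_utf8(data: bytes) -> bool:
    """Return True if data decodes under Python's strict UTF-8 rules."""
    try:
        data.decode("utf-8")
    except UnicodeDecodeError:
        return False
    return True


# The surrogate U+D801 written out with the UTF-8 three-byte pattern.
# This is the WTF-8 style of encoding, not well-formed UTF-8.
encoded_surrogate = b"\xed\xa0\x81"

if __name__ == "__main__":
    print(is_well_formed_utf8("Слава".encode("utf-8")))   # ordinary text
    print(is_well_formed_utf8(encoded_surrogate))          # encoded surrogate
    # The lenient "surrogatepass" handler accepts the same bytes:
    print(ascii(encoded_surrogate.decode("utf-8", "surrogatepass")))
```

A strict consumer refuses the three bytes outright, while a lenient
producer would happily emit them, and that mismatch is where the trouble
starts.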


-- 
 Jasen.
 🇺🇦 Слава Україні

-- 
## subscription configuration (requires account):
##   https://lists.exim.org/mailman3/postorius/lists/exim-dev.lists.exim.org/
## unsubscribe (doesn't require an account):
##   exim-dev-unsubscr...@lists.exim.org
## Exim details at http://www.exim.org/
## Please use the Wiki with this list - http://wiki.exim.org/
