Re: Feedback on the proposal to change U+FFFD generation when decoding ill-formed UTF-8

Jonathan Coxhead via Unicode Mon, 22 May 2017 23:15:30 -0700

On 18/05/2017 1:58 am, Alastair Houghton via Unicode wrote:

On 18 May 2017, at 07:18, Henri Sivonen via Unicode <unicode@unicode.org> wrote:

the decision complicates U+FFFD generation when validating UTF-8 by state 
machine.

It *really* doesn’t.  Even if you’re hell bent on using a pure state machine 
approach, you need to add maybe two additional error states 
(two-trailing-bytes-to-eat-then-fffd and one-trailing-byte-to-eat-then-fffd) on 
top of the states you already have.  The implementation complexity argument is 
a *total* red herring.
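
For concreteness, here is a rough sketch of what the "eat the trailing bytes, then emit U+FFFD" behaviour might look like. It is not a pure state machine and is not taken from the proposal or from any implementation: the function name `decode_eating` is made up, and the lead-byte length table (treating C0..DF as claiming one trailing byte, E0..EF two, F0..F7 three) is an assumption about the intended policy. It only aims to show how the output differs from one U+FFFD per maximal subpart.

```python
REPLACEMENT = "\ufffd"


def _is_continuation(byte: int) -> bool:
    """True for UTF-8 trailing bytes (10xxxxxx)."""
    return 0x80 <= byte <= 0xBF


def decode_eating(data: bytes) -> str:
    """Decode UTF-8, emitting one U+FFFD per bogus sequence.

    Hypothetical policy: the lead byte claims a length, up to that many
    trailing bytes are consumed, and the whole run becomes a single
    U+FFFD if it is not a well-formed sequence.
    """
    out = []
    i = 0
    n = len(data)
    while i < n:
        lead = data[i]
        if lead < 0x80:
            out.append(chr(lead))
            i += 1
            continue
        # Number of trailing bytes the lead byte claims (assumed table).
        if 0xC0 <= lead <= 0xDF:
            need = 1
        elif 0xE0 <= lead <= 0xEF:
            need = 2
        elif 0xF0 <= lead <= 0xF7:
            need = 3
        else:
            # Stray trailing byte or F8..FF: one U+FFFD for this byte.
            out.append(REPLACEMENT)
            i += 1
            continue
        # "Eat" at most `need` trailing bytes, stopping at the first
        # byte that is not a trailing byte.
        j = i + 1
        while j < n and j - i <= need and _is_continuation(data[j]):
            j += 1
        chunk = data[i:j]
        try:
            # Strict decoding rejects truncated, overlong, surrogate
            # and out-of-range sequences.
            out.append(chunk.decode("utf-8"))
        except UnicodeDecodeError:
            out.append(REPLACEMENT)
        i = j
    return "".join(out)


if __name__ == "__main__":
    bogus = b"\xe0\x80\x80"  # overlong three-byte form
    print(len(decode_eating(bogus)))
    # Python's built-in decoder, for comparison with the other policy.
    print(len(bogus.decode("utf-8", errors="replace")))
```

The inner `while` loop plays the role of the two "trailing-bytes-to-eat" error states: it swallows trailing bytes that a maximal-subpart decoder would report individually.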

Heh. A state machine with N+2 states is, /a fortiori/, more complex than one with N states. So I think your argument is self-contradictory.

Alastair.

～ʝ
