On 23 May 2017, at 07:10, Jonathan Coxhead via Unicode <unicode@unicode.org> wrote:
>
> On 18/05/2017 1:58 am, Alastair Houghton via Unicode wrote:
>> On 18 May 2017, at 07:18, Henri Sivonen via Unicode <unicode@unicode.org>
>> wrote:
>>
>>> the decision complicates U+FFFD generation when validating UTF-8 by state
>>> machine.
>>>
>> It *really* doesn’t. Even if you’re hell bent on using a pure state machine
>> approach, you need to add maybe two additional error states
>> (two-trailing-bytes-to-eat-then-fffd and one-trailing-byte-to-eat-then-fffd)
>> on top of the states you already have. The implementation complexity
>> argument is a *total* red herring.
>
> Heh. A state machine with N+2 states is, a fortiori, more complex than one
> with N states. So I think your argument is self-contradictory.

You’re being overly pedantic (and in this case, actually, the cyclomatic complexity of the state machine wouldn’t increase). In any case, Henri is complaining that it’s too difficult to implement; it isn’t. You need two extra states, both of which are trivial.

The point I was making was that this is not a strong argument against the proposed change, *even if* we were treating it as a requirement, which it isn’t.

Kind regards,

Alastair.

--
http://alastairs-place.net
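
For illustration only, here is a minimal Python sketch of what the two extra error states might look like. It is not code from this thread. The state names (`EAT1`, `EAT2`), the `decode` function, and the exact replacement policy are assumptions: a second byte that is a continuation byte but outside the range allowed for its lead byte sends the decoder into an "eat" state, which swallows the remaining trailing bytes and then emits a single U+FFFD. Invalid lead bytes (C0, C1, F5..FF) and stray continuation bytes are simply replaced one at a time, which is a simplification.

```python
REPLACEMENT = "\ufffd"

# Hypothetical state names. EAT1 and EAT2 are the two extra error states:
# "one trailing byte to eat, then U+FFFD" and "two trailing bytes to eat,
# then U+FFFD".
START, NEED1, NEED2, NEED3, EAT1, EAT2 = range(6)


def decode(data: bytes) -> str:
    """Decode UTF-8, replacing ill-formed input with U+FFFD (sketch)."""
    out = []
    state = START
    cp = 0                # code point being assembled
    lo, hi = 0x80, 0xBF   # allowed range for the next trailing byte
    i = 0
    while i < len(data):
        b = data[i]
        if state == START:
            if b < 0x80:
                out.append(chr(b))
            elif 0xC2 <= b <= 0xDF:
                state, cp, lo, hi = NEED1, b & 0x1F, 0x80, 0xBF
            elif 0xE0 <= b <= 0xEF:
                lo = 0xA0 if b == 0xE0 else 0x80   # reject overlong forms
                hi = 0x9F if b == 0xED else 0xBF   # reject surrogates
                state, cp = NEED2, b & 0x0F
            elif 0xF0 <= b <= 0xF4:
                lo = 0x90 if b == 0xF0 else 0x80   # reject overlong forms
                hi = 0x8F if b == 0xF4 else 0xBF   # reject > U+10FFFF
                state, cp = NEED3, b & 0x07
            else:
                # Stray continuation byte or invalid lead byte.
                out.append(REPLACEMENT)
            i += 1
        elif state in (NEED1, NEED2, NEED3):
            if lo <= b <= hi:
                cp = (cp << 6) | (b & 0x3F)
                lo, hi = 0x80, 0xBF
                if state == NEED1:
                    out.append(chr(cp))
                    state = START
                else:
                    state -= 1   # NEED3 -> NEED2 -> NEED1
                i += 1
            elif 0x80 <= b <= 0xBF:
                # A continuation byte outside the allowed range. This can
                # only happen on the byte right after the lead, because the
                # range is reset to 80..BF once a trailing byte is accepted.
                state = EAT1 if state == NEED2 else EAT2
                i += 1
            else:
                # Not a continuation byte: report the error and reprocess
                # this byte from START without consuming it.
                out.append(REPLACEMENT)
                state = START
        else:  # EAT1 or EAT2
            if 0x80 <= b <= 0xBF:
                i += 1
                if state == EAT1:
                    out.append(REPLACEMENT)
                    state = START
                else:
                    state = EAT1
            else:
                # Sequence ended early: emit U+FFFD and reprocess this byte.
                out.append(REPLACEMENT)
                state = START
    if state != START:
        out.append(REPLACEMENT)   # input ended in the middle of a sequence
    return "".join(out)
```

Under these assumptions, `decode(b"\xe0\x80\x80")` yields a single U+FFFD rather than three, and the only additions to an otherwise ordinary decoder are the `EAT1`/`EAT2` branch and the one transition into it.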