Re: Feedback on the proposal to change U+FFFD generation when decoding ill-formed UTF-8

Alastair Houghton via Unicode Tue, 23 May 2017 02:23:16 -0700

On 23 May 2017, at 07:10, Jonathan Coxhead via Unicode <unicode@unicode.org> wrote:
> 
> On 18/05/2017 1:58 am, Alastair Houghton via Unicode wrote:
>> On 18 May 2017, at 07:18, Henri Sivonen via Unicode <unicode@unicode.org> wrote:
>> 
>>> the decision complicates U+FFFD generation when validating UTF-8 by state 
>>> machine.
>>> 
>> It *really* doesn’t.  Even if you’re hell-bent on using a pure state machine 
>> approach, you need to add maybe two additional error states 
>> (two-trailing-bytes-to-eat-then-fffd and one-trailing-byte-to-eat-then-fffd) 
>> on top of the states you already have.  The implementation complexity 
>> argument is a *total* red herring.
> 
> Heh. A state machine with N+2 states is, a fortiori, more complex than one 
> with N states. So I think your argument is self-contradictory.


You’re being overly pedantic (and in this case, actually, the cyclomatic 
complexity of the state machine wouldn’t increase).  In any case, Henri is 
complaining that it’s too difficult to implement; it isn’t.  You need two extra 
states, both of which are trivial.
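
As an illustration only (this is not code from the thread), here is a minimal Python sketch of a validating decoder with the two extra states. It assumes the proposed behaviour is: when a lead byte is followed by a continuation byte that makes the sequence overlong, a surrogate, or out of range, eat the remaining trailing bytes the lead byte implies and emit a single U+FFFD. The names decode, EAT1 and EAT2 are hypothetical, and treating C0, C1 and F5..FF as immediately invalid single bytes is a further assumption.

```python
REPLACEMENT = "\ufffd"

# START and TAIL1..TAIL3 exist in any validating decoder;
# EAT1 and EAT2 are the two additional error states (hypothetical names).
START, TAIL1, TAIL2, TAIL3, EAT1, EAT2 = range(6)

# Transition after a continuation byte that is valid in context.
NEXT_GOOD = {TAIL3: TAIL2, TAIL2: TAIL1, TAIL1: START}
# Transition after a continuation byte that makes the sequence ill-formed.
# This can only happen on the first byte after an E0, ED, F0 or F4 lead.
NEXT_BAD = {TAIL3: EAT2, TAIL2: EAT1}


def decode(data: bytes) -> str:
    out = []
    state, cp = START, 0
    lo, hi = 0x80, 0xBF  # allowed range for the next continuation byte
    i = 0
    while i < len(data):
        b = data[i]
        is_cont = 0x80 <= b <= 0xBF
        if state == START:
            i += 1
            if b < 0x80:
                out.append(chr(b))
            elif 0xC2 <= b <= 0xDF:
                state, cp, lo, hi = TAIL1, b & 0x1F, 0x80, 0xBF
            elif 0xE0 <= b <= 0xEF:
                state, cp = TAIL2, b & 0x0F
                lo = 0xA0 if b == 0xE0 else 0x80  # reject overlong forms
                hi = 0x9F if b == 0xED else 0xBF  # reject surrogates
            elif 0xF0 <= b <= 0xF4:
                state, cp = TAIL3, b & 0x07
                lo = 0x90 if b == 0xF0 else 0x80  # reject overlong forms
                hi = 0x8F if b == 0xF4 else 0xBF  # reject > U+10FFFF
            else:
                # Stray continuation byte, or C0, C1, F5..FF (assumption:
                # these are replaced one byte at a time).
                out.append(REPLACEMENT)
        elif state in (EAT1, EAT2):
            if is_cont:
                i += 1
                if state == EAT2:
                    state = EAT1
                    continue
            # Either the last trailing byte was eaten, or the sequence was
            # cut short by a non-continuation byte (left for reprocessing).
            out.append(REPLACEMENT)
            state = START
        elif lo <= b <= hi:
            i += 1
            cp = (cp << 6) | (b & 0x3F)
            lo, hi = 0x80, 0xBF
            state = NEXT_GOOD[state]
            if state == START:
                out.append(chr(cp))
        elif is_cont:
            # Ill-formed but structurally a trailing byte: enter an eat state.
            i += 1
            state = NEXT_BAD[state]
        else:
            # Truncated sequence: replace it and reprocess this byte.
            out.append(REPLACEMENT)
            state = START
    if state != START:
        out.append(REPLACEMENT)  # input ended in the middle of a sequence
    return "".join(out)
```

Under those assumptions, decode(b"\xe0\x80\x80") and decode(b"\xf0\x80\x80\x80") each produce a single U+FFFD, and the only additions over an ordinary validating decoder are the EAT1 and EAT2 branches and the NEXT_BAD table.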

The point I was making was that this is not a strong argument against the 
proposed change, *even if* we were treating it as a requirement, which it isn’t.

Kind regards,

Alastair.

--
http://alastairs-place.net
