Hi Tom,

> Anyway, interpreting the input as a Unicode code point, for values
> above U+7F (or, if you stretch it unreasonably, U+FF) is very clearly
> outside the spec.

I'm not sure it is.  An unwise design choice by 4.4BSD, yes.  But
U+0081 passed as 0x81 does satisfy ‘is a character representable as an
unsigned char’: it's a character, U+0081, and unsigned char holds
[0, 0x100), so it suffers no loss of representation as an unsigned
char.  Though following that argument, every implementation should be
doing it.  :-)

--
Cheers, Ralph.