On 5/16/2017 10:30 AM, Shawn Steele via Unicode wrote:

>> Would you advocate replacing
>>
>>   e0 80 80
>>
>> with
>>
>>   U+FFFD U+FFFD U+FFFD     (1)
>>
>> rather than
>>
>>   U+FFFD                   (2)
>>
>> It’s pretty clear what the intent of the encoder was there, I’d say,
>> and while we certainly don’t want to decode it as a NUL (that was the
>> source of previous security bugs, as I recall), I also don’t see the
>> logic in insisting that it must be decoded to *three* code points when
>> it clearly only represented one in the input.
>
> It is not at all clear what the intent of the encoder was - or even if
> it's not just a problem with the data stream. E0 80 80 is not
> permitted, it's garbage. An encoder can't "intend" it.
>
> Either
>
> A) the "encoder" was attempting to be malicious, in which case the
>    whole thing is suspect and garbage, and so the # of FFFD's doesn't
>    matter, or
>
> B) the "encoder" is completely broken, in which case all bets are off;
>    again, specifying the # of FFFD's is irrelevant.
>
> C) The data was corrupted by some other means. Perhaps bad
>    concatenations, lost blocks during read/transmission, etc. If we
>    lost 2 512 byte blocks, then maybe we should have a thousand FFFDs
>    (but how would we know?)
>
> -Shawn

Clearly, for the receiver, nothing reliable can be deduced about the raw
byte stream once an FFFD has been inserted.

For the receiver, there's a fourth case that might have been:

D) the raw UTF-8 stream contained a valid U+FFFD
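
To make case D concrete, here is a minimal sketch in Python (not part of the original message). The helper names decode_lossy and is_well_formed are hypothetical; only the built-in bytes.decode and str.encode are used. It assumes the decoder's "replace" error handler substitutes U+FFFD for ill-formed input; how many U+FFFD it emits for E0 80 80 depends on the decoder, which is the very point under discussion, so the sketch prints the count rather than assuming it.

```python
REPLACEMENT = "\ufffd"  # U+FFFD REPLACEMENT CHARACTER


def decode_lossy(raw: bytes) -> str:
    """Decode UTF-8, substituting U+FFFD for ill-formed sequences."""
    return raw.decode("utf-8", errors="replace")


def is_well_formed(raw: bytes) -> bool:
    """Return True if raw is well-formed UTF-8 (strict decoding succeeds)."""
    try:
        raw.decode("utf-8", errors="strict")
        return True
    except UnicodeDecodeError:
        return False


# The sequence from the thread: not permitted in UTF-8.
ill_formed = bytes([0xE0, 0x80, 0x80])

# Case D: the sender really did transmit U+FFFD (bytes EF BF BD).
genuine = REPLACEMENT.encode("utf-8")

from_garbage = decode_lossy(ill_formed)
from_genuine = decode_lossy(genuine)

print("E0 80 80 well-formed?", is_well_formed(ill_formed))
print("EF BF BD well-formed?", is_well_formed(genuine))
print("U+FFFD count for E0 80 80:", len(from_garbage))

# After decoding, every character is U+FFFD in both cases, so the
# receiver cannot tell a substituted U+FFFD from a transmitted one.
print("only U+FFFD in both:", set(from_garbage) == set(from_genuine))
```

Whatever count this particular decoder reports for E0 80 80, the decoded text alone does not reveal whether a given U+FFFD stands for garbage or was present in the original stream.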