RE: Feedback on the proposal to change U+FFFD generation when decoding ill-formed UTF-8

2017-05-26 Thread Shawn Steele via Unicode
So basically this came about because code got bugged for not following the "recommendation." To fix that, the recommendation will be changed. However, that is then going to lead to bugs for other existing code that does not follow the new recommendation. I totally get the forward/backward

Re: Feedback on the proposal to change U+FFFD generation when decoding ill-formed UTF-8

2017-05-26 Thread Karl Williamson via Unicode
On 05/26/2017 12:22 PM, Ken Whistler wrote: On 5/26/2017 10:28 AM, Karl Williamson via Unicode wrote: The link provided about the PRI doesn't lead to the comments. PRI #121 (August, 2008) pre-dated the practice of keeping all the feedback comments together with the PRI itself in a numbered

Re: Feedback on the proposal to change U+FFFD generation when decoding ill-formed UTF-8

2017-05-26 Thread Ken Whistler via Unicode
On 5/26/2017 10:28 AM, Karl Williamson via Unicode wrote: The link provided about the PRI doesn't lead to the comments. PRI #121 (August, 2008) pre-dated the practice of keeping all the feedback comments together with the PRI itself in a numbered directory with the name "feedback.html".

Re: Feedback on the proposal to change U+FFFD generation when decoding ill-formed UTF-8

2017-05-26 Thread Karl Williamson via Unicode
On 05/26/2017 04:28 AM, Martin J. Dürst wrote: It may be worth thinking about whether the Unicode standard should mention implementations like yours. But there should be no doubt about the fact that the PRI and Unicode 5.2 (and the current version of Unicode) are clear about what they

Re: Feedback on the proposal to change U+FFFD generation when decoding ill-formed UTF-8

2017-05-26 Thread Markus Scherer via Unicode
On Fri, May 26, 2017 at 3:28 AM, Martin J. Dürst wrote:
> But there's plenty in the text that makes it absolutely clear that some things cannot be included. In particular, it says
>
> > The term “maximal subpart of an ill-formed subsequence” refers to the code

Re: Feedback on the proposal to change U+FFFD generation when decoding ill-formed UTF-8

2017-05-26 Thread Philippe Verdy via Unicode
> Citing directly from the PRI:
>
> > The term "maximal subpart of the ill-formed subsequence" refers to the longest potentially valid initial subsequence or, if none, then to the next single code unit.
>
> The way I understand it is that C0 80 will have TWO maximal subparts,
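
The definition quoted above can be checked against an existing decoder. What follows is only a minimal sketch, assuming CPython 3.3 or later, whose built-in UTF-8 decoder with errors="replace" is understood to emit one U+FFFD per maximal subpart. The helper name count_replacements and the sample byte strings are illustrative and do not come from the thread.

```python
REPLACEMENT = "\ufffd"  # U+FFFD REPLACEMENT CHARACTER


def count_replacements(data: bytes) -> int:
    """Decode data as UTF-8 with errors='replace' and count the U+FFFD produced."""
    return data.decode("utf-8", errors="replace").count(REPLACEMENT)


# Hypothetical samples, chosen to exercise both halves of the definition.
samples = {
    # C0 can never begin a well-formed sequence, so there is no potentially
    # valid initial subsequence: each byte is replaced on its own.
    "C0 80": b"\xc0\x80",
    # E2 82 is a valid prefix of a three-byte sequence that is cut short by
    # the ASCII letter 'A', so the two bytes form a single maximal subpart.
    "E2 82 41": b"\xe2\x82A",
}

for label, raw in samples.items():
    print(label, "->", count_replacements(raw), "replacement character(s)")
```

Under that assumption the first sample should yield two replacement characters and the second one. A decoder that follows a different substitution practice may legitimately report other counts, which is the point under discussion in this thread.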

Re: Feedback on the proposal to change U+FFFD generation when decoding ill-formed UTF-8

2017-05-26 Thread Martin J. Dürst via Unicode
On 2017/05/25 09:22, Markus Scherer wrote: On Wed, May 24, 2017 at 3:56 PM, Karl Williamson wrote: On 05/24/2017 12:46 AM, Martin J. Dürst wrote: That's wrong. There was a public review issue with various options and with feedback, and the recommendation has been