Markus Scherer wrote:
> Dominikus Scherkl wrote:
> > My other suggestion (and the main reason to call the proposed
> > character "source failure indicator symbol" (SFIS)) was intended
> > especially for malformed UTF-8 input that has overlong encodings.
> This is a special, custom form of error handling - why assign
> a character for it?

Converting from and to UTF-8 is an everyday task and very important for all applications that handle Unicode. So it is a special case, but a very common one. Therefore it would be nice to have a standardized, application-independent form of error handling for it. The mechanism would also be useful for many other charsets being converted to Unicode.

> You could just use an existing character or non-character for
> this, e.g., U+303E or U+FFFF or U+FDEF or similar.

This is what I do in the meantime. But it is uncomfortable, because most editors display all non-characters, unassigned characters, and characters missing from the font in the same way, which hides the INDICATION. The SFIS should have a distinct display, so the reader can see that only THIS is an SFIS, unlike all the other empty squares in the text.

Additionally, I think we should have a standardized way to display old UTF-8 text without losing information (overlong UTF-8 was allowed for years) - glyphing is not a good solution, and simply decoding the overlong forms is not allowed. This is a self-made problem, so Unicode should provide an inherent way to solve it.

Best regards
--
Dominikus Scherkl
[EMAIL PROTECTED]
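For illustration only, the kind of marking discussed in this message might be sketched as below. This is a minimal sketch under stated assumptions, not a standardized or recommended decoder: the names `MARKER`, `MIN_FOR_LENGTH`, `sequence_length`, and `decode_marking_overlong` are hypothetical, and U+FFFD (the existing REPLACEMENT CHARACTER) merely stands in for the proposed SFIS, which has no assigned code point. Each overlong sequence is replaced by one marker instead of being decoded.

```python
# Hypothetical stand-in for the proposed SFIS (assumption: no such
# character is assigned, so the existing U+FFFD is used here).
MARKER = "\uFFFD"

# Smallest code point that legitimately needs a sequence of this length.
# A decoded value below this minimum means the sequence is overlong.
MIN_FOR_LENGTH = {1: 0x00, 2: 0x80, 3: 0x800, 4: 0x10000}


def sequence_length(lead: int) -> int:
    """Return the sequence length implied by a lead byte, or 0 if invalid."""
    if lead < 0x80:
        return 1
    if 0xC0 <= lead < 0xE0:
        return 2
    if 0xE0 <= lead < 0xF0:
        return 3
    if 0xF0 <= lead < 0xF8:
        return 4
    return 0  # stray continuation byte or 0xF8..0xFF


def decode_marking_overlong(data: bytes) -> str:
    """Decode UTF-8, emitting MARKER for overlong or otherwise bad input."""
    out = []
    i = 0
    while i < len(data):
        n = sequence_length(data[i])
        tail = data[i + 1:i + n]
        # Invalid lead byte, truncated sequence, or bad continuation byte:
        # mark this single byte and resynchronize at the next one.
        if n == 0 or len(tail) != n - 1 or any((b & 0xC0) != 0x80 for b in tail):
            out.append(MARKER)
            i += 1
            continue
        if n == 1:
            cp = data[i]
        else:
            cp = data[i] & (0xFF >> (n + 1))  # payload bits of the lead byte
            for b in tail:
                cp = (cp << 6) | (b & 0x3F)
        overlong = cp < MIN_FOR_LENGTH[n]
        surrogate = 0xD800 <= cp <= 0xDFFF
        if overlong or surrogate or cp > 0x10FFFF:
            out.append(MARKER)  # one marker for the whole sequence
        else:
            out.append(chr(cp))
        i += n
    return "".join(out)


# 0xC0 0xAF is an overlong two-byte encoding of "/" (U+002F).
print(ascii(decode_marking_overlong(b"/\xc0\xaf/")))  # prints '/\ufffd/'
```

A single marker per overlong sequence keeps the position of the fault visible, but the original bytes are not recoverable from the output, which is the information loss the message complains about.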

