Hello list!

I have a simple question: how should we treat invalid (forbidden) byte 
sequences in mrxvt? By this I mean a sequence of bytes that is badly 
formed, not to be confused with an unsupported byte sequence (for 
instance one unsupported by your font: you have no corresponding 
"image" for the character, yet the character still exists).

- For the latter case (unsupported), there is a clear behaviour to 
follow: http://unicode.org/faq/unsup_char.html
In most cases, this just means displaying the "missing glyph" (in 
fonts, it is often some kind of square).

- But for the former (invalid), I don't know how to behave. The UTF-8 
standard forbids interpreting such sequences: 
http://www.unicode.org/versions/corrigendum1.html
From what I read, this mainly means not trying to fix the character 
(that is, not "guessing" which character was meant to be sent). I fully 
agree with this. But it says nothing about display (and I could not 
find a page discussing this point). Should I just ignore/drop this 
sequence of bytes and move on to the next valid byte for display (with 
a debug message)? Or should I display "something" to show that there 
was an unprocessable character?
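
To make the two options concrete, here is a minimal sketch in C. It is 
not mrxvt code: the names utf8_decode_one, utf8_to_codepoints, 
INVALID_SKIP and INVALID_REPLACE are hypothetical, and using U+FFFD 
(REPLACEMENT CHARACTER) as the marker is only one possible choice. It 
never tries to "guess" the intended character; it only decides whether 
an ill-formed sequence leaves a visible trace. As a simplification, a 
sequence cut off at the end of the buffer is treated as invalid, whereas 
a real terminal would probably keep those bytes until more input arrives.

```c
#include <stddef.h>
#include <stdint.h>

#define REPLACEMENT_CHAR 0xFFFDu  /* U+FFFD, assumed marker for bad input */

/* Hypothetical policy switch: drop invalid input or show a marker. */
enum invalid_policy { INVALID_SKIP, INVALID_REPLACE };

/* Decode one UTF-8 sequence from s (n bytes available).
 * Returns the number of bytes consumed (0 only when n == 0).
 * *valid is 1 and *cp holds the code point for a well-formed sequence;
 * otherwise *valid is 0 and *cp is REPLACEMENT_CHAR. */
size_t utf8_decode_one(const unsigned char *s, size_t n,
                       uint32_t *cp, int *valid)
{
    uint32_t c, min;
    size_t need, i;

    *valid = 0;
    *cp = REPLACEMENT_CHAR;
    if (n == 0)
        return 0;

    c = s[0];
    if (c < 0x80) {                       /* plain ASCII */
        *cp = c;
        *valid = 1;
        return 1;
    } else if (c >= 0xC2 && c <= 0xDF) {  /* 2-byte lead */
        need = 1; min = 0x80;    c &= 0x1F;
    } else if (c >= 0xE0 && c <= 0xEF) {  /* 3-byte lead */
        need = 2; min = 0x800;   c &= 0x0F;
    } else if (c >= 0xF0 && c <= 0xF4) {  /* 4-byte lead */
        need = 3; min = 0x10000; c &= 0x07;
    } else {
        return 1;  /* stray continuation byte, 0xC0/0xC1, or 0xF5..0xFF */
    }

    for (i = 1; i <= need; i++) {
        if (i >= n || (s[i] & 0xC0) != 0x80)
            return i;  /* truncated: skip only the bytes examined so far */
        c = (c << 6) | (s[i] & 0x3F);
    }

    /* Reject overlong forms, values above U+10FFFF, and surrogates. */
    if (c < min || c > 0x10FFFF || (c >= 0xD800 && c <= 0xDFFF))
        return need + 1;

    *cp = c;
    *valid = 1;
    return need + 1;
}

/* Convert a byte buffer to code points under the chosen policy.
 * Returns the number of code points written to out (at most out_cap);
 * *n_invalid counts the ill-formed sequences, e.g. for a debug message. */
size_t utf8_to_codepoints(const unsigned char *s, size_t n,
                          uint32_t *out, size_t out_cap,
                          enum invalid_policy policy, size_t *n_invalid)
{
    size_t pos = 0, count = 0;

    *n_invalid = 0;
    while (pos < n && count < out_cap) {
        uint32_t cp;
        int valid;

        pos += utf8_decode_one(s + pos, n - pos, &cp, &valid);
        if (valid) {
            out[count++] = cp;
        } else {
            (*n_invalid)++;
            if (policy == INVALID_REPLACE)
                out[count++] = REPLACEMENT_CHAR;
        }
    }
    return count;
}
```

With the three input bytes 'a', 0xFF, 'b', INVALID_SKIP yields the two 
code points for "ab", while INVALID_REPLACE yields 'a', U+FFFD, 'b'. In 
both cases the invalid counter is 1, so a debug message is possible 
whichever policy is chosen.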

Of course I thought of displaying the same "replacement/missing glyph" 
as for unsupported characters, but its meaning is really about 
characters that exist but cannot be displayed (mostly a character 
missing from the font), not about invalid sequences: 
http://www.unicode.org/glossary/#replacement_glyph

So what is your opinion? Just ignoring invalid sequences is the easiest 
thing to do, and I think it is the right choice. But I am not sure, 
because I think the "missing glyph" is a nice way to tell our users 
"there is a problem to fix here".
Of course, the links I give are to the UTF-8 standard, but in fact this 
behavior for invalid sequences will be used for all other encodings as 
well.

Jehan

P.S.: Of course, normally this should never happen anyway. But we don't 
write the programs that run as children of mrxvt (bash or others), so we 
cannot know, and we should be prepared for bugs in those programs if 
they don't support locale encodings as well as we soon will! :p

_______________________________________________
Materm-devel mailing list
Materm-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/materm-devel
mrxvt home page: http://materm.sourceforge.net
