Jens Hatlak:
When I go to http://en.wikipedia.org/wiki/Wikipedia#Coverage with Firefox 1.5 (built with GTK2) and copy the last paragraph (Ctrl+C), trying to paste into SciTE (Ctrl+V) has no effect (nothing is pasted) until I switch to UTF-8 (File/Encoding/8-Bit -> UTF-8). Obviously, it's because of the dashes.
The dashes are em dashes (U+2014), which have no representation in the ISO-8859-1 encoding that you are using.
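To make the representability point concrete, here is a minimal Python sketch. Python's codec machinery stands in for the platform converter here, and the helper name `unrepresentable` is hypothetical, not part of SciTE or Scintilla. It lists the characters in a string that ISO-8859-1 cannot encode.

```python
def unrepresentable(text, encoding="iso-8859-1"):
    """Return the distinct characters of text that encoding cannot represent."""
    missing = []
    for ch in text:
        try:
            ch.encode(encoding)  # strict mode: raises on failure
        except UnicodeEncodeError:
            if ch not in missing:
                missing.append(ch)
    return missing


# Hypothetical sample text containing an em dash (U+2014).
sample = "Wikipedia \u2014 the free encyclopedia"
print([f"U+{ord(ch):04X}" for ch in unrepresentable(sample)])
```

Accented Latin letters such as U+00E9 pass through, since ISO-8859-1 covers them; only characters outside its 256 code points are reported.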
If this change was not supposed to fix my issue, please tell me what to do.
The change was only to keyboard input, not to pasting text.
It's quite annoying, and switching to UTF-8 is not really a solution. If a conversion table is too complicated, dropping unrepresentable characters is better than pasting nothing at all.
Yes, it would be good to do something better, but doing so will require some work. Scintilla and SciTE rely on the iconv (or g_iconv) function to convert between character sets. When iconv can't convert all of a string, it fails.
GTK+ 2.x has another function, g_convert_with_fallback, which could be used to replace nonconvertible characters with either some other text (traditionally '?') or Unicode escapes like \u2014.
http://developer.gnome.org/doc/API/2.0/glib/glib-Character-Set-Conversion.html#g-convert-with-fallback
If anyone is interested in implementing this, GTK+ 1.x still needs to work, so the implementation will need to either work on GTK+ 1.x or use the current code there. Not all calls to iconv should be replaced, as I expect that g_convert_with_fallback will be slower and, for example, conversion to UTF-8 should always succeed.
Adding an extra conversion table rather than using the platform leads to maintenance problems and application bloat. Python currently includes about 800K of character set conversion tables so that it doesn't have to rely on the platform.
Neil
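The fallback options discussed above (a '?' substitute, Unicode escapes, or simply dropping the character) can be sketched in Python. This is an illustration only, not the Scintilla code: Python's built-in error handlers stand in for g_convert_with_fallback, and the function name `encode_for_paste` is hypothetical. The strict attempt comes first so that the slower fallback path only runs when plain conversion fails.

```python
def encode_for_paste(text, encoding="iso-8859-1", fallback="replace"):
    """Encode text, falling back to a lossy handler only when needed.

    fallback is one of Python's standard error handlers:
      "replace"          substitutes '?'
      "backslashreplace" writes an escape such as \\u2014
      "ignore"           drops the character
    """
    try:
        # Fast path: strict conversion, analogous to a plain iconv call.
        return text.encode(encoding)
    except UnicodeEncodeError:
        # Slow path: lossy conversion instead of pasting nothing at all.
        return text.encode(encoding, errors=fallback)


clip = "before\u2014after"  # hypothetical clipboard contents
for handler in ("replace", "backslashreplace", "ignore"):
    print(handler, encode_for_paste(clip, fallback=handler))
```

Text that is fully representable takes the strict path and is returned unchanged, so the fallback costs nothing in the common case.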
