Jens Hatlak:

When I go to http://en.wikipedia.org/wiki/Wikipedia#Coverage with
Firefox 1.5 (built with GTK2) and copy the last paragraph (Ctrl+C),
trying to paste into SciTE (Ctrl+V) has no effect (nothing is pasted)
until I switch to UTF-8 (File/Encoding/8-Bit -> UTF-8). Obviously, it's
because of the dashes.

  The dashes are Em Dashes (U+2014), which have no representation in
the ISO-8859-1 encoding that you are using.

If this change was not supposed to fix my issue, please tell me what to
do.

  The change was only to keyboard input, not to pasting text.

It's quite annoying, and switching to UTF-8 is not really a
solution. If a conversion table is too complicated, dropping
unrepresentable characters is better than pasting nothing at all.

  Yes, it would be good to do something better but doing so will
require some work. Scintilla and SciTE rely on the iconv (or g_iconv)
function to convert between character sets. When iconv can't convert
all of a string it fails. GTK+ 2.x has another function,
g_convert_with_fallback, which could be used to replace
nonconvertible characters with either some other text (traditionally
'?') or Unicode escapes like \u2014.

http://developer.gnome.org/doc/API/2.0/glib/glib-Character-Set-Conversion.html#g-convert-with-fallback
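
  The difference between these behaviours can be sketched in a few
lines. This is only an illustration using Python's built-in codecs,
not Scintilla or GLib code: the function names are made up for the
example, and Python's error handlers are assumed to be a reasonable
stand-in for what iconv and g_convert_with_fallback would do.

```python
# Illustration only: Python's codecs stand in for iconv / g_convert_with_fallback.
# All function names here are hypothetical and exist only for this sketch.

TEXT = "coverage \u2014 and depth"  # contains an Em Dash (U+2014)


def paste_strict(text):
    """Like plain iconv: if any character can't be converted, nothing is pasted."""
    try:
        return text.encode("iso-8859-1")
    except UnicodeEncodeError:
        return None


def paste_with_question_marks(text):
    """Replace each nonconvertible character with '?'."""
    return text.encode("iso-8859-1", errors="replace")


def paste_with_escapes(text):
    """Replace each nonconvertible character with a Unicode escape like \\u2014."""
    return text.encode("iso-8859-1", errors="backslashreplace")


def paste_dropping(text):
    """Silently drop nonconvertible characters."""
    return text.encode("iso-8859-1", errors="ignore")


if __name__ == "__main__":
    print(paste_strict(TEXT))
    print(paste_with_question_marks(TEXT))
    print(paste_with_escapes(TEXT))
    print(paste_dropping(TEXT))
```

The strict variant corresponds to the current paste behaviour, where
one unrepresentable character causes the whole paste to fail.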

  If anyone is interested in implementing this, GTK+ 1.x still needs
to work, so the implementation will need to either work on GTK+ 1.x or
use the current code there. Not all calls to iconv should be replaced,
as I expect that g_convert_with_fallback will be slower and, for
example, conversion to UTF-8 should always succeed.
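
  One way to keep the cheaper conversion wherever it cannot fail is to
try it first and only use the fallback when it does fail. The sketch
below shows that shape in Python rather than with the real iconv and
GLib calls; the function name is hypothetical and '?' is assumed as
the replacement text.

```python
import codecs


def to_document_encoding(text, encoding):
    """Convert text, using a lossy fallback only when strict conversion fails.

    Hypothetical helper for illustration; it does not reflect SciTE's code.
    """
    # Conversion to UTF-8 succeeds for any well-formed text,
    # so no fallback path is needed in that case.
    if codecs.lookup(encoding).name == "utf-8":
        return text.encode("utf-8")
    try:
        # Fast path: strict conversion, equivalent in spirit to plain iconv.
        return text.encode(encoding)
    except UnicodeEncodeError:
        # Slow path: replace nonconvertible characters with '?'.
        return text.encode(encoding, errors="replace")


if __name__ == "__main__":
    print(to_document_encoding("a \u2014 b", "UTF-8"))
    print(to_document_encoding("a \u2014 b", "iso-8859-1"))
```

With this arrangement, text that converts cleanly never pays for the
fallback, which matters if the fallback is noticeably slower.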

  Adding an extra conversion table rather than relying on the
platform's conversion functions leads to maintenance problems and
application bloat. Python currently includes about 800K of character
set conversion tables so that it doesn't have to rely on the platform.

  Neil
_______________________________________________
Scite-interest mailing list
[email protected]
http://mailman.lyra.org/mailman/listinfo/scite-interest
