Excerpts from Matti Eiden's message of 2010-05-06 14:02:46 -0400:
> Hey folks,
>
> I've been experimenting with sup for the past few days, and of course,
> I love it. Firstly I had some trouble with getting unicode display
> going. This problem was already described in an old post on this
> mailing list:
>
> http://rubyforge.org/pipermail/sup-devel/2010-March/000522.html
>
> So Arch Linux defines encoding as utf8, but Iconv requires it to be
> UTF-8. I would say this is a bug in Arch Linux for not following
> standards, but anyway, I fixed it with the little modification to
> sup.rb:
>
> ## determine encoding and character set
> $encoding = Locale.current.charset
> $encoding = "UTF-8" if $encoding == "utf8"

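For reference, a slightly more general form of that check might look like
the sketch below. It is only an illustration: normalize_charset is a
hypothetical helper, not something in sup.rb, and the set of spellings it
accepts ("utf8", "UTF8", "utf-8", "utf_8") is an assumption about what
locales might report.

```ruby
# Hypothetical helper (not part of sup): map common spellings of the
# UTF-8 charset name onto the canonical "UTF-8" form.
def normalize_charset(name)
  # Match "utf8", "UTF8", "utf-8", "utf_8", ignoring case.
  return "UTF-8" if name.to_s =~ /\Autf[-_]?8\z/i
  # Anything else is passed through unchanged.
  name
end

normalize_charset("utf8")        # => "UTF-8"
normalize_charset("ISO-8859-1")  # => "ISO-8859-1"
```
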
I've applied this fix, thanks.

> Then about wide character support. And I mean really wide. Like CJK
> characters. Scandics (ä,ö,å) and other European accent characters work
> nicely, as we all who are concerned probably know. These characters
> have a byte length of 2 and unicode length of 1.
>
> However, take an example of the following two-character Korean word
> (byte length of such single character is 3 instead of 2!)
>
> http://www.kotiposti.net/eiden/soulbound/hellovim.png (looking good in vim)
> http://www.kotiposti.net/eiden/soulbound/hellosup.png (sup lost 2
> characters (or bytes) from the line that has the Korean word)
>
> It seems that for every Korean character with a byte length of 3, one
> byte is lost from the end of the line. In the above example, two bytes
> are missing in sup, as there are two Korean characters on the same
> line.
>
> If the line consists of a single Korean character, nothing appears in
> sup (last byte out of three is missing?).
> If the line consists of two Korean characters, last character is
> missing (last two bytes out of six are missing?).
> etc.
>
> Some sort of miscalculation somewhere is causing this, perhaps
> assuming that unicode characters always have a byte length of 2? Can
> anybody with Ruby skills take a look at this?

It's actually the multiple screen cells that cause problems, not
multiple bytes [1]. Sup currently thinks all characters are 1 cell wide.
The right thing is probably a C extension that uses wcswidth.

[1] http://mid.gmane.org/1264629880-sup-9232%40zyrg.net
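
To make the bytes vs. characters vs. screen cells distinction concrete,
here is a rough pure-Ruby sketch. It is not the wcswidth-based C extension
suggested above and not sup's code: display_width and WIDE_RANGES are
hypothetical names, and the list of double-width ranges is a hand-picked,
incomplete approximation that ignores combining and zero-width characters.

```ruby
# Hypothetical, incomplete list of code point ranges that terminals
# usually draw two cells wide. A real fix would defer to wcswidth(3).
WIDE_RANGES = [
  0x1100..0x115F,  # Hangul Jamo initial consonants
  0x2E80..0xA4CF,  # CJK radicals, kana, CJK ideographs, Yi (approximate)
  0xAC00..0xD7A3,  # Hangul syllables
  0xF900..0xFAFF,  # CJK compatibility ideographs
  0xFE30..0xFE4F,  # CJK compatibility forms
  0xFF00..0xFF60,  # fullwidth forms
  0xFFE0..0xFFE6   # fullwidth signs
].freeze

# Estimate how many screen cells a string occupies:
# 2 for characters in WIDE_RANGES, 1 for everything else.
def display_width(str)
  str.each_char.inject(0) do |cells, ch|
    cells + (WIDE_RANGES.any? { |r| r.cover?(ch.ord) } ? 2 : 1)
  end
end

korean = "\uC548\uB155"  # a two-character Korean word
korean.bytesize          # => 6 (bytes in UTF-8)
korean.length            # => 2 (characters)
display_width(korean)    # => 4 (screen cells)

scandic = "\u00E4"       # "ä"
scandic.bytesize         # => 2
display_width(scandic)   # => 1
```

A line-drawing routine that pads or truncates by length rather than by
cell width would come out two cells off for the Korean word here, which
matches the symptom in the screenshot.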