On Mon, Sep 01, 2003 at 11:11:16AM -0400, Vadim Vygonets wrote:
> Quoth Tzafrir Cohen on Sun, Aug 31, 2003:
> > Use vim 6. Use dtterm or uxterm. Or build mlterm on your own. With
> > dtterm you have to use a UTF-8 locale (probably en_US-UTF-8). This is
> > something that should work on a standard solaris 8/9 desktop.
> 
> Yes, I'm aware that all this exists.  Still, does sed regexp /./
> match the (two-byte) Hebrew character Aleph in UTF-8?  Until that
> happens, I will not call the support of Unicode in UNIX "native".
> (I don't insist on UTF-8, any encoding of Unicode is fine with
> me.)

A small test (I hope you won't mind the Hebrew):

  $ echo 'שלום' | sed -e 's/של/צדף/'
  צדףום

As you can see, sed treated Hebrew UTF-8 chars just like any other
chars.

Well, almost:

  $ echo 'שלום' |sed -e 's/[י-ת]*/צדף/'
  צדף�ם

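It should have given the same output: Vav sorts before Yud, so the match
should stop after של. The range between Yud and Tav did match something,
so the regex appears to have worked on multibyte Hebrew chars. But one
character after the match came out garbled, which suggests that sed split
a two-byte character somewhere.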
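
A rough way to see what may have happened, assuming this sed matches byte
by byte rather than character by character: in UTF-8 the bracket
expression [י-ת] is the bytes d7 99 2d d7 aa, which a byte-oriented
matcher would read as 0xd7, the range 0x99-0xd7, and 0xaa. The sketch
below dumps those bytes and uses LC_ALL=C to force byte-wise matching on
a current sed. The hex helper is a made-up name, and LC_ALL=C is only a
stand-in for the old sed's behaviour, not a claim about its internals.

```sh
#!/bin/sh
# Hypothetical helper: print stdin as space-separated hex bytes.
hex() { od -An -tx1 | xargs; }

# The bracket expression contents as raw UTF-8 bytes.
# A byte-oriented matcher sees: 0xd7, the range 0x99-0xd7, and 0xaa.
range_bytes=$(printf 'י-ת' | hex)

# The input word as raw UTF-8 bytes, two bytes per Hebrew letter.
input_bytes=$(printf 'שלום' | hex)

# Assumption: LC_ALL=C makes sed treat every byte as one character,
# which imitates a sed with no multibyte support.
bytewise=$(printf 'שלום\n' | LC_ALL=C sed -e 's/[י-ת]*/צדף/' | hex)

echo "range:  $range_bytes"
echo "input:  $input_bytes"
echo "output: $bytewise"
```

Worked through by hand, a byte-wise match consumes d7 a9 d7 9c d7 and
stops at 0x95, the second byte of Vav. That leaves 95 d7 9d: a stray
continuation byte followed by a valid Final Mem, which would account for
the garbled character in the output above.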

And I had a hell of a time editing this: I could hardly insert text,
because bash internally counted each Hebrew char as taking two columns
(it assumes char == byte here).
But this is Red Hat 7.3, whose version of bash doesn't support UTF-8
well enough. In RH9 it seems much better.
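
To illustrate the char == byte mismatch (not bash's actual line-editing
code, just the arithmetic behind it), here is a small sketch that counts
the bytes and the characters of the same word. It counts characters by
dropping UTF-8 continuation bytes (0x80-0xbf), so it doesn't depend on
which locales are installed; the variable names are made up.

```sh
#!/bin/sh
s='שלום'

# Total bytes in the UTF-8 encoding of the word.
bytes=$(printf '%s' "$s" | wc -c | tr -d ' ')

# Characters: delete continuation bytes (octal 200-277), count what is left.
chars=$(printf '%s' "$s" | LC_ALL=C tr -d '\200-\277' | wc -c | tr -d ' ')

echo "bytes=$bytes chars=$chars"
```

A program that advances the cursor by the byte count would move twice as
far as the four characters actually drawn on screen.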

  $ rpm -q bash glibc sed
  bash-2.05a-13
  glibc-2.2.5-43
  sed-3.02-11
  $ locale |grep LC_CTYPE
  LC_CTYPE="he_IL.UTF-8"


> 
> > ncurses has a version that supports multi-byte chars: ncursesw. Mutt
> > (and screen) can be built with it. This greatly improves the UTF-8
> > capabilities. This is what I use.
> 
> Good to know, thanks.  Will mutt re-code text from anything to
> Unicode?

Yes. (Thus it is generally more "sensitive" to bad encoding than most
GUI clients, as overriding a bad encoding tends to be a less than
trivial operation.)

-- 
Tzafrir Cohen                       +---------------------------+
http://www.technion.ac.il/~tzafrir/ |vim is a mutt's best friend|
mailto:[EMAIL PROTECTED]       +---------------------------+
