Quoth Tzafrir Cohen on Mon, Sep 01, 2003:
> A small test (I hope you won't mind the Hebrew):

[snip -- can't do Hebrew ATM]

> It should have given the same output. Indeed the range between the Yud
> and the Tav worked, so the regex worked on multibyte Hebrew chars.

No it didn't. It replaced vav, which is not between yud and tav. I tried
replacing the range yud-to-lamed, and it happily gave me the same output
(i.e., it replaced shin as well). Something is wrong here; and if you
think for a second about how sed works and how UTF-8 is encoded, you
will immediately see what it is. Try to do "| sed s/....../foo/" and see
what happens -- you will get "fooM", where M is mem sofit.

> But still one character was messed-up after the regex.

That too.

> And I had a hell of a time editing this: I practically couldn't insert
> text, because bash calculated internally Hebrew chars as taking two
> places (assumed here char==byte).

I used mlterm to test it, and my zsh had problems as well.
(mlterm 2.7.0, zsh 4.0.6, FreeBSD 4.8-STABLE)

> But this is RedHat 7.3, and the version of bash doesn't support UTF-8
> well enough. In RH9 it seems much better.

That's exactly what I'm talking about. This thing supports the encoding,
that thing doesn't, and what you have *in the end* is a system which, in
some rare situations, can take Unicode text and deal with it, but mostly
can't. The assumption of single-byte characters shines through, and if
you're not careful it bites you.

> > Good to know, thanks. Will mutt re-code text from anything to
> > Unicode?
>
> Yes. (Thus it is generally more "sensitive" than most GUI clients to bad
> encoding, as overriding bad encoding tends to be a less than trivial
> operation)

You lost me here. What do you mean by overriding bad encoding, and what
do other apps do?

Vadik.

-- 
Prof: So the American government went to IBM to come up with a data
encryption standard and they came up with ...
Student: EBCDIC!
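
The sed experiment described in the message can be reproduced without a Hebrew-capable terminal. What follows is only a rough sketch under stated assumptions: the original test string was snipped, so the word used here (shin, lamed, vav, mem sofit) is a guess that happens to be consistent with the outputs reported above; Python's `re` module in bytes mode stands in for a sed that treats every byte as a character; and all variable names are made up for illustration. Hebrew letters are written as `\u` escapes so the snippet stays ASCII-only.

```python
import re

# Assumed test word (not from the original mail): shin, lamed, vav, mem sofit.
word = "\u05e9\u05dc\u05d5\u05dd"
raw = word.encode("utf-8")  # 4 letters, 2 bytes each: 8 bytes in total

# Character-aware matching: six dots need six characters, the word has
# only four, so nothing is replaced.
chars_result = re.sub(r"......", "foo", word)

# Byte-oriented matching: six dots consume six bytes, i.e. three letters,
# and only the last letter (mem sofit) survives.
bytes_result = re.sub(rb"......", b"foo", raw).decode("utf-8")

# The range yud-tav, first as a range of characters ...
yud, tav = "\u05d9", "\u05ea"
char_range = re.sub("[" + yud + "-" + tav + "]", "X", word)

# ... then as the same bracket expression read byte by byte. It becomes
# the set {0xD7, 0x99..0xD7, 0xAA}, which has little to do with the
# intended range of letters.
byte_pattern = b"[" + yud.encode("utf-8") + b"-" + tav.encode("utf-8") + b"]"
byte_range = re.sub(byte_pattern, b"X", raw)

print(ascii(chars_result))  # word unchanged
print(ascii(bytes_result))  # 'foo' followed by mem sofit
print(ascii(char_range))    # vav is left alone
print(byte_range)           # vav's lead byte replaced, 0x95 left orphaned
```

In the byte-oriented range case the lead byte 0xD7 of every letter matches, including that of vav, so vav is half-replaced and its second byte is left behind as invalid UTF-8. That is one plausible explanation for both the wrongly replaced vav and the single messed-up character, though a real sed may differ in detail depending on its version and locale.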