Markus Kuhn writes:

> I'm afraid that cat will for the foreseeable future not be one of the
> tools suitable for reading Hebrew text unless it has been stored
> visually or we implement the ECMA/ISO implicit mode in xterm.
> 
> Bidicat essentially exists already in some forms:
> 
> http://czyborra.com/arabjoin/
> 
>      Arabjoin is Roman Czyborra's little Perl tool that takes Arabic UTF-8
>      text (encoded in the U+06xx Arabic block in logical order) as input,
>      performs Arabic glyph joining, and outputs a UTF-8 octet stream that
>      is arranged in visual order. This gives readable results when formatted
>      with a simple Unicode renderer like xterm or yudit that does not
>      handle Arabic differently but simply outputs all glyphs in
>      left-to-right order. 

Don't go that way; you would be reinventing the entire mess with the
three ISO-8859-8 variants (implicit, explicit, visual encoding).

In Unicode and UTF-8, unlike ISO-8859-8, the ordering is always
logical ("implicit" is the old term), not visual. Tools like arabjoin
are hacks outside of the standards. The right place for what they do
(reordering and glyph joining) is inside the display engine (here:
xterm); otherwise applications and xterm must communicate using
visually ordered text, which is malformed anti-Unicode.
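
To illustrate why visually ordered text is a problem once it leaves the
display engine, here is a small Python sketch. It is only an assumption
of how one might demonstrate the point: to_visual_naive is a
hypothetical helper that simply reverses a purely right-to-left string,
and it is not the Unicode bidirectional algorithm and not what arabjoin
does.

```python
import unicodedata

# The Hebrew word "shalom" in logical order, i.e. the order in which it
# is typed and read: SHIN, LAMED, VAV, FINAL MEM.
logical = "\u05e9\u05dc\u05d5\u05dd"


def to_visual_naive(text):
    """Hypothetical toy helper: reverse a purely right-to-left run so
    that a strictly left-to-right renderer would show it readably.
    This is NOT the Unicode bidirectional algorithm."""
    return text[::-1]


visual = to_visual_naive(logical)

# Every character has bidi class "R" in both strings, so nothing in the
# character stream itself tells a reader which ordering was used.
classes_logical = [unicodedata.bidirectional(c) for c in logical]
classes_visual = [unicodedata.bidirectional(c) for c in visual]

# Searching for the first two letters as typed (SHIN, LAMED) works on
# the logical string but fails on the visually ordered one.
needle = "\u05e9\u05dc"
found_in_logical = needle in logical
found_in_visual = needle in visual
```

The sketch suggests that a visually ordered stream looks like ordinary
Unicode text but no longer behaves like it for searching, sorting or
re-wrapping, which is why the reordering belongs in the renderer.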

Bruno
-
Linux-UTF8:   i18n of Linux on all levels
Archive:      http://mail.nl.linux.org/lists/
