Quoting Simos Xenitellis <[EMAIL PROTECTED]>:
Amarendra Godbole wrote:
Hi,
The output of a command that prints tabular output (with a
tab separator) is susceptible to misalignment across
different languages. Mostly the headers get misaligned with the
Not just the output of commands that print with tab separators, but
also commands like cal which just put spaces between the days of the
month. In cal, the abbreviated day name headers get misaligned very easily
with locales using complex text layout scripts like Hindi, Thai, etc:
~>LANG=hi_IN.UTF-8 cal
अकटबर 2005
रव सो म ब ग श शन
                   1
 2  3  4  5  6  7  8
 9 10 11 12 13 14 15
16 17 18 19 20 21 22
23 24 25 26 27 28 29
30 31
Arabic script locales will also be very problematic.
The presence of such scripts in tab-separated output should also prove
to be problematic.
Support for even displaying such scripts is very uneven across different
terminal emulator implementations, if there is support at all.
So, even if one pursues localized solutions
for locales like Japanese, Chinese, Korean, etc., which is quite doable, there
still won't be an adequate answer for Hindi, Bengali, Tibetan, Arabic, and so
on and so on ...
Simos' suggestion to run the command in a POSIX or en_US.UTF-8 locale is
a very reasonable solution in light of the limitations of current terminals
and terminal emulators.
column data in a multi-byte language like Japanese. I have been
thinking of this issue for a while, and here are the possible
solutions to it -
1. Space the columns based on the length of the header. For example,
if the column data is "helloworld", then the output would be:
head1  header2  headline3
-----------------------------
hello  hellowo  helloworl
world  rld      d
Each column wraps. But this approach might break existing
line-by-line parsing scripts.
2. Space the columns based on the longest length of the column
data. This needs two passes: one to find the longest
column data, and another to align and print the table.
3. Space the columns based on a precomputed estimate of the difference
in length between the English string and its Japanese equivalent. For
example, if the Japanese string occupies approximately 40% more columns,
then space the columns accordingly.
4. Leave the issue as-is. :) I have found this approach taken on
HP-UX, where the output of the df command gets misaligned in the
Japanese locale.
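A rough sketch of option 2 in Python, for illustration only. The names display_width and format_table are hypothetical, and the width estimate is a simple heuristic built on the standard unicodedata module: wide and fullwidth East Asian characters count as two columns, non-spacing marks as zero, everything else as one. This is an assumption that works reasonably for Japanese, Chinese and Korean text, but it does not model how a terminal actually renders Devanagari, Thai or Arabic, so it should not be taken as a general answer.

```python
import unicodedata


def display_width(text):
    """Estimate the number of terminal columns text occupies (heuristic)."""
    width = 0
    for ch in text:
        # Non-spacing marks, enclosing marks and format characters
        # are assumed to take no column of their own.
        if unicodedata.category(ch) in ("Mn", "Me", "Cf"):
            continue
        # Wide (W) and fullwidth (F) characters are assumed to take two columns.
        width += 2 if unicodedata.east_asian_width(ch) in ("W", "F") else 1
    return width


def format_table(rows, gap=2):
    """Align rows (lists of strings, all the same length) in two passes."""
    # Pass 1: find the widest cell in each column, measured in columns,
    # not in bytes or code points.
    widths = [max(display_width(cell) for cell in col) for col in zip(*rows)]
    # Pass 2: pad every cell to its column width and join with a fixed gap.
    lines = []
    for row in rows:
        cells = [cell + " " * (w - display_width(cell))
                 for cell, w in zip(row, widths)]
        lines.append((" " * gap).join(cells).rstrip())
    return lines


# Made-up sample data: a Japanese header row above ASCII column data.
rows = [["名前", "使用率"], ["/dev/sda1", "40%"], ["tmpfs", "5%"]]
for line in format_table(rows):
    print(line)
```

Padding with str.ljust or printf-style field widths would count code points rather than columns, which is why the padding is computed by hand here. Each row still comes out on one line, so existing line-by-line parsing scripts keep working, although the column positions are no longer fixed.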
Can senior folks on this list help me with this? Is there a
better approach more suitable for internationalized software? Thanks a lot
in advance.
Would it be an option for you to default to, let's say, the POSIX or
en_US.UTF-8 locales?
Before running the mentioned commands, you can reset the
LANG/LANGUAGE variables on demand to values of your choice.
It looks like a hell of a problem to parse output that is affected by l10n.
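For a script that has to parse such output, the locale override can be applied to the child process only. The following is a minimal Python sketch; run_in_c_locale is a hypothetical helper name. It also sets LC_ALL, which is an addition to the LANG/LANGUAGE suggestion above, on the grounds that POSIX gives LC_ALL precedence over LANG and the other LC_* variables.

```python
import os
import subprocess


def run_in_c_locale(argv):
    """Run argv with the locale forced to C and return its standard output."""
    env = dict(os.environ)     # copy, so the caller's environment is untouched
    env["LC_ALL"] = "C"        # takes precedence over LC_* and LANG
    env["LANG"] = "C"
    env.pop("LANGUAGE", None)  # drop the GNU gettext language list if present
    result = subprocess.run(argv, env=env, capture_output=True,
                            text=True, check=True)
    return result.stdout
```

For example, run_in_c_locale(["cal"]) should return the calendar with unlocalized headers regardless of the user's own LANG setting, assuming cal is installed. The user's interactive session keeps its locale; only the command being parsed is affected.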
Simos
--
Linux-UTF8: i18n of Linux on all levels
Archive: http://mail.nl.linux.org/linux-utf8/