Amarendra Godbole wrote:
Hi,
The output of a command that prints a table (with a
tab separator) is susceptible to misalignment across
different languages. Mostly the headers get misaligned with the
column data in a multi-byte language like Japanese. I have been
thinking about this issue for a while, and here are the possible
solutions to it:
1. Space the columns based on the length of the header. For example,
if the column data is "helloworld", then the output would be:
head1 header2 headline3
-----------------------------
hello hellowo helloworl
world rld     d
Each column wraps. But this approach might break existing
line-by-line parsing scripts.
2. Space the columns based on the longest value in each column.
This needs two passes: one to find the longest value in each
column, and another to align and print the table.
3. Space the columns based on a pre-computed estimate of how the
string lengths change between the English and Japanese
equivalents. For example, if the Japanese strings occupy
approximately 40% more columns, then space the columns accordingly.
4. Leave the issue as-is. :) I have found this approach taken on
HP-UX, where the output of the df command gets misaligned in the
Japanese locale.
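Option 2 above could be sketched roughly as follows. This is only an illustration, not code from any existing tool: the function names (display_width, pad, format_table) are made up for the example, and the width estimate assumes that characters marked Wide or Fullwidth by Unicode take two terminal columns and everything else except combining marks takes one, which may not match every terminal.

```python
import unicodedata


def display_width(text):
    """Approximate number of terminal columns that text occupies."""
    width = 0
    for ch in text:
        if unicodedata.combining(ch):
            continue  # assume combining marks take no column of their own
        # Assumption: East Asian Wide/Fullwidth characters take two columns.
        width += 2 if unicodedata.east_asian_width(ch) in ("W", "F") else 1
    return width


def pad(text, width):
    """Right-pad text with spaces up to the given display width."""
    return text + " " * (width - display_width(text))


def format_table(headers, rows, gap=2):
    """Return the table as a list of lines, aligned by display width."""
    # Pass 1: find the widest cell (header included) in each column.
    widths = [display_width(h) for h in headers]
    for row in rows:
        for i, cell in enumerate(row):
            widths[i] = max(widths[i], display_width(cell))

    # Pass 2: pad every cell to its column width and join the cells.
    sep = " " * gap
    lines = []
    for row in [headers, *rows]:
        line = sep.join(pad(cell, w) for cell, w in zip(row, widths))
        lines.append(line.rstrip())
    return lines


if __name__ == "__main__":
    for line in format_table(["名前", "size"], [["ファイル", "10"], ["a", "2000"]]):
        print(line)
```

Since every record still sits on one line, line-by-line parsing scripts keep working, although the separator becomes a run of spaces instead of a single tab.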
Can senior folks on this list help me with this? Is there a
better approach, more suitable for i18nized software? Thanks a lot
in advance.
Would it be an option for you to default to, let's say, the POSIX or
en_US.UTF-8 locales?
Before running the mentioned commands, you can reset the
LANG/LANGUAGE variables on demand to values of your choice.
It looks like a hell of a problem to parse output that is affected by l10n.
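As a rough sketch of that idea, a script that has to parse a command's output could run the command with the locale forced to C (the POSIX locale) for that one call. The helper name run_in_c_locale is hypothetical. Setting LC_ALL goes slightly beyond the variables named above; it is included on the assumption that it takes precedence over LANG on the target system.

```python
import os
import subprocess
import sys


def run_in_c_locale(cmd):
    """Run cmd (a list of arguments) with the C locale and return its stdout."""
    env = dict(os.environ)      # copy, so the caller's environment is untouched
    env["LC_ALL"] = "C"         # assumed to override LANG and other LC_* settings
    env["LANG"] = "C"
    env.pop("LANGUAGE", None)   # drop any message-language preference list
    result = subprocess.run(
        cmd, env=env, capture_output=True, text=True, check=True
    )
    return result.stdout


if __name__ == "__main__":
    # Show the locale setting that the child process actually sees.
    child = [sys.executable, "-c", "import os; print(os.environ['LC_ALL'])"]
    print(run_in_c_locale(child))
```

Something like run_in_c_locale(["df"]) would then give the untranslated output to parse, while interactive users still see the localized one.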
Simos
--
Linux-UTF8: i18n of Linux on all levels
Archive: http://mail.nl.linux.org/linux-utf8/