Amarendra Godbole wrote:

Hi,

The output of a command that prints a table (with a
tab separator) is susceptible to misalignment across
different languages. Mostly the headers get misaligned with the
column data in a multi-byte language like Japanese. I have been
thinking about this issue for a while, and here are the possible
solutions to it -

1. Space the columns based on the length of the header. For example,
  if the column data is "helloworld", then the output would be -
head1    header2    headline3
-----------------------------
hello    hellowo    helloworl
world    rld        d

  Each column wraps. But this approach might break existing
  line-by-line parsing scripts.

2. Space the columns based on the longest column data. This
  needs two passes - one to find the longest column data, and
  another to align and print the table.

3. Space the columns based on some pre-computation of the
  difference in length between the English string and its Japanese
  equivalent. For example, if the Japanese string occupies approx.
  40% more columns, then space the columns accordingly.

4. Leave the issue as-is. :) I have found this approach taken on
  HP-UX, where the output of the df command gets misaligned in the
  Japanese locale.
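
Approach 2 could be sketched roughly as follows. This is only an
illustration, assuming Python 3 and padding with spaces instead of
tabs; display_width() and format_table() are hypothetical helper
names, not part of any existing tool. Width is approximated from
the Unicode East Asian Width property, which may not match every
terminal or font.

```python
import unicodedata


def display_width(text):
    """Approximate the number of terminal columns occupied by text.

    Assumption: wide (W) and fullwidth (F) characters take two columns,
    combining marks take none, and everything else takes one.
    """
    width = 0
    for ch in text:
        if unicodedata.combining(ch):
            continue
        width += 2 if unicodedata.east_asian_width(ch) in ("W", "F") else 1
    return width


def format_table(header, rows, gap=2):
    """Return the table as a list of lines with space-padded columns."""
    table = [header] + rows
    # Pass 1: find the widest cell (in display columns) of each column.
    widths = [max(display_width(row[i]) for row in table)
              for i in range(len(header))]
    # Pass 2: pad every cell to its column width and join the cells.
    lines = []
    for row in table:
        cells = [cell + " " * (w - display_width(cell))
                 for cell, w in zip(row, widths)]
        lines.append((" " * gap).join(cells).rstrip())
    return lines


if __name__ == "__main__":
    for line in format_table(["ファイル", "Size"],
                             [["a.txt", "10"], ["日本語.txt", "2048"]]):
        print(line)
```

Padding by display columns rather than by bytes or characters is
what keeps the headers over their data; the cost is that the whole
table must be buffered before the first line can be printed.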

Can senior folks on this list help me with this? Is there a
better approach more suitable to internationalized software?
Thanks a lot in advance.

Would it be an option for you to default to, let's say, the POSIX or en_US.UTF-8 locale? Before running the commands in question, you can reset the LANG/LANGUAGE variables on demand to values of your choice.
Parsing output that is affected by l10n looks like a hell of a problem.
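
For a script that has to parse such output, the locale reset might look something like the sketch below. It assumes Python 3.7 or later; c_locale_env() and run_in_c_locale() are made-up names for illustration. LC_ALL is set as well because it takes precedence over LANG.

```python
import os
import subprocess


def c_locale_env(base=None):
    """Return a copy of the environment that forces the C/POSIX locale."""
    env = dict(os.environ if base is None else base)
    env["LC_ALL"] = "C"        # overrides LANG and the other LC_* variables
    env["LANG"] = "C"
    env.pop("LANGUAGE", None)  # GNU gettext message-language preference
    return env


def run_in_c_locale(cmd):
    """Run cmd (a list of arguments) in the C locale; return its output lines."""
    result = subprocess.run(cmd, env=c_locale_env(), capture_output=True,
                            text=True, check=True)
    return result.stdout.splitlines()
```

Something like run_in_c_locale(["df", "-P"]) would then give untranslated, ASCII-only headers to parse, while interactive users still see localized output.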

Simos

--
Linux-UTF8:   i18n of Linux on all levels
Archive:      http://mail.nl.linux.org/linux-utf8/
