Re: UTF-8 man pages

2007-09-10 Thread Colin Watson
On Fri, Aug 10, 2007 at 09:24:32PM +0100, David Given wrote:
 I'm trying to package a simple tool that wants a Japanese string in its man
 page. It would appear that currently, man pages use fixed encodings that vary
 depending on which locale's man page is being looked up; English uses
 ISO-8859-1, so it's not possible to use kanji in one.
 
 Various people on -mentors suggested that this was wrong as there was a plan
 in place to convert to using UTF-8 throughout, and that I should bring this up
 here; I can't find any references to such a plan on the 'net --- is there one?
 What's its status? And what should I do to get my man page working?

Belatedly, I'd like to point this list at:

  http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=440420

... though it won't help you as such just yet due to the caveat I
mention at the end of my transition plan, but it's part of the process
of getting there from here.

Cheers,

-- 
Colin Watson   [EMAIL PROTECTED]


-- 
To UNSUBSCRIBE, email to [EMAIL PROTECTED]
with a subject of unsubscribe. Trouble? Contact [EMAIL PROTECTED]



Re: UTF-8 man pages

2007-08-12 Thread Mike Hommey
On Sun, Aug 12, 2007 at 09:09:07AM +0900, Osamu Aoki [EMAIL PROTECTED] wrote:
 Hi,
 
 On Sat, Aug 11, 2007 at 06:10:53PM +0100, David Given wrote:
  Roger Leigh wrote:
  [...]
   I would personally like to see this happen, but until it does we are
   limited (I believe) to the glyphs described in groff_char(7).  I am
   not aware of any Japanese support at all except in specially-patched
   versions.
  
  I do know that Debian uses EUC-JP encoded man pages if you're in the Japanese
  locale, so multibyte support does work (install man-db and do:
  
    man -l /usr/share/man/ja/man1/manpath.1.gz
  
  ), but that doesn't help me much in my English locale.
 
 Yes.
 
 Basically, unless you push restructuring of man in Debian, you are out
 of luck, I think.  Please think about documenting in README.Debian,
 README.UTF-8, HTML or something other than ...
 
 For HTML, if the Japanese text is short, embedding a gif/png file is better
 than using UTF-8 characters.  Then you can read it from any configuration.

Erm, for HTML, it'd be better to use numeric character references
(entities), such as &#x65E5;&#x672C;
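
A minimal sketch of producing such references, assuming Python 3. The
function name to_numeric_entities is made up for illustration and is not
part of any tool mentioned in this thread. It leaves ASCII alone and
rewrites every other character as a hexadecimal reference:

```python
import html


def to_numeric_entities(text: str) -> str:
    """Replace each non-ASCII character with an HTML hex character reference."""
    return "".join(
        ch if ord(ch) < 128 else f"&#x{ord(ch):X};"
        for ch in text
    )


encoded = to_numeric_entities("日本")
print(encoded)                  # &#x65E5;&#x672C;
print(html.unescape(encoded))   # decodes back to the original string
```

The result is pure ASCII, so the page reads the same whatever charset
the HTML file is served in.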

Mike





Re: UTF-8 man pages

2007-08-12 Thread Russ Allbery
Steinar H. Gunderson [EMAIL PROTECTED] writes:
 On Sun, Aug 12, 2007 at 09:09:07AM +0900, Osamu Aoki wrote:

 For HTML, if the Japanese text is short, embedding a gif/png file is
 better than using UTF-8 characters.  Then you can read it from any
 configuration.

 ...except a text-mode one?

My text-mode configuration displays Japanese text if it's encoded in
UTF-8.

-- 
Russ Allbery ([EMAIL PROTECTED])   http://www.eyrie.org/~eagle/





Re: UTF-8 man pages

2007-08-12 Thread Russ Allbery
Russ Allbery [EMAIL PROTECTED] writes:
 Steinar H. Gunderson [EMAIL PROTECTED] writes:
 On Sun, Aug 12, 2007 at 09:09:07AM +0900, Osamu Aoki wrote:

 For HTML, if the Japanese text is short, embedding a gif/png file is
 better than using UTF-8 characters.  Then you can read it from any
 configuration.

 ...except a text-mode one?

 My text-mode configuration displays Japanese text if it's encoded in
 UTF-8.

Bleh, never mind.  I misread the original message.  Sorry about that,
everyone.

-- 
Russ Allbery ([EMAIL PROTECTED])   http://www.eyrie.org/~eagle/





Re: UTF-8 man pages

2007-08-11 Thread Roger Leigh
David Given [EMAIL PROTECTED] writes:

 I'm trying to package a simple tool that wants a Japanese string in its man
 page. It would appear that currently, man pages use fixed encodings that vary
 depending on which locale's man page is being looked up; English uses
 ISO-8859-1, so it's not possible to use kanji in one.

 Various people on -mentors suggested that this was wrong as there was a plan
 in place to convert to using UTF-8 throughout, and that I should bring this up
 here; I can't find any references to such a plan on the 'net --- is there one?
 What's its status? And what should I do to get my man page working?

Unless things have changed recently, groff is still fixed to using
8-bit encodings.  Until it can actually process UTF-8 input, using
UTF-8 encoding would (IMO) create more problems than it solves.

I would personally like to see this happen, but until it does we are
limited (I believe) to the glyphs described in groff_char(7).  I am
not aware of any Japanese support at all except in specially-patched
versions.
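
For reference, groff_char(7) also describes a \[uXXXX] escape form that
names a character by its Unicode code point. Below is a minimal sketch,
assuming Python 3, of rewriting non-ASCII text into that form so the
source file itself stays 7-bit. The function name to_groff_escapes is
hypothetical. Whether a given groff build and output device can actually
render the resulting kanji is a separate question that this sketch does
not answer.

```python
def to_groff_escapes(text: str) -> str:
    """Rewrite non-ASCII characters as groff \\[uXXXX] escapes.

    ASCII is passed through untouched; this sketch does not escape
    characters that are special to roff itself (such as backslash).
    """
    out = []
    for ch in text:
        cp = ord(ch)
        # At least four uppercase hex digits, more for larger code points.
        out.append(ch if cp < 128 else f"\\[u{cp:04X}]")
    return "".join(out)


print(to_groff_escapes("日本"))  # \[u65E5]\[u672C]
```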


Regards,
Roger

-- 
  .''`.  Roger Leigh
 : :' :  Debian GNU/Linux http://people.debian.org/~rleigh/
 `. `'   Printing on GNU/Linux?   http://gutenprint.sourceforge.net/
   `-GPG Public Key: 0x25BFB848   Please GPG sign your mail.




Re: UTF-8 man pages

2007-08-11 Thread David Given

Roger Leigh wrote:
[...]
 I would personally like to see this happen, but until it does we are
 limited (I believe) to the glyphs described in groff_char(7).  I am
 not aware of any Japanese support at all except in specially-patched
 versions.

I do know that Debian uses EUC-JP encoded man pages if you're in the Japanese
locale, so multibyte support does work (install man-db and do:

  man -l /usr/share/man/ja/man1/manpath.1.gz

), but that doesn't help me much in my English locale.
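
The encoding constraint can be shown in a few lines. This is a sketch,
assuming Python 3 and its standard codecs; the helper name can_encode is
made up for illustration. It checks which of the encodings discussed in
this thread are able to represent a kanji string at all:

```python
def can_encode(text: str, encoding: str) -> bool:
    """Return True if every character of text exists in the given encoding."""
    try:
        text.encode(encoding)
        return True
    except UnicodeEncodeError:
        return False


sample = "日本語"
for name in ("iso-8859-1", "euc_jp", "utf-8"):
    print(name, can_encode(sample, name))
```

ISO-8859-1 has no code points for kanji, so the first check fails, while
EUC-JP and UTF-8 both succeed.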

-- 
┌── dg@cowlark.com ─── http://www.cowlark.com ───
│
│ There does not now, nor will there ever, exist a programming language in
│ which it is the least bit hard to write bad programs. --- Flon's Axiom



Re: UTF-8 man pages

2007-08-11 Thread Osamu Aoki
Hi,

On Sat, Aug 11, 2007 at 06:10:53PM +0100, David Given wrote:
 Roger Leigh wrote:
 [...]
  I would personally like to see this happen, but until it does we are
  limited (I believe) to the glyphs described in groff_char(7).  I am
  not aware of any Japanese support at all except in specially-patched
  versions.
 
 I do know that Debian uses EUC-JP encoded man pages if you're in the Japanese
 locale, so multibyte support does work (install man-db and do:
 
   man -l /usr/share/man/ja/man1/manpath.1.gz
 
 ), but that doesn't help me much in my English locale.

Yes.

Basically, unless you push restructuring of man in Debian, you are out
of luck, I think.  Please think about documenting in README.Debian,
README.UTF-8, HTML or something other than ...

For HTML, if the Japanese text is short, embedding a gif/png file is better
than using UTF-8 characters.  Then you can read it from any configuration.

 --
 ┌── dg@cowlark.com ─── http://www.cowlark.com ───
 │
 │ There does not now, nor will there ever, exist a programming language in
 │ which it is the least bit hard to write bad programs. --- Flon's Axiom
 
 





Re: UTF-8 man pages

2007-08-11 Thread Steinar H. Gunderson
On Sun, Aug 12, 2007 at 09:09:07AM +0900, Osamu Aoki wrote:
 For HTML, if the Japanese text is short, embedding a gif/png file is better
 than using UTF-8 characters.  Then you can read it from any configuration.

...except a text-mode one?

/* Steinar */
-- 
Homepage: http://www.sesse.net/





UTF-8 man pages

2007-08-10 Thread David Given

Hi,

I'm trying to package a simple tool that wants a Japanese string in its man
page. It would appear that currently, man pages use fixed encodings that vary
depending on which locale's man page is being looked up; English uses
ISO-8859-1, so it's not possible to use kanji in one.

Various people on -mentors suggested that this was wrong as there was a plan
in place to convert to using UTF-8 throughout, and that I should bring this up
here; I can't find any references to such a plan on the 'net --- is there one?
What's its status? And what should I do to get my man page working?

-- 
┌── dg@cowlark.com ─── http://www.cowlark.com ───
│
│ There does not now, nor will there ever, exist a programming language in
│ which it is the least bit hard to write bad programs. --- Flon's Axiom