Re: how can i use ISO-8859-1??

2008-09-10 Thread Roland Smith
On Tue, Sep 09, 2008 at 04:08:40PM -0700, Gary Kline wrote:
 On Wed, Sep 10, 2008 at 12:39:41AM +0200, Roland Smith wrote:
  On Tue, Sep 09, 2008 at 03:16:08PM -0700, Gary Kline wrote:
Because it is a hideous waste for most readers and writers of
English and other European languages.
  
 I also argued that utf-8 was a waste of a whole byte per char
 for most of us.
  
  That's not true. UTF-8 is a variable-length encoding. It is backwards
  compatible with ASCII, i.e. ascii characters are one byte in UTF-8 as
  well. Are you thinking about UTF-16?
 
 
   I don't know.  (Mark Twain.)  Back in the late 1990's I was
   assigned the project of converting all the utilities I had ported
   to three European languages.  Until now I had no idea there was
   anything *but* utf-16, i.e. 2-bytes/char.  

Both UTF-8 and UTF-16 are variable-width encodings. 
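
To make the variable-width point concrete, here is a minimal Python sketch (an illustration, not part of the original exchange). The helper name `widths` and the sample characters are assumptions chosen for the example; it relies only on the standard `str.encode` method.

```python
def widths(ch):
    """Return (UTF-8 bytes, UTF-16 bytes) needed to encode one character."""
    # "utf-16-le" is used so that no byte-order mark is added to the count.
    return len(ch.encode("utf-8")), len(ch.encode("utf-16-le"))


# Sample characters picked for illustration only.
samples = {
    "e": "ASCII letter",
    "\u00e9": "e with acute accent",
    "\u20ac": "euro sign",
    "\U0001d11e": "musical G clef, beyond the 16-bit range",
}

for ch, description in samples.items():
    utf8, utf16 = widths(ch)
    print(f"U+{ord(ch):04X} {description}: UTF-8 {utf8}, UTF-16 {utf16}")
```

ASCII stays at one byte in UTF-8, and only characters beyond U+FFFF need four bytes in UTF-16.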
 
   With memory seriously getting to be dirt-cheap, wasting 8-bits
   doesn't seem that big a deal. 

Indeed.

Maybe some future wizard will
   invent a UTF-32 that will hold all ~90 000 Chinese characters and
   these will be downsized automatically to UTF-8 when you're mixing
   Mandarin with, say, Cesk [Czech].

UTF-32 already exists, but it's a fixed-width (4 bytes) encoding.
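
A small Python sketch of the difference, under the assumption that comparing byte counts is what matters here. UTF-8, UTF-16 and UTF-32 can all represent every Unicode code point; they differ only in how many bytes each character takes. The function name `encoded_sizes` and the sample characters are made up for the example.

```python
def encoded_sizes(ch):
    """Return the byte counts of one character in UTF-8, UTF-16 and UTF-32."""
    # The "-le" variants avoid counting a byte-order mark.
    names = ("utf-8", "utf-16-le", "utf-32-le")
    return {name: len(ch.encode(name)) for name in names}


# A Latin letter, a common CJK ideograph, and a rarer ideograph above U+FFFF.
for ch in ("A", "\u4e2d", "\U00020000"):
    print(f"U+{ord(ch):04X}", encoded_sizes(ch))
```

UTF-32 always reports four bytes, while UTF-8 grows from one to four bytes depending on the character, so mixed Latin and Chinese text needs no separate "downsizing" step.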

   Hmm, somebody just told me that aigu is not English but French
   and means acute.  ...all these years i thought ... oh well.
   Anyway, do you know if '\0351' is a 16-bit character?  It is 0xE9
   and decimal 233 and certainly should fit into a byte.   just
   wondering.

Obviously it is an 8-bit character; anything in the range 0-255 is. In
ISO 8859-1(5) it is é (e with accent aigu).
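
The arithmetic can be checked with a few lines of Python. This is only a sketch; the variable names are arbitrary, and it assumes the byte in question is the single octet with octal value 351.

```python
raw = bytes([0o351])                 # the octal escape from the question
print(raw == bytes([0xE9]), raw[0])  # compare with 0xE9 and show the decimal value

# The same byte decodes to the same character in ISO 8859-1 and ISO 8859-15.
as_latin1 = raw.decode("iso-8859-1")
as_latin9 = raw.decode("iso-8859-15")
print(as_latin1 == as_latin9 == "\u00e9")

# In UTF-8 the character no longer fits in one byte.
utf8 = as_latin1.encode("utf-8")
print(len(utf8), utf8)
```

So the character fits in one byte in the ISO 8859 encodings, but takes two bytes once it is written as UTF-8.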

Please look up UTF-8,16,32 and ISO-8859-15 on Wikipedia for further
enlightenment.

Roland
-- 
R.F.Smith   http://www.xs4all.nl/~rsmith/
[plain text _non-HTML_ PGP/GnuPG encrypted/signed email much appreciated]
pgp: 1A2B 477F 9970 BA3C 2914  B7CE 1277 EFB0 C321 A725 (KeyID: C321A725)




Re: how can i use ISO-8859-1??

2008-09-09 Thread Roland Smith
On Mon, Sep 08, 2008 at 09:35:07PM -0700, Gary Kline wrote:
   Guys,
 
   This is one of the I've-been-meaning-to-ask questions;
   but other things keep happening that took precedence.  Now
   it's time to ask what are the voodoo commands to set up in my
   ~/.zshrc or other initiation files (probably including my muttrc)
   that will let me print to stdout, characters like the e-aigu
   or u-umlaut and the currency pound or Euro?
 
Why settle for ISO-8859-1? Switch to UTF-8 instead, which can display a
much larger number of characters, and is becoming the standard.

I added the following to the 'setenv' section of the 'default' profile
in login.conf:

   LC_ALL=en_US.UTF-8

AFAICT, the console doesn't have UTF-8 fonts (yet?). But that doesn't
bother me because I always use X anyway.

So I added the following to my ~/.xinitrc as well:

  export LANG=en_US.UTF-8

I installed the rxvt-unicode terminal emulator because it's a lot
lighter than xterm, although both should handle UTF-8. You should use a
unicode font though. I put the following in my ~/.Xresources:

  ! for xterm
  XTerm*foreground: white
  XTerm*background: #010040
  XTerm*utf8: 2
  XTerm*font: -misc-fixed-medium-r-normal--14-130-75-75-c-70-iso10646-1
  XTerm*title: Shell
  XTerm*loginShell: True
  XTerm*scrollBar: False
  XTerm*saveLines: 0
  XTerm*ttyModes: erase ^H
  XTerm*vt100.translations: #override \
    <Key>Home:   string(\033[1~) \n\
    <Key>Delete: string(\033[3~) \n\
    <Key>End:    string(\033[4~)

  ! for urxvt
  Rxvt*foreground: white
  Rxvt*background: #010040
  Rxvt*font: -misc-fixed-medium-r-normal--14-130-75-75-c-70-iso10646-1
  urxvt_transp*font: -misc-fixed-medium-r-normal--14-130-75-75-c-70-iso10646-1
  Rxvt*title: Shell
  Rxvt*loginShell: True
  Rxvt*scrollBar: False
  Rxvt*saveLines: 0

The critical part is the font specification; it should end with iso10646-1.
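
As an illustration of what "ends with iso10646-1" means: in an X logical font description (XLFD) the last two hyphen-separated fields are the charset registry and encoding. The Python sketch below pulls them out; the helper name `xlfd_charset` is hypothetical and the parsing is deliberately naive.

```python
def xlfd_charset(font_name):
    """Return the 'registry-encoding' suffix of an XLFD font name."""
    fields = font_name.split("-")
    # The final two fields are CHARSET_REGISTRY and CHARSET_ENCODING.
    return "-".join(fields[-2:])


font = "-misc-fixed-medium-r-normal--14-130-75-75-c-70-iso10646-1"
print(xlfd_charset(font))
print(xlfd_charset(font) == "iso10646-1")  # True for a Unicode font
```

A font whose name ends in iso8859-1 instead would only cover the Latin-1 repertoire.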

My /etc/csh.cshrc has some settings for less:

  setenv  LESSOPEN '|/usr/bin/lesspipe.sh %s'
  setenv  LESSCHARSET utf-8

Mutt has to be told as well, in ~/.muttrc:

  set charset=utf-8
  set send_charset=us-ascii:iso-8859-15:utf-8
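
The send_charset list is tried in order: Mutt's documentation says it uses the first character set into which the message text can be converted exactly. The Python sketch below mimics that selection rule only; it is not Mutt's code, and `pick_send_charset` is a made-up name.

```python
def pick_send_charset(text, send_charset="us-ascii:iso-8859-15:utf-8"):
    """Return the first charset in the colon-separated list that can encode text."""
    for name in send_charset.split(":"):
        try:
            text.encode(name)
        except UnicodeEncodeError:
            continue  # this charset cannot represent the text; try the next one
        return name
    return None  # nothing in the list fits


for message in ("plain text", "caf\u00e9 for 5\u20ac", "\u4e2d\u6587"):
    print(repr(message), "->", pick_send_charset(message))
```

With the list above, plain English goes out as us-ascii, Western European text as iso-8859-15, and everything else as utf-8.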

In ~/.emacs.el(c) there are some settings as well:

  ;; Set language environment for MULE. 
  (set-language-environment 'UTF-8)

  ;; My customization for text modes
  (defun my-text-mode-hook ()
    (auto-fill-mode 1)
    (show-paren-mode t)
    (activate-input-method 'rfc1345)) ; Good input method for UTF-8
  (add-hook 'text-mode-hook 'my-text-mode-hook)

Another program you should look at is Firefox: Edit -> Preferences ->
Content tab -> Fonts & Colors, Advanced button; default encoding ->
select Unicode (UTF-8).

Other programs may have settings for unicode, but these are the ones
that spring to mind.

Roland
-- 
R.F.Smith   http://www.xs4all.nl/~rsmith/
[plain text _non-HTML_ PGP/GnuPG encrypted/signed email much appreciated]
pgp: 1A2B 477F 9970 BA3C 2914  B7CE 1277 EFB0 C321 A725 (KeyID: C321A725)




Re: how can i use ISO-8859-1??

2008-09-09 Thread Lars Eighner

On Tue, 9 Sep 2008, Roland Smith wrote:


On Mon, Sep 08, 2008 at 09:35:07PM -0700, Gary Kline wrote:

Guys,

This is one of the I've-been-meaning-to-ask questions;
but other things keep happening that took precedence.  Now
it's time to ask what are the voodoo commands to set up in my
~/.zshrc or other initiation files (probably including my muttrc)
that will let me print to stdout, characters like the e-aigu
or u-umlaut and the currency pound or Euro?


The euro is not in iso-8859-1, but iso-8859-15.  You need to
load the appropriate fonts (at boot if you are root, see /etc/rc.conf)
or use vidcontrol to load the iso fonts when you log in.  You
need to set your TERM environmental variable to the appropriate
value in your shell rc.  That might be cons25l1.  You can check out termcap
from a link in /etc.
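
The euro point is easy to verify with Python's standard codecs. This is a minimal sketch; `can_encode` is a hypothetical helper, and the characters are the ones named in the original question.

```python
def can_encode(text, charset):
    """Report whether every character of text exists in the given charset."""
    try:
        text.encode(charset)
        return True
    except UnicodeEncodeError:
        return False


# e-acute, u-umlaut, pound sign, euro sign
wanted = {"e-acute": "\u00e9", "u-umlaut": "\u00fc", "pound": "\u00a3", "euro": "\u20ac"}

for label, ch in wanted.items():
    print(label,
          "iso-8859-1:", can_encode(ch, "iso-8859-1"),
          "iso-8859-15:", can_encode(ch, "iso-8859-15"))
```

Only the euro sign fails for iso-8859-1; iso-8859-15 stores it at byte 0xA4.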




Why settle for ISO-8859-1? Switch to UTF-8 instead, which can display a
much larger number of characters, and is becoming the standard.


Because it is a hideous waste for most readers and writers of
English and other European languages.



I added the following to the 'setenv' section of the 'default' profile
in login.conf:

  LC_ALL=en_US.UTF-8

AFAICT, the console doesn't have UTF-8 fonts (yet?).


It won't until video cards support this level of bloat.  I don't know of a
single video card that does that.


But that doesn't
bother me because I always use X anyway.


Wouldn't you really be happier with Windoz?


--
Lars Eighner
http://www.larseighner.com/index.html
8800 N IH35 APT 1191 AUSTIN TX 78753-5266

___
freebsd-questions@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-questions
To unsubscribe, send any mail to [EMAIL PROTECTED]


Re: how can i use ISO-8859-1??

2008-09-09 Thread Dave Feustel
On Tue, Sep 09, 2008 at 12:14:47PM -0500, Lars Eighner wrote:
 On Tue, 9 Sep 2008, Roland Smith wrote:

 On Mon, Sep 08, 2008 at 09:35:07PM -0700, Gary Kline wrote:
 Guys,

 This is one of the I've-been-meaning-to-ask questions;
 but other things keep happening that took precedence.  Now
 it's time to ask what are the voodoo commands to set up in my
 ~/.zshrc or other initiation files (probably including my muttrc)
 that will let me print to stdout, characters like the e-aigu
 or u-umlaut and the currency pound or Euro?

 The euro is not in iso-8859-1, but iso-8859-15.  You need to
 load the appropriate fonts (at boot if you are root, see /etc/rc.conf)
 or use vidcontrol to load the iso fonts when you log in.  You
 need to set your TERM environmental variable to the appropriate
 value in your shell rc.  That might be cons25l1.  You can check out termcap
 from a link in /etc.



 Why settle for ISO-8859-1? Switch to UTF-8 instead, which can display a
 much larger number of characters, and is becoming the standard.

 Because it is a hideous waste for most readers and writers of
 English and other European languages.

What ISO supports English, German, French, and Russian?


Re: how can i use ISO-8859-1??

2008-09-09 Thread Gary Kline
On Tue, Sep 09, 2008 at 06:54:56PM +0200, Roland Smith wrote:
 On Mon, Sep 08, 2008 at 09:35:07PM -0700, Gary Kline wrote:
  Guys,
  
  This is one of the I've-been-meaning-to-ask questions;
  but other things keep happening that took precedence.  Now
  it's time to ask what are the voodoo commands to set up in my
  ~/.zshrc or other initiation files (probably including my muttrc)
  that will let me print to stdout, characters like the e-aigu
  or u-umlaut and the currency pound or Euro?
  
 Why settle for ISO-8859-1? Switch to UTF-8 instead, which can display a
 much larger number of characters, and is becoming the standard.
 
 I added the following to the 'setenv' section of the 'default' profile
 in login.conf:
 
LC_ALL=en_US.UTF-8
 
 AFAICT, the console doesn't have UTF-8 fonts (yet?). But that doesn't
 bother me because I always use X anyway.
 
 So I added the following to my ~/.xinitrc as well:
 
   export LANG=en_US.UTF-8
 
 I installed the rxvt-unicode terminal emulator because it's a lot
 lighter than xterm, although both should handle UTF-8. You should use a
 unicode font though. I put the following in my ~/.Xresources:


I had something like what you've got below all the years I used
Ctwm, either in ~/.xinitrc or ~/.Xresources.  With more
customization in ~/.ctwmrc.  Now I'm using primarily KDE and used
to their Konsole hack of xterm.  Any idea of a URL that has this
level of utf-8 for konsole?
 
   ! for xterm
   XTerm*foreground: white
   XTerm*background: #010040
   XTerm*utf8: 2

[[ saved away ]]
   Rxvt*font: -misc-fixed-medium-r-normal--14-130-75-75-c-70-iso10646-1
   urxvt_transp*font: -misc-fixed-medium-r-normal--14-130-75-75-c-70-iso10646-1
   Rxvt*title: Shell
   Rxvt*scrollBar: False
   Rxvt*saveLines: 0
 
 The critical part is the font specification; it should end with iso10646-1.


I used some times-new-roman, but it shouldn't matter as long as
either xterm or konsole is there.  I hope!

 
 My /etc/csh.cshrc has some settings for less:
 
   setenv  LESSOPEN '|/usr/bin/lesspipe.sh %s'
   setenv  LESSCHARSET utf-8
 
 Mutt has to be told as well, in ~/.muttrc:
 
   set charset=utf-8
   set send_charset=us-ascii:iso-8859-15:utf-8
 
 In ~/.emacs.el(c) there are some settings as well:
 
   ;; Set language environment for MULE. 
   (set-language-environment 'UTF-8)
 
   ;; My customization for text modes
   (defun my-text-mode-hook ()
     (auto-fill-mode 1)
     (show-paren-mode t)
     (activate-input-method 'rfc1345)) ; Good input method for UTF-8
   (add-hook 'text-mode-hook 'my-text-mode-hook)
 
 Another program you should look at is Firefox: Edit -> Preferences ->
 Content tab -> Fonts & Colors, Advanced button; default encoding ->
 select Unicode (UTF-8).


All done, thanks.

 
 Other programs may have settings for unicode, but these are the ones
 that spring to mind.
 

Oh, let me brag about 1/1000th of a bit and announce that after
decades of study [on-off] I can read a wee bit of French.  Well,
given a French-English dictionary to translate every third word,
:-)  aint life great?  Oui!

gary


 Roland
 -- 
 R.F.Smith   http://www.xs4all.nl/~rsmith/
 [plain text _non-HTML_ PGP/GnuPG encrypted/signed email much appreciated]
 pgp: 1A2B 477F 9970 BA3C 2914  B7CE 1277 EFB0 C321 A725 (KeyID: C321A725)



-- 
 Gary Kline  [EMAIL PROTECTED]  http://www.thought.org  Public Service Unix
http://jottings.thought.org   http://transfinite.thought.org




Re: how can i use ISO-8859-1??

2008-09-09 Thread Gary Kline
On Tue, Sep 09, 2008 at 12:14:47PM -0500, Lars Eighner wrote:
 On Tue, 9 Sep 2008, Roland Smith wrote:
 
 On Mon, Sep 08, 2008 at 09:35:07PM -0700, Gary Kline wrote:
 Guys,
 
 This is one of the I've-been-meaning-to-ask questions;
 but other things keep happening that took precedence.  Now
 it's time to ask what are the voodoo commands to set up in my
 ~/.zshrc or other initiation files (probably including my muttrc)
 that will let me print to stdout, characters like the e-aigu
 or u-umlaut and the currency pound or Euro?
 
 The euro is not in iso-8859-1, but iso-8859-15.  You need to
 load the appropriate fonts (at boot if you are root, see /etc/rc.conf)

Yeah, I fergot. 

 or use vidcontrol to load the iso fonts when you log in.  You
 need to set your TERM environmental variable to the appropriate
 value in your shell rc.  That might be cons25l1.  You can check out termcap
 from a link in /etc.
 
 
 
 Why settle for ISO-8859-1? Switch to UTF-8 instead, which can display a
 much larger number of characters, and is becoming the standard.
 
 Because it is a hideous waste for most readers and writers of
 English and other European languages.


I got into several fistfights when I was doing error-translation
for BSD at work, and refused to use utf-8 because it didn't
support enough of the Chinese glyphs.  But enough, I suppose;
unless you want to create some severely obscure word.  I also
argued that utf-8 was a waste of a whole byte per char for most
of us.  

 
 
 I added the following to the 'setenv' section of the 'default' profile
 in login.conf:
 
   LC_ALL=en_US.UTF-8
 
 AFAICT, the console doesn't have UTF-8 fonts (yet?).
 
 It won't until video cards support this level of bloat.  I don't know of a
 single video card that does that.
 
 But that doesn't
 bother me because I always use X anyway.
 
 Wouldn't you really be happier with Windoz?
 

Errp.  I'm gonna lose my lunch!

gary


 
 -- 
 Lars Eighner
 http://www.larseighner.com/index.html
 8800 N IH35 APT 1191 AUSTIN TX 78753-5266
 

-- 
 Gary Kline  [EMAIL PROTECTED]  http://www.thought.org  Public Service Unix
http://jottings.thought.org   http://transfinite.thought.org




Re: how can i use ISO-8859-1??

2008-09-09 Thread Roland Smith
On Tue, Sep 09, 2008 at 02:47:34PM -0700, Gary Kline wrote:
  I installed the rxvt-unicode terminal emulator because it's a lot
  lighter than xterm, although both should handle UTF-8. You should use a
  unicode font though. I put the following in my ~/.Xresources:
 
 
   I had something like what you've got below all the years I used
   Ctwm, either in ~/.xinitrc or ~/.Xresources.  With more
   customization in ~/.ctwmrc.  Now I'm using primarily KDE and used
   to their Konsole hack of xterm.  Any idea of a URL that has this
   level of utf-8 for konsole?

Doesn't konsole have a help menu? Otherwise check the konsole site:
http://konsole.kde.org/ It seems to have a handbook online. 

Maybe the View -> Set Character Encoding menu option is what you're
looking for? 

Roland
-- 
R.F.Smith   http://www.xs4all.nl/~rsmith/
[plain text _non-HTML_ PGP/GnuPG encrypted/signed email much appreciated]
pgp: 1A2B 477F 9970 BA3C 2914  B7CE 1277 EFB0 C321 A725 (KeyID: C321A725)




Re: how can i use ISO-8859-1??

2008-09-09 Thread Roland Smith
On Tue, Sep 09, 2008 at 03:16:08PM -0700, Gary Kline wrote:
  Because it is a hideous waste for most readers and writers of
  English and other European languages.

   I also argued that utf-8 was a waste of a whole byte per char
   for most of us.

That's not true. UTF-8 is a variable-length encoding. It is backwards
compatible with ASCII, i.e. ascii characters are one byte in UTF-8 as
well. Are you thinking about UTF-16?

Roland
-- 
R.F.Smith   http://www.xs4all.nl/~rsmith/
[plain text _non-HTML_ PGP/GnuPG encrypted/signed email much appreciated]
pgp: 1A2B 477F 9970 BA3C 2914  B7CE 1277 EFB0 C321 A725 (KeyID: C321A725)




Re: how can i use ISO-8859-1??

2008-09-09 Thread Polytropon
On Wed, 10 Sep 2008 00:28:14 +0200, Roland Smith [EMAIL PROTECTED] wrote:
 Doesn't konsole have a help menu? Otherwise check the konsole site:
 http://konsole.kde.org/ It seems to have a handbook online. 

Isn't there a man konsole? Oh wait, KDE doesn't have standard
manpages... :-) Never mind, I'm just joking.




-- 
Polytropon
From Magdeburg, Germany
Happy FreeBSD user since 4.0
Andra moi ennepe, Mousa, ...


Re: how can i use ISO-8859-1??

2008-09-09 Thread Gary Kline
On Wed, Sep 10, 2008 at 12:39:41AM +0200, Roland Smith wrote:
 On Tue, Sep 09, 2008 at 03:16:08PM -0700, Gary Kline wrote:
   Because it is a hideous waste for most readers and writers of
   English and other European languages.
 
  I also argued that utf-8 was a waste of a whole byte per char
  for most of us.
 
 That's not true. UTF-8 is a variable-length encoding. It is backwards
 compatible with ASCII, i.e. ascii characters are one byte in UTF-8 as
 well. Are you thinking about UTF-16?


I don't know.  (Mark Twain.)  Back in the late 1990's I was
assigned the project of converting all the utilities I had ported
to three European languages.  Until now I had no idea there was
anything *but* utf-16, i.e. 2-bytes/char.  

With memory seriously getting to be dirt-cheap, wasting 8-bits
doesn't seem that big a deal.  Maybe some future wizard will
invent a UTF-32 that will hold all ~90 000 Chinese characters and
these will be downsized automatically to UTF-8 when you're mixing
Mandarin with, say, Cesk [Czech].

Hmm, somebody just told me that aigu is not English but French
and means acute.  ...all these years i thought ... oh well.
Anyway, do you know if '\0351' is a 16-bit character?  It is 0xE9
and decimal 233 and certainly should fit into a byte.   just
wondering.

gary


 
 Roland
 -- 
 R.F.Smith   http://www.xs4all.nl/~rsmith/
 [plain text _non-HTML_ PGP/GnuPG encrypted/signed email much appreciated]
 pgp: 1A2B 477F 9970 BA3C 2914  B7CE 1277 EFB0 C321 A725 (KeyID: C321A725)



-- 
 Gary Kline  [EMAIL PROTECTED]  http://www.thought.org  Public Service Unix
http://jottings.thought.org   http://transfinite.thought.org




Re: how can i use ISO-8859-1??

2008-09-08 Thread Giorgos Keramidas
On Mon, 8 Sep 2008 21:35:07 -0700, Gary Kline [EMAIL PROTECTED] wrote:
 Guys,

 This is one of the I've-been-meaning-to-ask questions; but other
 things keep happening that took precedence.  Now it's time to ask what
 are the voodoo commands to set up in my ~/.zshrc or other initiation
 files (probably including my muttrc) that will let me print to stdout,
 characters like the e-aigu or u-umlaut and the currency pound or
 Euro?

 I keep running into '\240' characters that are likely M$ format
 commands. [...]

That's not really an ISO 8859-1 problem, but a locale setup issue.
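
As a side note on the '\240' byte from the quoted question: octal 240 is 0xA0, which ISO 8859-1 defines as a no-break space rather than any formatting command. The short Python check below assumes the bytes in question really were ISO 8859-1 text, which the thread does not confirm.

```python
import unicodedata

ch = bytes([0o240]).decode("iso-8859-1")  # the byte written as '\240'
print(f"U+{ord(ch):04X}", unicodedata.name(ch))

# In a UTF-8 stream the lone byte 0xA0 is invalid on its own, which is one
# reason a mismatched locale can make it show up as an escaped '\240'.
try:
    bytes([0o240]).decode("utf-8")
    valid_utf8 = True
except UnicodeDecodeError:
    valid_utf8 = False
print("valid as UTF-8:", valid_utf8)
```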

In my .bashrc file I have the following:

# Locale setup.
export LANG=C
export LC_CTYPE=el_GR.ISO8859-7
export LC_COLLATE=el_GR.ISO8859-7
unset LC_ALL LC_MESSAGES LC_MONETARY LC_NUMERIC LC_TIME

You can use something similar to set things up for `en_US.ISO8859-1':

# Locale setup.
export LANG=C
export LC_CTYPE=en_US.ISO8859-1
export LC_COLLATE=en_US.ISO8859-1
unset LC_ALL LC_MESSAGES LC_MONETARY LC_NUMERIC LC_TIME

If you want _everything_ to be displayed using the standard en_US
conventions for en_US.ISO8859-1, you can alternatively use:

export LANG=C
export LC_ALL=en_US.ISO8859-1
unset LC_CTYPE LC_COLLATE LC_MESSAGES LC_MONETARY LC_NUMERIC LC_TIME

and let LC_ALL override everything.
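
The override order can be sketched in Python. POSIX resolves each locale category by checking LC_ALL first, then the category's own variable, then LANG. The function `effective_locale` is hypothetical, and the final fallback to "C" is an assumption (POSIX leaves that default to the implementation).

```python
def effective_locale(category, env):
    """Resolve one locale category (e.g. 'LC_CTYPE') from an environment dict."""
    for var in ("LC_ALL", category, "LANG"):
        value = env.get(var)
        if value:  # unset or empty variables are skipped
            return value
    return "C"  # assumed default when nothing is set


# The first .bashrc variant: only LC_CTYPE and LC_COLLATE are changed.
per_category = {"LANG": "C",
                "LC_CTYPE": "en_US.ISO8859-1",
                "LC_COLLATE": "en_US.ISO8859-1"}
print(effective_locale("LC_CTYPE", per_category))
print(effective_locale("LC_TIME", per_category))

# The LC_ALL variant: every category resolves to the same value.
everything = {"LANG": "C", "LC_ALL": "en_US.ISO8859-1"}
print(effective_locale("LC_TIME", everything))
```

This is why the per-category setup keeps messages and dates in the C locale while still getting ISO 8859-1 character handling.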

A slightly better idea (which doesn't hardcode LANG and LC_ALL for all
shell instances) is to configure your personal `.login_conf' file with
something like:

me:\
	:charset=iso-8859-1:\
	:lang=en_US.ISO8859-1:\
	:setenv=LC_ALL=en_US.ISO8859-1:

With this in place you will get the 'correct' environment regardless of
the login shell you are using: bash, csh or zsh.

Note: By avoiding hardcoded locale setup in your shell startup file
you can even spawn sub-shells with different locales.  Here's how
a zsh session with `en_US.ISO8859-1' can spawn a ksh session with a
Greek locale for example:

zsh$ env | egrep '^(LANG|LC_ALL)'
LANG=en_US.ISO8859-1
LC_ALL=en_US.ISO8859-1
zsh$ env LANG='el_GR.ISO8859-7' LC_ALL='el_GR.ISO8859-7' ksh
ksh$ mutt
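
The same per-process override works from any program that can pass an environment to a child. Here is a rough Python equivalent of the env(1) call above; `run_with_locale` is a made-up helper, and the child is just another Python process that echoes its LC_ALL so the example does not depend on ksh or mutt being installed.

```python
import os
import subprocess
import sys


def run_with_locale(locale_name, argv):
    """Run argv with LANG and LC_ALL overridden; return the child's stdout."""
    # Copy the parent's environment and change only the two locale variables.
    env = dict(os.environ, LANG=locale_name, LC_ALL=locale_name)
    result = subprocess.run(argv, env=env, capture_output=True,
                            text=True, check=True)
    return result.stdout


child = [sys.executable, "-c", "import os; print(os.environ['LC_ALL'])"]
print(run_with_locale("el_GR.ISO8859-7", child).strip())
```

The parent's own environment is left untouched, which is the point of not hardcoding the locale in the shell startup file.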

Note that this is only ``half of the setup'' though.  You will then have
to make sure that your terminal emulator can display ISO 8859-1 text
correctly, by choosing an appropriate font set.  The xlsfonts(1) and the
fc-list(1) utilities can show you a list of installed fonts:

# xlsfonts | fgrep '8859-1'
# fc-list

Pick one that includes ISO 8859-1 characters, and off you go :)
