On Saturday, 18 June 2005 19:46, Angus Leeming wrote:
> On Friday 17 June 2005 21:07, Andre Poenitz wrote:
> > On Fri, Jun 17, 2005 at 11:20:05AM +0100, Angus Leeming wrote:
> > > Wolfgang Engelmann wrote:
> > > > Do I understand you correctly, that I should use at least under
> > > > SuSe 9.2 (and 9.3?) LyX xforms instead of QT? Or switch to
> > > > another distribution?
> > >
> > > Nope. It's sufficient to follow Georg's suggestion of invoking
> > > lyx through a wrapper script
> > >
> > > #! /bin/sh
> > > LANG=de_DE lyx $*
> >


> > What happened to "$@"?
I do not understand this


>
> Bah! sh, cmd.exe, they're the same, right?
>
> Angus (easily confused)

more so: Wolfgang
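
(As far as I can tell, the "$@" remark is about how the wrapper passes its arguments on: an unquoted $* splits a file name that contains spaces into several words, while "$@" hands each argument over unchanged. A minimal sketch of the difference; the function names are made up for illustration and are not part of lyx:)

```shell
#! /bin/sh
# count_args prints the number of arguments it received.
count_args() { echo "$#"; }

# Hypothetical stand-ins for the wrapper script, for illustration only.
wrapper_star() { count_args $*; }     # unquoted: arguments are split at spaces
wrapper_at()   { count_args "$@"; }   # quoted: each argument is passed on intact

wrapper_star "my thesis.lyx"   # count_args sees two words
wrapper_at   "my thesis.lyx"   # count_args sees one file name
```

So the wrapper is probably safer written as LANG=de_DE lyx "$@".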

For those who understand German, there is a good explanation of the problems 
discussed here at 
http://www.tu-chemnitz.de/urz/linux/faq/unicode.html#Informationen_zur_neuen_Textzeic

But you folks are probably all familiar with it...

Here are some points translated:

 Until now the ISO8859-x standard was used, where x stands for the variant, 
such as 1 or 15 for Western European.

 With the Unicode standard ISO-10646, every known character used on earth now 
has an unambiguous code (that is, you don't need the x in ISO8859-x anymore to 
say which language you are dealing with).

SuSe Linux uses the Unicode standard in the encoding form UTF-8 (I am not quite 
sure whether I completely understand the difference between the 
Unicode standard ISO-10646 and UTF-8).
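
As far as I understand it, ISO-10646/Unicode assigns every character a number (its code point), while UTF-8 is one way of storing those numbers as bytes. A small sketch of the difference, assuming iconv and od are installed; the function name latin1_to_utf8_hex is made up for illustration:

```shell
#! /bin/sh
# Convert ISO8859-1 input on stdin to UTF-8 and print the resulting bytes in hex.
latin1_to_utf8_hex() {
    iconv -f ISO8859-1 -t UTF-8 | od -An -tx1 | tr -d ' \n'
    echo
}

# "ä" is the single byte 0xE4 (octal 344) in ISO8859-1 and code point U+00E4 in Unicode.
printf '\344' | latin1_to_utf8_hex   # UTF-8 stores it as two bytes: c3a4
# Plain ASCII characters keep their single byte in UTF-8.
printf 'a' | latin1_to_utf8_hex      # 61
```

This is also why an ISO8859-1 file with umlauts looks garbled when it is read as UTF-8, and the other way round.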

Problems might arise for historical reasons and from the coexistence of old 
and new systems: 
* if you use old files on a new system
* if you use the same files on an old and a new system
* if the old system does not recognize the new encoding, or if the font 
needed to display the file is not available
* if the new system cannot work with the old files

This especially affects 
* text files (descriptions in text format, source code, scripts, HTML files, 
emails, ...) 
* file names
* standard input, output and error, pipes 
* environment variables
* the cut-and-paste buffer 
* remote connections to other computers: telnet, ssh, modem, 
serial-port connections to terminal emulators

Whether text files are recognized correctly depends on the program dealing 
with the data: 
Numerous programs understand both the old and the new encodings (ISO8859-x, 
UTF-8, ...), e.g. the KDE and GNOME desktops and their programs.
Other programs, such as terminal windows, let you select the encoding through 
environment variables, or use information in the document itself (e.g. web 
browsers). 

It is generally recommended to convert text files to Unicode UTF-8.
The command
 file <your-filename>
reports the encoding of a file:

sh-3.00$ file dante-lyx-n.lyx
dante-lyx-n.lyx: ISO-8859 English text
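
The output of file is only a guess. A complementary check, sketched here under the assumption that iconv rejects invalid input with a non-zero exit status (GNU iconv does), is to ask iconv to read the file as UTF-8. The function name is_utf8 is made up for illustration:

```shell
#! /bin/sh
# Succeeds if the file can be read as UTF-8, fails otherwise.
# Note that pure ASCII files also pass, since ASCII is a subset of UTF-8.
is_utf8() {
    iconv -f UTF-8 -t UTF-8 "$1" >/dev/null 2>&1
}

# Usage: report the files given on the command line that are not valid UTF-8.
for f in "$@"; do
    is_utf8 "$f" || echo "$f: not UTF-8"
done
```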

Some examples:
File names are shown incorrectly, e.g. in the file manager or with 
ls
???.iso
Reason: the file name is not encoded in UTF-8 and contains non-ASCII 
characters.
Solution: rename the file (right-click on the file/directory > rename > enter 
new name > return).
Prevention: use only A-Z, a-z, 0-9, _, - and . 
in file names.
Do not use file names with spaces, since spaces are often interpreted as 
separators.
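
The prevention rule can be checked mechanically. A minimal sketch; the function name is_safe_name is made up for illustration, and tr is run with LC_ALL=C so that the ranges A-Z and a-z mean ASCII letters only:

```shell
#! /bin/sh
# Succeeds only if the name is non-empty and consists solely of
# A-Z, a-z, 0-9, "_", "-" and ".".
is_safe_name() {
    [ -n "$1" ] || return 1
    # Delete all allowed characters; anything left over breaks the rule.
    rest=$(printf '%s' "$1" | LC_ALL=C tr -d 'A-Za-z0-9_.-')
    [ -z "$rest" ]
}

# Usage: list the names in the current directory that break the rule.
for name in *; do
    [ -e "$name" ] || continue
    is_safe_name "$name" || echo "rename: $name"
done
```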

Conversion of ISO8859-1 text files (shell scripts and so on) to UTF-8 (or to 
another encoding): 
 Use the command
 iconv 

   iconv -f ISO8859-1 -t UTF-8 {file_in_iso8859_coding} > {file_in_utf8_coding}

For details see
 iconv --list 
or
 man iconv 
or under KDE: KDE help > Unix handbook > (1) user commands > iconv 
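
One pitfall with the iconv call above: input and output must be different files, because the shell empties the output file before iconv gets to read it. A minimal sketch of an in-place conversion via a temporary file; the function name to_utf8 is made up for illustration, and the source encoding ISO8859-1 is an assumption that has to match the actual file:

```shell
#! /bin/sh
# Convert a file from ISO8859-1 to UTF-8 in place.
# The original is only overwritten if iconv succeeds; writing back with
# cat keeps the permissions of the original file.
to_utf8() {
    tmp=$(mktemp) || return 1
    iconv -f ISO8859-1 -t UTF-8 "$1" > "$tmp" && cat "$tmp" > "$1"
    status=$?
    rm -f "$tmp"
    return $status
}

# Usage: to_utf8 old-script.sh
```

Running it twice on the same file would convert the file twice and garble the umlauts, so check the encoding first.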

Another command for conversion is
 recode 
(see also --help, or man recode, or KDE help) 

Text editors that understand UTF-8 include, for instance, 
kwrite / kate 
 use
 -> open file > encoding to select the encoding
 kwrite is a simple text editor
 kate is more comfortable
xemacs, emacs and vim work with ISO8859 as well as with UTF-8 encoding. 

  nedit understands ISO8859 encodings only
 

Problem: my program shows characters wrongly or not at all:
 The program does not recognize the encoding of the text file or text. This 
can affect the display of text files and/or the display of characters entered 
via the keyboard. 
The reasons can be manifold.

1. UTF-8 is set (the LANG variable ends in .UTF-8) and the display font 
is a Unicode (ISO 10646-1) font: 
 The program has its own graphical window for text and does not use 
standard input/output via a terminal emulator (= terminal window).
2. The text is already UTF-8 encoded: 
Unicode has to be selected via a menu (e.g. in web browsers that do not 
recognize the encoding of the text). If the encoding cannot be set manually, a 
program that can handle Unicode has to be used. 
If for some reason a program has to be used that cannot handle Unicode, the 
text has to be converted to ISO8859 beforehand with
 iconv
 and converted back with it afterwards. The program itself has to be 
started with a LANG variable without the .UTF-8 ending, so that characters 
entered via the keyboard are accepted and converted correctly:
      Example 1:
         Text editor "nedit" (better use "kwrite")
         iconv -f UTF-8 -t ISO8859-1 {file_in_utf8_format} > /tmp/{file_in_iso8859_format}
         LANG=de_DE nedit /tmp/{file_in_iso8859_format}
         iconv -f ISO8859-1 -t UTF-8 /tmp/{file_in_iso8859_format} > {file_in_utf8_format}

        Users of tcsh (another shell) use for the second command 
         setenv LANG de_DE; nedit /tmp/{file_in_iso8859_format}
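
The three steps of Example 1 can be wrapped into one sh function. This is only a sketch: the name edit_as_latin1 is made up, the editor is passed as a parameter so that programs other than nedit can be used, and it assumes that the editor command does not return before the file has been saved and closed.

```shell
#! /bin/sh
# Usage: edit_as_latin1 FILE EDITOR [EDITOR-OPTIONS...]
# Converts the UTF-8 file to an ISO8859-1 copy, runs the editor on the copy
# with LANG=de_DE, and converts the result back to UTF-8.
edit_as_latin1() {
    file=$1
    shift
    copy=$(mktemp) || return 1
    # If the file contains characters that ISO8859-1 cannot represent,
    # the first iconv fails and the original file is left untouched.
    iconv -f UTF-8 -t ISO8859-1 "$file" > "$copy" &&
        LANG=de_DE "$@" "$copy" &&
        iconv -f ISO8859-1 -t UTF-8 "$copy" > "$file"
    status=$?
    rm -f "$copy"
    return $status
}

# Example: edit_as_latin1 letter.txt nedit
```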

      Example 2: Displaying a message with "xmessage"
        If the program only displays a message and reads from  
standard input, the following construct can be used:

         iconv -f UTF-8 -t ISO8859-1 {file_in_utf8_format} | xmessage -file -
 
3. The text is not UTF-8 encoded
 If possible, the text file should be converted to UTF-8. If for some reason 
this is not wanted, the encoding has to be set in the program via a menu (e.g. 
in web browsers that do not recognize the encoding from the text). If the 
program does not provide for this, it uses the LANG variable. In this case the 
LANG variable should not end in .UTF-8. 
             
                Example: Text editor "nedit"
          
                  LANG=de_DE nedit {file_in_iso8859_format}
                      
                 Users of tcsh: 
          
                  setenv LANG de_DE; nedit {file_in_iso8859_format}
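
Note that the two forms differ in scope, as far as I know: in sh the LANG=de_DE prefix applies only to that one command, whereas setenv in tcsh changes LANG for the rest of the session. The env command gives the one-command behaviour in both shells. A small sketch, using sh -c in place of nedit so that the effect is visible; the starting value en_US.UTF-8 is just an assumption for the demonstration:

```shell
#! /bin/sh
LANG=en_US.UTF-8
export LANG

# The prefix changes LANG only for the command it precedes ...
LANG=de_DE sh -c 'echo "child sees: $LANG"'
# ... and env does the same (this form also works when typed in tcsh).
env LANG=de_DE sh -c 'echo "child sees: $LANG"'

# The calling shell still has its own value.
echo "shell has:  $LANG"
```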

I stop here, although there is more info 

If you want a lyx file of this, let me know

The German text is from:

Gruppe Systemsoftware
2004-12-01 : r1.92
Technische Universität Chemnitz, Straße der Nationen 62, 09107 Chemnitz
Imprint - Copyright © 2003-2004 by TU Chemnitz, URZ, all rights 
reserved.

