Bug#451382: i18n is NOT so easy!

2010-01-24 Thread Dr. Markus Waldeck
> Are you sure that this hasn't been fixed?  It should only be giving
> you German if the charset is UTF-8, and then it is using $'\u00e4'.

LANG=de_DE.UTF-8
XTERM_LOCALE=de_DE.UTF-8

% zsh --version
zsh 4.3.10 (i686-pc-linux-gnu)

% cut --
...
--complement  -- das Komplement der Menge der gew$'u00e4'hlten Bytes, Zeichen oder Felder bilden
...

The constructs $'\u00e4' and $'\\u00e4' work on the command line

% echo $'\u00e4'
ä

but obviously NOT in the completion file, as I already mentioned on 12 Dec 2007.




-- 
To UNSUBSCRIBE, email to debian-bugs-dist-requ...@lists.debian.org
with a subject of unsubscribe. Trouble? Contact listmas...@lists.debian.org



Bug#451382: i18n is NOT so easy!

2010-01-23 Thread Dr. Markus Waldeck
Hi,

#451382 is still open.

Please implement one of the following solutions:

1. use gew$(echo \\u00e4)hlten
2. use gewaehlten
3. remove the German translation at all!

Thanks

Markus




Bug#451382: i18n is NOT so easy!

2010-01-23 Thread Clint Adams
On Sat, Jan 23, 2010 at 01:07:16PM +0100, Dr. Markus Waldeck wrote:
> #451382 is still open.
>
> Please implement one of the following solutions:
>
> 1. use gew$(echo \\u00e4)hlten
> 2. use gewaehlten
> 3. remove the German translation at all!

Are you sure that this hasn't been fixed?  It should only be giving
you German if the charset is UTF-8, and then it is using $'\u00e4'.
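
A minimal sketch of that kind of charset test, assuming a hypothetical
helper describe_complement that receives the locale name as its only
argument. The function name, the ASCII fallback "gewaehlten" and the
English default are illustrative and are not taken from _cut. The script
itself stays pure ASCII: \303\244 is the UTF-8 byte sequence for U+00E4,
so it does not depend on $'\u00e4' being expanded.

```shell
# Hypothetical helper: print the word used in the description, chosen
# from the locale name passed as $1 (for example de_DE.UTF-8).
describe_complement() {
  case $1 in
    de_*.[Uu][Tt][Ff]-8|de_*.[Uu][Tt][Ff]8)
      # German with a UTF-8 charset: emit the umlaut as raw UTF-8 bytes.
      printf 'gew\303\244hlten\n' ;;
    de_*)
      # German with any other charset: ASCII transliteration.
      printf 'gewaehlten\n' ;;
    *)
      # Everything else: English text.
      printf 'selected\n' ;;
  esac
}
```

It could be called as describe_complement "${LC_ALL:-${LC_CTYPE:-$LANG}}",
which follows the usual precedence of the locale variables.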






Bug#451382: i18n is NOT so easy!

2007-12-12 Thread Dr. Markus Waldeck
> You can use $'\u00e4' which wouldn't be quite so ugly.

This solution is implemented in the package 4.3.4-dev-4-1 and does not work!

With LANG=de_DE.UTF-8 I have to use $'\\u00e4' on the command line,
but I have not found a solution that works in a completion file.

PS: I tested my suggested solution (gew$(print \\u00e4)hlten)!




Bug#451382: i18n is NOT so easy!

2007-12-09 Thread Peter Stephenson
On Fri, 07 Dec 2007 17:26:57 +
Peter Stephenson [EMAIL PROTECTED] wrote:
> Clint Adams wrote:
> > On Fri, Dec 07, 2007 at 02:11:41PM +, Peter Stephenson wrote:
> > > Found it: see thread around
> > >
> > > http://www.zsh.org/mla/workers/2006/msg00753.html
> >
> > I think it would be easier to do something like bash's $"..." interface to
> > gettext and co-opt that for completion translations.
>
> As far as I understood it (it doesn't seem to be well documented) that
> only does translations which are pre-compiled into the shell (or rather
> its libraries).  We need something which can be updated with completion
> functions.  It's OK if the definitions are in another file (though we
> could presumably have an interface which adds translations from the
> completion function itself) but it needs to be added at run time.
>
> Possibly we can still do this with $"...", but I don't like the idea
> that if you change the original message you can no longer find the
> translation, which seems to me to be asking for trouble.

Further thoughts after groping through the gettext documentation for a
bit... this is not a definitive answer (though rather closer to one than
when I originally wrote that two hours and counting ago) but unless I post
it now I'll forget it.  A summary is that I believe we can use the
internationalization functions in the library behind gettext(), to avoid
reinventing the wheel and maintain some compatibility, but it'll take a
bit more care to get this right than simply $"msgid" plus
gettext("msgid").

I think we have two basic problems with the simplest $"..." / gettext()
interface.


1.

The problem in the last paragraph quoted.  I'm convinced this is a real
problem:  unlike with C programmes, the urge to tinker with strings in
shell functions is strong and if there's no visual cue that this has bad
side effects then the interface is, in my view, fundamentally broken.
To put it another way, only programmers tinker with C programmes while
users are actively encouraged to tinker with shell functions, so the
whole nature of the interface needs to be rethought to make it clear and
robust rather than minimal.

However, this isn't insuperable.  The msgid is only by convention the
original string and could be anything; it was designed to be simple in
the case of having many calls to gettext() throughout a programme.  As
we essentially have only one point of entry for translations in shell
functions (the shell's C code is a separate and much simpler problem
since this isn't fundamentally different from any other C programme), we
can do it how we like.  We can, for example, have translation strings
like:

$"_mount_nfs_access_acregmin:specify cached file attributes minimum hold time"

and have the following rule:

- If the string is in the form
    identifier_character* : .*
  (we might need to make this more complicated eventually), first
  attempt a lookup with the identifier characters.  If the lookup doesn't
  return the original string, this is the text we want.
- Otherwise look up with the whole string.  This is for compatibility.  Use
  of this in zsh functions would be deprecated.
- If it still returns the original string but there is an identifier
  part, return the string after the :.
- Maybe we want some rule about aliasing, it's not clear (we can leave
  it until a use becomes obvious). 

This scheme has various merits:  (i) it is robust about changes to the
English text (ii) the explicit msgid serves as a visual cue that
there's something here that shouldn't be monkeyed with without good
reason (and that even if you change the English text it should mean the
same thing) (iii) the msgid in the catalogues is compact.
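
The lookup rule can be sketched roughly as follows. This is only an
illustration under stated assumptions: catalog_lookup is a hypothetical
stand-in for the real catalogue access, behaving like gettext() in that
it returns its argument unchanged when there is no translation, and the
German text in it is made up for the example. The aliasing rule is left
out.

```shell
# Hypothetical catalogue: print the translation for key $1, or the key
# itself when there is no entry (the way gettext() behaves).
catalog_lookup() {
  case $1 in
    _mount_nfs_access_acregmin)
      printf '%s\n' 'minimale Haltezeit zwischengespeicherter Dateiattribute angeben' ;;
    *)
      printf '%s\n' "$1" ;;
  esac
}

# Translate a string of the form "identifier:English text".
translate() {
  msg=$1
  id=${msg%%:*}     # everything before the first colon
  text=${msg#*:}    # everything after the first colon
  has_id=no
  case $msg in
    *:*)
      case $id in
        ''|*[!A-Za-z0-9_]*) ;;   # empty or not purely identifier characters
        *) has_id=yes ;;
      esac ;;
  esac

  # Rule 1: look up the identifier part first.
  if [ "$has_id" = yes ]; then
    result=$(catalog_lookup "$id")
    if [ "$result" != "$id" ]; then
      printf '%s\n' "$result"
      return 0
    fi
  fi

  # Rule 2: look up the whole string, for compatibility.
  result=$(catalog_lookup "$msg")
  if [ "$result" != "$msg" ]; then
    printf '%s\n' "$result"
    return 0
  fi

  # Rule 3: no translation; drop the identifier if there was one.
  if [ "$has_id" = yes ]; then
    printf '%s\n' "$text"
  else
    printf '%s\n' "$msg"
  fi
}
```

With this sketch an untranslated "_some_id:some text" comes out as
"some text", while a string such as "thing that failed: reason" is left
alone because the part before the colon is not a plain identifier.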


2.

Unfortunately there's also the problem of finding message catalogues.
For the same reason that it's designed for simplicity with pre-compiled
programmes, gettext() itself appears to require them to be in a
particular hierarchy the top of which is determined at compile time.

This isn't good enough in our case.  We have functions that are
installed at different places in the function path.  The path can change
and the only clean way of finding message catalogues is using the same
path.  We *could* collect all translations at shell installation and
simply shrug our shoulders saying that's your lot, but in my view this
is too botched to consider.  (As far as I can tell this is what happens
in bash.)  It's a key part of the way the completion system works that
people can customize it themselves just by writing functions, and even
if adding translations to your own functions is unusual I still don't
think being limited to a predefined set is acceptable.  I don't mind
users (which includes administrators) having to run some utility to add,
or add to, a message catalogue, but I do mind them having to modify the
shell configuration and reinstall; even updating the shell libraries
with something like one of Clint's out-of-tree modules seems a bit over
the top.

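However, it seems like we can get something better by interfacing to
the library at a lower level, in particular to catopen() (strictly
this is a different family of interfaces).  That accepts an absolute
path to a catalogue and also uses the environment variable NLSPATH to
search for files.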

Bug#451382: i18n is NOT so easy!

2007-12-09 Thread Bart Schaefer
On Dec 9,  6:01pm, Peter Stephenson wrote:
} Subject: Re: Bug#451382: i18n is NOT so easy!
}
} This scheme has various merits: (i) it is robust about changes to
} the English text (ii) the explicit msgid serves as a visual cue that
} there's something here that shouldn't be monkeyed with without good
} reason (and that even if you change the English text it should mean
} the same thing) (iii) the msgid in the catalogues is compact.

This is close to the same scheme that I [*] adopted for localization
of zmail twelve years ago.  Except that we used a two-argument C
macro with the msgid and English text, rather than a delimited string.

We also had a number of tools that massaged the C source to add any new
msgid where a programmer had forgotten to use one, and to extract and
build the default English catalog file which could then be turned over
to translators.  It'd be pretty easy, I expect, to write a perl script
to find $"..." strings in shell scripts and extract them.
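
As a rough idea of what such an extraction could look like, here is a
sketch in shell rather than perl. The function name extract_msgs is
made up, and the pattern is deliberately naive: it assumes each $"..."
string sits on one line and contains no escaped double quotes. It relies
on grep -o, which GNU and BSD grep provide but POSIX does not require.

```shell
# Print every distinct $"..." string found in the given files,
# one per line, without the surrounding $" and ".
extract_msgs() {
  cat -- "$@" |
    grep -o '\$"[^"]*"' |
    sed -e 's/^\$"//' -e 's/"$//' |
    sort -u
}
```

The output is a sorted, de-duplicated list that could serve as the
starting point for a default English catalogue.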

I'd be cautious about treating everything up to the first colon in a
$"..." string as a msgid key, though.  Error messages are going to
look like $"thing that failed: reason it failed" a lot of the time.  Or
would that have to be written thing that failed: $"reason it failed"
for this to work in the first place?  Anyway, it might be better to
adopt something like ${msgid}original text and treat both ${message}
and $message the same when only one of the two parts is found.

An additional issue that zsh may or may not have to address is that
you need entirely separate strings for things like plurals.  You can't
localize something like:

There %s %d thing%s in the bucket

where the %s get replaced by "are" and "s" when the %d is not 1, and
"is" and "" otherwise.  You must instead have two strings (sometimes
three for the zero case):

There are %d things in the bucket
There is 1 thing in the bucket
There is nothing in the bucket

There are gobs of other niggling details that I'm sure I've forgotten.
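
A small sketch of the separate-strings approach for the bucket example,
using a made-up function name bucket_message. Each case is a complete
sentence that a translator can handle on its own, instead of a sentence
glued together from fragments. Real plural rules differ between
languages, so the three-way split here only mirrors the English example.

```shell
# Print the bucket message for the count given as $1, using one
# complete string per case rather than assembling fragments.
bucket_message() {
  case $1 in
    0) printf 'There is nothing in the bucket\n' ;;
    1) printf 'There is 1 thing in the bucket\n' ;;
    *) printf 'There are %d things in the bucket\n' "$1" ;;
  esac
}
```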

} However, it seems like we can get something better by interfacing to
} the library at a lower level, in particular to catopen() (strictly
} this is a different family of interfaces). That accepts an absolute
} path to a catalogue and also uses the environment variable NLSPATH to
} search for files.

This is also what I did back then in zmail -- gettext() didn't really
even exist yet at that point, at least not in a fully-developed form.
The POSIX cat*() interfaces work just fine, though NLSPATH searching
has some pretty nasty bugs on older operating systems.
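
To illustrate the NLSPATH search, here is a sketch that lists the file
names a catopen()-style lookup would try. It is only an approximation:
it expands just the %N (catalogue name) and %L (locale name) fields of
each template, ignores the other substitution fields, and assumes the
name and locale contain no characters that are special to sed. The
function name nls_candidates and the paths in the usage note are made up.

```shell
# Print one candidate path per NLSPATH template.
#   $1  catalogue name (substituted for %N)
#   $2  locale name    (substituted for %L)
#   $3  colon-separated list of templates, as in NLSPATH
nls_candidates() {
  name=$1
  locale=$2
  templates=$3
  old_ifs=$IFS
  IFS=:
  set -f    # no filename generation while splitting the list
  for template in $templates; do
    printf '%s\n' "$template" |
      sed -e "s|%N|$name|g" -e "s|%L|$locale|g"
  done
  set +f
  IFS=$old_ifs
}
```

For example, nls_candidates zsh_cut de_DE '/usr/share/locale/%L/%N.cat:/home/u/nls/%N.%L'
prints /usr/share/locale/de_DE/zsh_cut.cat and /home/u/nls/zsh_cut.de_DE.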

[*] That's sort of the royal "I" as actually there was a whole team
of people working for me on it.




Bug#451382: i18n is NOT so easy!

2007-12-09 Thread Peter Stephenson
Bart Schaefer wrote:
> I'd be cautious about treating everything up to the first colon in a
> $"..." string as a msgid key, though.  Error messages are going to
> look like $"thing that failed: reason it failed" a lot of the time.  Or
> would that have to be written thing that failed: $"reason it failed"
> for this to work in the first place?  Anyway, it might be better to
> adopt something like ${msgid}original text and treat both ${message}
> and $message the same when only one of the two parts is found.

Fine, that gives us an easier test for whether there's a special msgid
anyway.  (I would still propose that msgid has to consist only of shell
identifier characters for simplicity.)
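
The easier test could look something like the sketch below. It assumes
the message text starts with the msgid in braces, as in
"{msgid}original text"; the exact spelling was not settled in this
thread, and the name split_msg is made up. A brace group that contains
anything other than identifier characters is treated as ordinary text.

```shell
# Split a message of the assumed form "{msgid}original text".
# Prints the msgid on the first line (empty if there is none)
# and the remaining text on the second line.
split_msg() {
  msg=$1
  close='}'
  id=
  text=$msg
  case $msg in
    '{'*"$close"*)
      candidate=${msg#?}                  # drop the leading "{"
      candidate=${candidate%%"$close"*}   # keep what precedes the first "}"
      case $candidate in
        ''|*[!A-Za-z0-9_]*) ;;            # not a plain identifier: no msgid
        *)
          id=$candidate
          text=${msg#*"$close"} ;;
      esac ;;
  esac
  printf '%s\n' "$id" "$text"
}
```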

-- 
Peter Stephenson [EMAIL PROTECTED]
Web page now at http://homepage.ntlworld.com/p.w.stephenson/






Bug#451382: i18n is NOT so easy!

2007-12-07 Thread Peter Stephenson
On Thu, 6 Dec 2007 11:10:22 -0500
Clint Adams [EMAIL PROTECTED] wrote:
> On Thu, Dec 06, 2007 at 06:08:55PM +0200, Ismail Dönmez wrote:
> > "nothing" as in it wouldn't be useful? Imho it would be useful for warning
> > messages like "do you want to delete all files" etc.
>
> Certainly it would be useful.

We need a completion framework for translation which might or might not use
gettext in the back end.  There were some discussions about this a while
ago, I think about the time we first got the line editor to use multibyte
characters, but I can't find them now.  This is a big project somebody will
have to volunteer for.

-- 
Peter Stephenson [EMAIL PROTECTED]  Software Engineer
CSR PLC, Churchill House, Cambridge Business Park, Cowley Road
Cambridge, CB4 0WZ, UK  Tel: +44 (0)1223 692070




Bug#451382: i18n is NOT so easy!

2007-12-07 Thread Oliver Kiddle
Clint Adams wrote:
> On Thu, Dec 06, 2007 at 05:56:12PM +0100, Dr. Markus Waldeck wrote:
> > Please correct _cut as mentioned in my mail from Fri, 16 Nov 2007 10:16:44 +0100.
>
> Okay, but it is certainly ugly.

You can use $'\u00e4' which wouldn't be quite so ugly.

Oliver






Bug#451382: i18n is NOT so easy!

2007-12-07 Thread Peter Stephenson
Peter Stephenson wrote:
> On Thu, 6 Dec 2007 11:10:22 -0500
> Clint Adams [EMAIL PROTECTED] wrote:
> We need a completion framework for translation which might or might not use
> gettext in the back end.  There were some discussions about this a while
> ago, I think about the time we first got the line editor to use multibyte
> characters, but I can't find them now.

Found it: see thread around

http://www.zsh.org/mla/workers/2006/msg00753.html

-- 
Peter Stephenson [EMAIL PROTECTED]  Software Engineer
CSR PLC, Churchill House, Cambridge Business Park, Cowley Road
Cambridge, CB4 0WZ, UK  Tel: +44 (0)1223 692070






Bug#451382: i18n is NOT so easy!

2007-12-07 Thread Peter Stephenson
Clint Adams wrote:
> On Fri, Dec 07, 2007 at 02:11:41PM +, Peter Stephenson wrote:
> > Found it: see thread around
> >
> > http://www.zsh.org/mla/workers/2006/msg00753.html
>
> I think it would be easier to do something like bash's $"..." interface to
> gettext and co-opt that for completion translations.

As far as I understood it (it doesn't seem to be well documented) that
only does translations which are pre-compiled into the shell (or rather
its libraries).  We need something which can be updated with completion
functions.  It's OK if the definitions are in another file (though we
could presumably have an interface which adds translations from the
completion function itself) but it needs to be added at run time.

Possibly we can still do this with $"...", but I don't like the idea
that if you change the original message you can no longer find the
translation, which seems to me to be asking for trouble.

-- 
Peter Stephenson [EMAIL PROTECTED]  Software Engineer
CSR PLC, Churchill House, Cambridge Business Park, Cowley Road
Cambridge, CB4 0WZ, UK  Tel: +44 (0)1223 692070






Bug#451382: i18n is NOT so easy!

2007-12-07 Thread Clint Adams
On Fri, Dec 07, 2007 at 02:11:41PM +, Peter Stephenson wrote:
> Found it: see thread around
>
> http://www.zsh.org/mla/workers/2006/msg00753.html

I think it would be easier to do something like bash's $"..." interface to
gettext and co-opt that for completion translations.






Bug#451382: i18n is NOT so easy!

2007-12-06 Thread Clint Adams
On Wed, Dec 05, 2007 at 09:08:25PM +0100, Dr. Markus Waldeck wrote:
> I am waiting for a useful answer which is necessary for further contributions!

I don't think I have one.  Nothing in zsh is particularly suited to
gettext or translations.






Bug#451382: i18n is NOT so easy!

2007-12-06 Thread Ismail Dönmez
On Thursday 06 December 2007 17:54:36 Clint Adams wrote:
> On Wed, Dec 05, 2007 at 09:08:25PM +0100, Dr. Markus Waldeck wrote:
> > I am waiting for a useful answer which is necessary for further
> > contributions!
>
> I don't think I have one.  Nothing in zsh is particularly suited to
> gettext or translations.

"nothing" as in it wouldn't be useful? IMHO it would be useful for warning
messages like "do you want to delete all files" etc.

Regards,
ismail

-- 
Never learn by your mistakes, if you do you may never dare to try again.




Bug#451382: i18n is NOT so easy!

2007-12-06 Thread Clint Adams
On Thu, Dec 06, 2007 at 06:08:55PM +0200, Ismail Dönmez wrote:
> "nothing" as in it wouldn't be useful? Imho it would be useful for warning
> messages like "do you want to delete all files" etc.

Certainly it would be useful.




Bug#451382: i18n is NOT so easy!

2007-12-06 Thread Dr. Markus Waldeck
> I don't think I have one.  Nothing in zsh is particularly suited to
> gettext or translations.

Please correct _cut as mentioned in my mail from Fri, 16 Nov 2007 10:16:44 +0100.






Bug#451382: i18n is NOT so easy!

2007-12-06 Thread Clint Adams
On Thu, Dec 06, 2007 at 05:56:12PM +0100, Dr. Markus Waldeck wrote:
> Please correct _cut as mentioned in my mail from Fri, 16 Nov 2007 10:16:44 +0100.

Okay, but it is certainly ugly.

Index: Completion/Unix/Command/_cut
===
RCS file: /cvsroot/zsh/zsh/Completion/Unix/Command/_cut,v
retrieving revision 1.2
diff -u -r1.2 _cut
--- Completion/Unix/Command/_cut    31 Oct 2007 00:35:37 -  1.2
+++ Completion/Unix/Command/_cut    6 Dec 2007 19:04:17 -
@@ -11,7 +11,7 @@
  delimiter   Delimiter anstelle von Tabulator als Trenner benutzen
  fields      nur diese Felder und alle Zeilen OHNE Trennzeichen ausgeben
  n           (ignoriert)
- complement  das Komplement der Menge der gewählten Bytes, Zeichen oder Felder bilden
+ complement  das Komplement der Menge der gew$(print \\u00e4)hlten Bytes, Zeichen oder Felder bilden
  only-delimited  keine Zeilen ausgeben, die keinen Trenner enthalten
  output-delimiter Zeichenkette als Ausgabetrennzeichen benutzen
  help        diese Hilfe anzeigen und beenden




Bug#451382: i18n is NOT so easy!

2007-12-05 Thread Dr. Markus Waldeck

I am waiting for a useful answer which is necessary for further contributions!

Dr. Markus Waldeck





Bug#451382: i18n is NOT so easy!

2007-11-16 Thread Dr. Markus Waldeck

Hi,

The problem is caused by the incompatible encoding of the ae-umlaut!

I found an ugly way to print a Unicode character from a file that is
not Unicode-encoded:

diff _cut _cut.patched
14c14
<   complement  das Komplement der Menge der gewählten Bytes, Zeichen oder Felder bilden
---
>   complement  das Komplement der Menge der gew$(echo \\u00e4)hlten Bytes, Zeichen oder Felder bilden

Thanks!

Markus




Bug#451382: i18n is NOT so easy!

2007-11-15 Thread Dr. Markus Waldeck

Package: zsh
Version: 4.3.4-26
Severity: normal

I noticed the following problems with _cut:

1. The ISO-8859 encoding mangles UTF-8 encoded umlauts (gew\M-dhlten).
2. The descriptions are not displayed correctly if ISO-8859 is used.
   There is no problem if UTF-8 is used.

% echo $LANG
de_DE.UTF-8

% file /usr/share/zsh/4.3.4/functions/Completion/Unix/_cut
/usr/share/zsh/4.3.4/functions/Completion/Unix/_cut: ISO-8859 English text

% cut -
--bytes 
--characters
--complement-s
--delimiter 
--fields
--help  -- nur diese Bytes ausgeben 
  
-n  -- nur diese Zeichen ausgeben   
  
--only-delimited-- das Komplement der Menge der gew\M-dhlten Bytes,
Zeichen oder Felder bilden   
--output-delimiter  -- Delimiter anstelle von Tabulator als Trenner
benutzen  
--version   -- nur diese Felder und alle Zeilen OHNE
Trennzeichen ausgeben
-b  -- diese Hilfe anzeigen und beenden 
  
-c  -- (ignoriert)  
  
-- keine Zeilen ausgeben, die keinen Trenner
enthalten
-d  -- Zeichenkette als
Ausgabetrennzeichen benutzen
  
-f  -- Versionsinformation anzeigen
und beenden 
  


% file /usr/share/zsh/4.3.4/functions/Completion/Unix/_cut
/usr/share/zsh/4.3.4/functions/Completion/Unix/_cut: UTF-8 Unicode English text

% cut -
--bytes             -b  -- nur diese Bytes ausgeben
--characters        -c  -- nur diese Zeichen ausgeben
--complement            -- das Komplement der Menge der gewählten Bytes, Zeichen oder Felder bilden
--delimiter         -d  -- Delimiter anstelle von Tabulator als Trenner benutzen
--fields            -f  -- nur diese Felder und alle Zeilen OHNE Trennzeichen ausgeben
--help                  -- diese Hilfe anzeigen und beenden
-n                      -- (ignoriert)
--only-delimited    -s  -- keine Zeilen ausgeben, die keinen Trenner enthalten
--output-delimiter      -- Zeichenkette als Ausgabetrennzeichen benutzen
--version               -- Versionsinformation anzeigen und beenden
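
The difference between the two runs comes down to the encoding of the
_cut file. A sketch of how one might check and convert such a file with
iconv, assuming the "ISO-8859" that file(1) reports is actually
ISO-8859-1; both function names are made up for the example:

```shell
# Succeeds (exit status 0) if the file named by $1 is valid UTF-8.
is_valid_utf8() {
  iconv -f UTF-8 -t UTF-8 "$1" >/dev/null 2>&1
}

# Write a UTF-8 copy of the ISO-8859-1 file $1 to the new file $2.
latin1_to_utf8() {
  iconv -f ISO-8859-1 -t UTF-8 "$1" > "$2"
}
```

A file containing the single byte 0xE4 for the umlaut fails the first
check; after conversion the umlaut is the two bytes 0xC3 0xA4 and the
check passes.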
