Bug#451382: i18n is NOT so easy!
> Are you sure that this hasn't been fixed? It should only be giving you
> German if the charset is UTF-8, and then it is using $'\u00e4'.

LANG=de_DE.UTF-8
XTERM_LOCALE=de_DE.UTF-8

% zsh --version
zsh 4.3.10 (i686-pc-linux-gnu)
% cut --
...
--complement -- das Komplement der Menge der gew$'u00e4'hlten Bytes, Zeichen oder Felder bilden
...

The constructs $'\u00e4' and $'\\u00e4' are working in the command line

% echo $'\u00e4'
ä

but obviously NOT in the completion file as already mentioned 12 Dec 2007.

--
To UNSUBSCRIBE, email to debian-bugs-dist-requ...@lists.debian.org with a subject of unsubscribe. Trouble? Contact listmas...@lists.debian.org
Bug#451382: i18n is NOT so easy!
Hi,

#451382 is still open. Please implement one of the following solutions:

1. use gew$(echo \\u00e4)hlten
2. use gewaehlten
3. remove the German translation at all!

Thanks
Markus
Bug#451382: i18n is NOT so easy!
On Sat, Jan 23, 2010 at 01:07:16PM +0100, Dr. Markus Waldeck wrote:
> #451382 is still open. Please implement one of the following solutions:
>
> 1. use gew$(echo \\u00e4)hlten
> 2. use gewaehlten
> 3. remove the German translation at all!

Are you sure that this hasn't been fixed? It should only be giving you German if the charset is UTF-8, and then it is using $'\u00e4'.
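The behaviour described here (the umlaut only when the charset is UTF-8) can be sketched in portable shell. This is an illustration only, not the code shipped in _cut: the function name umlaut_word is hypothetical, and the ASCII fallback "gewaehlten" is option 2 from the quoted message, not something the package is known to do.

```shell
# Hypothetical helper: print the German word with a real umlaut only
# when the given charset is UTF-8, otherwise fall back to plain ASCII.
# $1 is a charset name such as the output of `locale charmap`.
umlaut_word() {
    case $1 in
        UTF-8|utf-8|UTF8|utf8)
            # \303\244 are the two UTF-8 bytes of U+00E4
            printf 'gew\303\244hlten\n' ;;
        *)
            printf 'gewaehlten\n' ;;
    esac
}

# Use the charset of the current locale; an empty result selects the fallback.
umlaut_word "$(locale charmap 2>/dev/null)"
```

Spelling out the bytes with octal escapes avoids depending on \u support in the shell that reads the file, which is the part this thread reports as unreliable.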
Bug#451382: i18n is NOT so easy!
> You can use $'\u00e4' which wouldn't be quite so ugly.

This solution is implemented in the package 4.3.4-dev-4-1 and does not work!

With LANG=de_DE.UTF-8 I have to use $'\\u00e4' on the command line, but I did not find a solution that works in a completion file.

PS: I tested my suggested solution (gew$(print \\u00e4)hlten)!
Bug#451382: i18n is NOT so easy!
On Fri, 07 Dec 2007 17:26:57 + Peter Stephenson [EMAIL PROTECTED] wrote:
> Clint Adams wrote:
> > On Fri, Dec 07, 2007 at 02:11:41PM +, Peter Stephenson wrote:
> > > Found it: see thread around
> > > http://www.zsh.org/mla/workers/2006/msg00753.html
> >
> > I think it would be easier to do something like bash's $"..." interface
> > to gettext and co-opt that for completion translations.
>
> As far as I understood it (it doesn't seem to be well documented) that
> only does translations which are pre-compiled into the shell (or rather
> its libraries). We need something which can be updated with completion
> functions. It's OK if the definitions are in another file (though we
> could presumably have an interface which adds translations from the
> completion function itself) but it needs to be added at run time.
>
> Possibly we can still do this with $"...", but I don't like the idea that
> if you change the original message you can no longer find the
> translation, which seems to me to be asking for trouble.

Further thoughts after groping through the gettext documentation for a bit... this is not a definitive answer (though rather closer to one than when I originally wrote that two hours and counting ago) but unless I post it now I'll forget it.

A summary is that I believe we can use the internationalization functions in the library behind gettext(), to avoid reinventing the wheel and maintain some compatibility, but it'll take a bit more care to get this right than simply $"msgid" plus gettext(msgid).

I think we have two basic problems with the simplest $"..." / gettext() interface.

1. The problem in the last paragraph quoted. I'm convinced this is a real problem: unlike with C programmes, the urge to tinker with strings in shell functions is strong and if there's no visual cue that this has bad side effects then the interface is, in my view, fundamentally broken.
To put it another way, only programmers tinker with C programmes while users are actively encouraged to tinker with shell functions, so the whole nature of the interface needs to be rethought to make it clear and robust rather than minimal.

However, this isn't insuperable. The msgid is only by convention the original string and could be anything; it was designed to be simple in the case of having many calls to gettext() throughout a programme. As we essentially have only one point of entry for translations in shell functions (the shell's C code is a separate and much simpler problem since this isn't fundamentally different from any other C programme), we can do it how we like. We can, for example, have translation strings like:

  $"_mount_nfs_access_acregmin:specify cached file attributes minimum hold time"

and have the following rule:

- If the string is in the form identifier_character* : .* (we might need to make this more complicated eventually), first attempt look-up with the identifier characters. If the lookup doesn't return the original string, this is the text we want.
- Otherwise look up with the whole string. This is for compatibility. Use of this in zsh functions would be deprecated.
- If it still returns the original string but there is an identifier part, return the string after the ":".
- Maybe we want some rule about aliasing, it's not clear (we can leave it until a use becomes obvious).

This scheme has various merits: (i) it is robust about changes to the English text (ii) the explicit msgid serves as a visual cue that there's something here that shouldn't be monkeyed with without good reason (and that even if you change the English text it should mean the same thing) (iii) the msgid in the catalogues is compact.

2. Unfortunately there's also the problem of finding message catalogues.
For the same reason that it's designed for simplicity with pre-compiled programmes, gettext() itself appears to require them to be in a particular hierarchy the top of which is determined at compile time. This isn't good enough in our case. We have functions that are installed at different places in the function path. The path can change and the only clean way of finding message catalogues is using the same path.

We *could* collect all translations at shell installation and simply shrug our shoulders saying "that's your lot", but in my view this is too botched to consider. (As far as I can tell this is what happens in bash.) It's a key part of the way the completion system works that people can customize it themselves just by writing functions, and even if adding translations to your own functions is unusual I still don't think being limited to a predefined set is acceptable.

I don't mind users (which includes administrators) having to run some utility to add, or add to, a message catalogue, but I do mind them having to modify the shell configuration and reinstall; even updating the shell libraries with something like one of Clint's out-of-tree modules seems a bit over the top.

However, it seems like we can get something better
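A minimal sketch of the lookup rule listed under point 1 of this message, in plain shell and not tied to any real catalogue library. catalogue_lookup is a hypothetical stand-in for whatever gettext()-style back end would be used, and it signals "no translation" through its exit status instead of returning the original string as gettext() does. The msgid _cut_help is made up; the German text is the help description from the _cut completion quoted elsewhere in this thread.

```shell
# Hypothetical one-entry catalogue: print the translation for a key,
# or return non-zero when the key is unknown.
catalogue_lookup() {
    case $1 in
        _cut_help) printf '%s\n' 'diese Hilfe anzeigen und beenden' ;;
        *) return 1 ;;
    esac
}

# Apply the proposed rule to a string of the form "msgid:English text".
translate() {
    msg=$1
    id=
    text=$msg
    case $msg in
        *:*)
            candidate=${msg%%:*}
            case $candidate in
                ''|*[!A-Za-z0-9_]*) ;;                   # not a pure identifier
                *) id=$candidate; text=${msg#*:} ;;
            esac ;;
    esac
    # 1. Try the identifier part first.
    if [ -n "$id" ] && catalogue_lookup "$id"; then
        return 0
    fi
    # 2. Compatibility: try the whole string as the key.
    if catalogue_lookup "$msg"; then
        return 0
    fi
    # 3. Nothing found: print the text after the colon, or the whole
    #    string when there was no identifier part.
    printf '%s\n' "$text"
}

translate '_cut_help:display this help and exit'
```

Because the key is the identifier and not the English text, rewording the text after the colon leaves the translation reachable, which is merit (i) in the list above.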
Bug#451382: i18n is NOT so easy!
On Dec 9, 6:01pm, Peter Stephenson wrote:
} Subject: Re: Bug#451382: i18n is NOT so easy!
}
} This scheme has various merits: (i) it is robust about changes to
} the English text (ii) the explicit msgid serves as a visual cue that
} there's something here that shouldn't be monkeyed with without good
} reason (and that even if you change the English text it should mean
} the same thing) (iii) the msgid in the catalogues is compact.

This is close to the same scheme that I [*] adopted for localization of zmail twelve years ago. Except that we used a two-argument C macro with the msgid and English text, rather than a delimited string.

We also had a number of tools that massaged the C source to add any new msgid where a programmer had forgotten to use one, and to extract and build the default English catalog file which could then be turned over to translators. It'd be pretty easy, I expect, to write a perl script to find $"..." strings in shell scripts and extract them.

I'd be cautious about treating everything up to the first colon in a $"..." string as a msgid key, though. Error messages are going to look like $"thing that failed: reason it failed" a lot of the time. Or would that have to be written "thing that failed: "$"reason it failed" for this to work in the first place?

Anyway, it might be better to adopt something like $"{msgid}original text" and treat both $"{message}" and $"message" the same when only one of the two parts is found.

An additional issue that zsh may or may not have to address is that you need entirely separate strings for things like plurals. You can't localize something like:

  "There %s %d thing%s in the bucket"

where the %s get replaced by "are" and "s" when the %d is not 1, and "is" and "" otherwise. You must instead have two strings (sometimes three for the zero case):

  "There are %d things in the bucket"
  "There is 1 thing in the bucket"
  "There is nothing in the bucket"

There are gobs of other niggling details that I'm sure I've forgotten.
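The plural point can be illustrated with a short shell sketch. The function name bucket_message is hypothetical; the three strings are the ones given above, each of which would be a separate catalogue entry for translators.

```shell
# Pick a complete sentence per case instead of assembling one from
# fragments such as "is"/"are" and "s".
bucket_message() {
    case $1 in
        0) printf '%s\n' 'There is nothing in the bucket' ;;
        1) printf '%s\n' 'There is 1 thing in the bucket' ;;
        *) printf 'There are %d things in the bucket\n' "$1" ;;
    esac
}

bucket_message 0
bucket_message 1
bucket_message 3
```

The three-way split shown here is specific to English; other languages may need a different number of forms.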
} However, it seems like we can get something better by interfacing to
} the library at a lower level, in particular to catopen() (strictly
} this is a different family of interfaces). That accepts an absolute
} path to a catalogue and also uses the environment variable NLSPATH to
} search for files.

This is also what I did back then in zmail -- gettext() didn't really even exist yet at that point, at least not in a fully-developed form. The POSIX cat*() interfaces work just fine, though NLSPATH searching has some pretty nasty bugs on older operating systems.

[*] That's sort of the royal "I" as actually there was a whole team of people working for me on it.
Bug#451382: i18n is NOT so easy!
Bart Schaefer wrote:
> I'd be cautious about treating everything up to the first colon in a
> $"..." string as a msgid key, though. Error messages are going to look
> like $"thing that failed: reason it failed" a lot of the time. Or would
> that have to be written "thing that failed: "$"reason it failed" for
> this to work in the first place?
>
> Anyway, it might be better to adopt something like $"{msgid}original
> text" and treat both $"{message}" and $"message" the same when only one
> of the two parts is found.

Fine, that gives us an easier test for whether there's a special msgid anyway. (I would still propose that msgid has to consist only of shell identifier characters for simplicity.)

--
Peter Stephenson [EMAIL PROTECTED]
Web page now at http://homepage.ntlworld.com/p.w.stephenson/
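A rough sketch of the "easier test" for the brace-delimited form, under the assumption that the contents of the $"..." string arrive as a plain argument. The function name split_message and the variables msgid and text are hypothetical. The identifier-only restriction follows the proposal above, and a string with only one part is used as both key and fallback text.

```shell
# Split "{msgid}original text" into $msgid and $text.
split_message() {
    open='{'
    close='}'
    msgid=
    text=$1
    case $1 in
        "$open"*"$close"*)
            rest=${1#"$open"}
            candidate=${rest%%"$close"*}
            case $candidate in
                ''|*[!A-Za-z0-9_]*) ;;                  # not an identifier: no msgid
                *) msgid=$candidate; text=${rest#*"$close"} ;;
            esac ;;
    esac
    # Only one of the two parts present: treat it as both key and text.
    if [ -z "$msgid" ]; then
        msgid=$text
    elif [ -z "$text" ]; then
        text=$msgid
    fi
}

split_message '{_cut_help}display this help and exit'
printf 'key=%s text=%s\n' "$msgid" "$text"
```

With this shape an error message that happens to contain a colon is never mistaken for a keyed string, which was the concern raised in the quoted text.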
Bug#451382: i18n is NOT so easy!
On Thu, 6 Dec 2007 11:10:22 -0500 Clint Adams [EMAIL PROTECTED] wrote:
> On Thu, Dec 06, 2007 at 06:08:55PM +0200, Ismail Dönmez wrote:
> > nothing as in it wouldn't be useful? Imho it would be useful for
> > warning messages like "do you want to delete all files" etc.
>
> Certainly it would be useful.

We need a completion framework for translation which might or might not use gettext in the back end. There were some discussions about this a while ago, I think about the time we first got the line editor to use multibyte characters, but I can't find them now.

This is a big project somebody will have to volunteer for.

--
Peter Stephenson [EMAIL PROTECTED]
Software Engineer
CSR PLC, Churchill House, Cambridge Business Park, Cowley Road
Cambridge, CB4 0WZ, UK
Tel: +44 (0)1223 692070
Bug#451382: i18n is NOT so easy!
Clint Adams wrote:
> On Thu, Dec 06, 2007 at 05:56:12PM +0100, Dr. Markus Waldeck wrote:
> > Please correct _cut as mentioned in my mail from Fri, 16 Nov 2007 10:16:44 +0100.
>
> Okay, but it is certainly ugly.

You can use $'\u00e4' which wouldn't be quite so ugly.

Oliver
Bug#451382: i18n is NOT so easy!
Peter Stephenson wrote:
> On Thu, 6 Dec 2007 11:10:22 -0500 Clint Adams [EMAIL PROTECTED] wrote:
> We need a completion framework for translation which might or might not
> use gettext in the back end. There were some discussions about this a
> while ago, I think about the time we first got the line editor to use
> multibyte characters, but I can't find them now.

Found it: see thread around
http://www.zsh.org/mla/workers/2006/msg00753.html

--
Peter Stephenson [EMAIL PROTECTED]
Bug#451382: i18n is NOT so easy!
Clint Adams wrote:
> On Fri, Dec 07, 2007 at 02:11:41PM +, Peter Stephenson wrote:
> > Found it: see thread around
> > http://www.zsh.org/mla/workers/2006/msg00753.html
>
> I think it would be easier to do something like bash's $"..." interface
> to gettext and co-opt that for completion translations.

As far as I understood it (it doesn't seem to be well documented) that only does translations which are pre-compiled into the shell (or rather its libraries). We need something which can be updated with completion functions. It's OK if the definitions are in another file (though we could presumably have an interface which adds translations from the completion function itself) but it needs to be added at run time.

Possibly we can still do this with $"...", but I don't like the idea that if you change the original message you can no longer find the translation, which seems to me to be asking for trouble.

--
Peter Stephenson [EMAIL PROTECTED]
Bug#451382: i18n is NOT so easy!
On Fri, Dec 07, 2007 at 02:11:41PM +, Peter Stephenson wrote:
> Found it: see thread around
> http://www.zsh.org/mla/workers/2006/msg00753.html

I think it would be easier to do something like bash's $"..." interface to gettext and co-opt that for completion translations.
Bug#451382: i18n is NOT so easy!
On Wed, Dec 05, 2007 at 09:08:25PM +0100, Dr. Markus Waldeck wrote:
> I am waiting for a useful answer which is necessary for further contributions!

I don't think I have one. Nothing in zsh is particularly suited to gettext or translations.
Bug#451382: i18n is NOT so easy!
On Thursday 06 December 2007 17:54:36, Clint Adams wrote:
> On Wed, Dec 05, 2007 at 09:08:25PM +0100, Dr. Markus Waldeck wrote:
> > I am waiting for a useful answer which is necessary for further contributions!
>
> I don't think I have one. Nothing in zsh is particularly suited to
> gettext or translations.

"Nothing" as in it wouldn't be useful? IMHO it would be useful for warning messages like "do you want to delete all files" etc.

Regards,
ismail

--
Never learn by your mistakes, if you do you may never dare to try again.
Bug#451382: i18n is NOT so easy!
On Thu, Dec 06, 2007 at 06:08:55PM +0200, Ismail Dönmez wrote:
> nothing as in it wouldn't be useful? Imho it would be useful for
> warning messages like "do you want to delete all files" etc.

Certainly it would be useful.
Bug#451382: i18n is NOT so easy!
> I don't think I have one. Nothing in zsh is particularly suited to
> gettext or translations.

Please correct _cut as mentioned in my mail from Fri, 16 Nov 2007 10:16:44 +0100.
Bug#451382: i18n is NOT so easy!
On Thu, Dec 06, 2007 at 05:56:12PM +0100, Dr. Markus Waldeck wrote:
> Please correct _cut as mentioned in my mail from Fri, 16 Nov 2007 10:16:44 +0100.

Okay, but it is certainly ugly.

Index: Completion/Unix/Command/_cut
===
RCS file: /cvsroot/zsh/zsh/Completion/Unix/Command/_cut,v
retrieving revision 1.2
diff -u -r1.2 _cut
--- Completion/Unix/Command/_cut 31 Oct 2007 00:35:37 - 1.2
+++ Completion/Unix/Command/_cut 6 Dec 2007 19:04:17 -
@@ -11,7 +11,7 @@
 delimiter Delimiter anstelle von Tabulator als Trenner benutzen
 fields nur diese Felder und alle Zeilen OHNE Trennzeichen ausgeben
 n (ignoriert)
-complement das Komplement der Menge der gewählten Bytes, Zeichen oder Felder bilden
+complement das Komplement der Menge der gew$(print \\u00e4)hlten Bytes, Zeichen oder Felder bilden
 only-delimited keine Zeilen ausgeben, die keinen Trenner enthalten
 output-delimiter Zeichenkette als Ausgabetrennzeichen benutzen
 help diese Hilfe anzeigen und beenden
Bug#451382: i18n is NOT so easy!
I am waiting for a useful answer which is necessary for further contributions!

Dr. Markus Waldeck
Bug#451382: i18n is NOT so easy!
Hi,

the problem is caused by the incompatible encoding of the ae-Umlaut!

I found an ugly way to print a Unicode character from a non-Unicode encoded file:

% diff _cut _cut.patched
14c14
< complement das Komplement der Menge der gewählten Bytes, Zeichen oder Felder bilden
---
> complement das Komplement der Menge der gew$(echo \\u00e4)hlten Bytes, Zeichen oder Felder bilden

Thanks!
Markus
Bug#451382: i18n is NOT so easy!
Package: zsh
Version: 4.3.4-26
Severity: normal

I noticed the following problems with _cut:

1. The ISO-8859 encoding mangles UTF-8 encoded umlauts (gew\M-dhlten).
2. The descriptions are not displayed correctly if ISO-8859 is used.

There is no problem if UTF-8 is used.

% echo $LANG
de_DE.UTF-8

% file /usr/share/zsh/4.3.4/functions/Completion/Unix/_cut
/usr/share/zsh/4.3.4/functions/Completion/Unix/_cut: ISO-8859 English text

% cut -
--bytes --characters --complement-s --delimiter --fields --help -- nur diese Bytes ausgeben -n -- nur diese Zeichen ausgeben --only-delimited-- das Komplement der Menge der gew\M-dhlten Bytes, Zeichen oder Felder bilden --output-delimiter -- Delimiter anstelle von Tabulator als Trenner benutzen --version -- nur diese Felder und alle Zeilen OHNE Trennzeichen ausgeben -b -- diese Hilfe anzeigen und beenden -c -- (ignoriert) -- keine Zeilen ausgeben, die keinen Trenner enthalten -d -- Zeichenkette als Ausgabetrennzeichen benutzen -f -- Versionsinformation anzeigen und beenden

% file /usr/share/zsh/4.3.4/functions/Completion/Unix/_cut
/usr/share/zsh/4.3.4/functions/Completion/Unix/_cut: UTF-8 Unicode English text

% cut -
--bytes             -b  -- nur diese Bytes ausgeben
--characters        -c  -- nur diese Zeichen ausgeben
--complement            -- das Komplement der Menge der gewählten Bytes, Zeichen oder Felder bilden
--delimiter         -d  -- Delimiter anstelle von Tabulator als Trenner benutzen
--fields            -f  -- nur diese Felder und alle Zeilen OHNE Trennzeichen ausgeben
--help                  -- diese Hilfe anzeigen und beenden
-n                      -- (ignoriert)
--only-delimited    -s  -- keine Zeilen ausgeben, die keinen Trenner enthalten
--output-delimiter      -- Zeichenkette als Ausgabetrennzeichen benutzen
--version               -- Versionsinformation anzeigen und beenden
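The mismatch reported here can be reproduced on a single word, assuming the iconv utility is installed. The function names latin1_word and to_utf8 are illustrative, and the sample stands in for the installed _cut file; converting the file itself would be the same pipeline with the file as input.

```shell
# Emit the word as it is stored in an ISO-8859-1 file:
# \344 is the single byte 0xE4 (a-umlaut in Latin-1).
latin1_word() {
    printf 'gew\344hlten'
}

# Re-encode Latin-1 input as UTF-8.
to_utf8() {
    iconv -f ISO-8859-1 -t UTF-8
}

# Show the bytes before and after conversion.
latin1_word | od -An -tx1
latin1_word | to_utf8 | od -An -tx1
```

In a UTF-8 locale the lone 0xE4 byte is not a valid character, which is consistent with the \M-d shown in the first listing; after conversion it becomes a two-byte sequence.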