Bug#448216: Support 3-letter codes for package description translations in APT (was: Re: Bug#448216: Waiting 2 years ago...)
2010/2/5 Marcos marcoscosta...@gmail.com:
> We have been waiting for more than 2 years.

Don't (busy-)wait for it: invest the time instead to provide a good
patch for it. :)

It sometimes also helps (as you see now) to poke someone ~ but it is far
better to do this with some more information: for example, I didn't know
that this minor bug is blocking someone's work! (The original bug report
only talks about private use.) The priority has increased by far with
this information…

But enough storytelling:
In this specific bug case a patch from me [0] for this and a few other
Translation-file-specific things is pending for the next ABI break, which
will happen sometime in the future (before squeeze).

The relevant lines from the patch for this bug should be:

    // get the environment language code
    // we extract both, a long and a short code and then we will
    // check if we actually need both (rare) or if the short is enough
    string const envMsg = string(Locale == 0 ? std::setlocale(LC_MESSAGES, NULL) : Locale);
    size_t const lenShort = (envMsg.find('_') != string::npos) ? envMsg.find('_') : 2;
    size_t const lenLong = (envMsg.find('.') != string::npos) ? envMsg.find('.') : (lenShort + 3);

    string envLong = envMsg.substr(0,lenLong);
    string const envShort = envLong.substr(0,lenShort);

I don't know if this really handles all cases, as I don't know much about
locales, but it should handle at least e.g. de, de_DE, ast_DE (bogus) and
de_DE.UTF-8, and given that nobody else cared to provide a patch for it
I guess this will be better than nothing.

This is btw also true for the rest of the patch [0], as all my questions
in the LongDesc thread [1] remained unanswered…

Best regards / Mit freundlichen Grüßen,

David Kalnischkies

[0] http://bazaar.launchpad.net/~donkult/apt/sid/revision/1920 and ff.
[1] http://lists.debian.org/deity/2009/08/msg00112.html

--
To UNSUBSCRIBE, email to debian-bugs-dist-requ...@lists.debian.org
with a subject of unsubscribe. Trouble? Contact listmas...@lists.debian.org
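To illustrate the long/short split that the quoted patch lines perform, here is a minimal sketch that is not taken from APT. SplitLocale is a hypothetical helper name, and the sketch assumes the locale name has the usual language[_TERRITORY][.codeset][@modifier] shape. It takes the short code to be everything before the underscore rather than assuming a length of two. Reading the excerpt as quoted, a bare "ast" (no underscore) would hit the fallback length of 2 and be cut to "as"; the sketch avoids that case.

```cpp
#include <string>
#include <utility>

// Hypothetical helper, not part of APT: split a locale name such as
// "ast_ES.UTF-8" into its long ("ast_ES") and short ("ast") codes.
std::pair<std::string, std::string> SplitLocale(std::string const &locale)
{
   // Drop the codeset (".UTF-8") and the modifier ("@euro") if present.
   // find_first_of returns npos when neither exists, and substr(0, npos)
   // then yields the whole string.
   std::string const envLong = locale.substr(0, locale.find_first_of(".@"));
   // The short code ends at the territory separator, so two- and
   // three-letter language codes are treated alike.
   std::string const envShort = envLong.substr(0, envLong.find('_'));
   return std::make_pair(envLong, envShort);
}
```

With this shape, "de", "de_DE", "de_DE.UTF-8", "ast" and "ast_ES" all keep their full language code as the short form.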
Bug#448216: Support 3-letter codes for package description translations in APT (was: Re: Bug#448216: Waiting 2 years ago...)
> IMO if an 'xx_YY' file is downloaded APT should *always* also download
> the corresponding 'xx' file if it exists.

Hi again :P

Remember that some languages only have an ISO 639-2 code, which has
three letters, such as Asturian (ast_ES or ast) :)

Thanks.
Bug#448216: Support 3-letter codes for package description translations in APT (was: Re: Bug#448216: Waiting 2 years ago...)
Hi!

First of all: Thanks very much for the answers!

My language (ast) and English (en_US) are good examples of the bug :)

You can see the English version of apt-get update:
http://launchpadlibrarian.net/34947585/us_lang_example.png

And you can see the Asturian version of apt-get update:
http://launchpadlibrarian.net/34947533/ast_lang_example.png

In the case of ast (the Asturian code) it is getting "as" (American Samoa).

Best regards and thanks in advance!

On Sat, Feb 6, 2010 at 3:20 PM, Frans Pop elen...@planet.nl wrote:
> (CCing as I'm not sure if you are subscribed to d-i18n)
>
>> This is btw also true for the rest of the patch [0] as all my questions
>> in the LongDesc thread [1] remained unanswered.
>
> I can answer one part of it.
>
> ! And last but not least a question more or less only for the l10n-team:
> ! APT currently includes a (short) list of languages for which it doesn't
> ! download the Translation file with the short but with the long code.
> ! e.g. for a pt_BR locale it would download the Translation-pt_BR
> ! file instead of even trying to download Translation-pt. On the other
> ! hand Translation files like cs_CZ are never touched - apt only tries to
> ! download the cs file. So what do you think: Should apt try downloading
> ! long and short always OR short only if long is not available? The
> ! problem here would be (while currently looking at the i18n for
> ! unstable main [0]) that e.g. the very small cs_CZ file would hide the
> ! larger cs file... (btw: Also a suggestion which whitelist should be used
> ! would be good, e.g. i think it is unlikely that we get a de_?? in the
> ! future...)
>
> A correct implementation of l10n support automatically falls back to the
> next best translation. The LANGUAGE environment variable can contain a
> list of languages to fall back to.
>
> For example, Debian Installer sets this as follows by default:
> - for Portuguese: LANGUAGE=pt:pt_BR:en
> - for Brazilian: LANGUAGE=pt_BR:pt:en
> Note that they are defined as fallbacks for each other!
>
> And for some Scandinavian languages it's even more fun:
> - for Northern Sami: LANGUAGE=se_NO:nb_NO:nb:no_NO:no:nn_NO:nn:da:sv:en
>
> I think, but I'm not sure, that if LANGUAGE is *not* set, an automatic
> fallback from e.g. cs_CZ to cs will happen (and maybe even if it is set).
>
> It looks as if the APT download implementation has wanted to simplify
> this, or has maybe just wanted to limit downloads.
>
> IMO if an 'xx_YY' file is downloaded APT should *always* also download
> the corresponding 'xx' file if it exists.
>
> And I think that it would probably also be good if APT downloaded at
> least the first two or three languages listed in LANGUAGE (again both
> 'xx_YY' and 'xx' files for each). This would ensure that e.g. for
> Portuguese, Brazilian is available as fallback.
>
> English should of course always be downloaded.
>
> Cheers,
> FJP
>
> --
> To UNSUBSCRIBE, email to debian-i18n-requ...@lists.debian.org
> with a subject of unsubscribe. Trouble? Contact listmas...@lists.debian.org
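The suggestion above (take the first two or three entries of LANGUAGE, request both 'xx_YY' and 'xx' for each, and always include English) can be sketched roughly as follows. This is not APT code: CandidateCodes and its helpers are hypothetical names, and the ordering (LANGUAGE entries first, then LC_MESSAGES, then en) is an assumption made for this illustration, not something the thread settles.

```cpp
#include <algorithm>
#include <cstddef>
#include <string>
#include <vector>

// Append code unless it is empty or already listed (keeps first-seen order).
static void AddUnique(std::vector<std::string> &codes, std::string const &code)
{
   if (code.empty() == false &&
       std::find(codes.begin(), codes.end(), code) == codes.end())
      codes.push_back(code);
}

// Add both the long ("pt_BR") and the short ("pt") form of one locale name.
static void AddLongAndShort(std::vector<std::string> &codes, std::string const &locale)
{
   std::string const longCode = locale.substr(0, locale.find_first_of(".@"));
   if (longCode == "C" || longCode == "POSIX")
      return; // no translation is wanted for the untranslated locales
   AddUnique(codes, longCode);
   AddUnique(codes, longCode.substr(0, longCode.find('_')));
}

// Hypothetical helper: build the ordered list of Translation-* codes to
// request from at most maxEntries LANGUAGE entries, LC_MESSAGES and "en".
std::vector<std::string> CandidateCodes(std::string const &language,
                                        std::string const &lcMessages,
                                        std::size_t const maxEntries)
{
   std::vector<std::string> codes;
   std::size_t start = 0, taken = 0;
   while (taken < maxEntries && start <= language.size())
   {
      std::size_t end = language.find(':', start);
      if (end == std::string::npos)
         end = language.size();
      if (end > start) // skip empty entries such as "pt::en"
      {
         AddLongAndShort(codes, language.substr(start, end - start));
         ++taken;
      }
      start = end + 1;
   }
   AddLongAndShort(codes, lcMessages); // LANGUAGE is not always set
   AddUnique(codes, "en");             // English is always requested
   return codes;
}
```

Called with the Northern Sami value quoted above and a limit of three entries, this would request five files per component rather than ten.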
Bug#448216: Support 3-letter codes for package description translations in APT (was: Re: Bug#448216: Waiting 2 years ago...)
On Saturday 06 February 2010, Marcos wrote:
> Remember that some languages only have an ISO 639-2 code, which has
> three letters, such as Asturian (ast_ES or ast) :)

Sure. My xx, xx_YY examples should be read to also cover xxx and xxx_YY.
Bug#448216: Support 3-letter codes for package description translations in APT (was: Re: Bug#448216: Waiting 2 years ago...)
David Kalnischkies wrote:
> 2010/2/6 Frans Pop elen...@planet.nl:
>> A correct implementation of l10n support automatically falls back to
>> the next best translation. The LANGUAGE environment variable can
>> contain a list of languages to fall back to.
>
> Never saw this colon-syntax for fallbacks before, but a quick test
> suggests that the LC_MESSAGES variable which APT uses currently to get
> the language doesn't support this syntax?

Correct. But if you set LANGUAGE and then run 'locale' you will see that
it is listed alongside all the LC_* variables, which shows that it is
part of the official i18n system.

> APT tries to detect which language to download by inspecting
> LC_MESSAGES to extract the long (de_DE) and short (de) language code.
> With the current code APT would download de in this case, but de_DE if
> it is defined in the whitelist. My patch currently downloads (in
> LANG=de_DE) en and de unconditionally and de_DE if it would be included
> in the whitelist (plus whatever Acquire::Languages lists as well).

OK. That's a reasonable start. But a whitelist, besides potentially
ignoring the user's preferences, is a rather unmaintainable solution in
the long run: it will always lag behind translation efforts, and
translators will probably not even be aware that they might need to
request an addition to the whitelist.

> So what is additionally needed is that APT switches to LANGUAGE and
> supports the colon syntax?

No, it cannot switch. It should consider LANGUAGE *in addition to*
LC_MESSAGES. Remember LANGUAGE isn't always set.

> The only problem i can see with this is, that the acquire method is
> currently thick as a brick: It doesn't check if the file is listed in
> the corresponding Index file (this index isn't even downloaded) and

That's bad. IMHO it really should download the index and parse it for
supported translations. IIUC the current system means 404s for any
language that doesn't have translated descriptions. I would guess that's
most of the languages supported in Debian Installer...

> will try to download all translations, so
> - for Northern Sami: LANGUAGE=se_NO:nb_NO:nb:no_NO:no:nn_NO:nn:da:sv:en
> would generate 10 requests (for every component) resulting in 6 with a
> 404 response (if we assume LongDesc becomes real and therefore en
> exists).

That's why I suggested taking only the first 2 or 3 from LANGUAGE, not
the whole list. It's reasonable IMO to compromise between limiting
downloads and using the whole list, but it's not reasonable to
completely ignore fallbacks the user has defined.

> I don't know if this has visible side effects beside being silly, but
> i believe this will be unfixable for APT without a (more or less)
> rewrite of the acquire system (as Translations seems to be implemented
> as a hack in the current version already) and that this will not
> happen for squeeze… (not even started).

Understandable. But it would be good to have a description of a proper
implementation on the ToDo list.

> So as ugly as this whitelist is i guess we need it to save the mirrors
> from a lot of silly requests ~ luckily in a stable release the list
> shouldn't vary too much… ?

I would expect not.

Cheers,
FJP
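The idea of parsing an index for supported translations before requesting anything can be sketched as a simple filter. This is an illustration only: FilterAvailable is a hypothetical name, and the set of available codes stands in for whatever a parsed index would provide; the thread does not describe the index format, so no parsing is attempted here.

```cpp
#include <set>
#include <string>
#include <vector>

// Hypothetical helper, not APT code: keep only the wanted codes that the
// (assumed, already parsed) index lists, preserving the user's order.
std::vector<std::string> FilterAvailable(std::vector<std::string> const &wanted,
                                         std::set<std::string> const &available)
{
   std::vector<std::string> result;
   for (std::string const &code : wanted)
      if (available.count(code) != 0)
         result.push_back(code); // this file exists, so no 404 is expected
   return result;
}
```

For the ten-entry Northern Sami list, an index naming four of those codes would reduce ten requests per component to four, which is the saving the "6 with a 404 response" estimate points at.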
Bug#448216: Support 3-letter codes for package description translations in APT (was: Re: Bug#448216: Waiting 2 years ago...)
Quoting Marcos (marcoscosta...@gmail.com):
> Hi!
>
> Is this bug more complicated? We have been waiting for more than 2 years.
>
> Best regards!

APT has tons of bugs and very few people taking care of it.

It is quite likely that this bug is easy to fix for someone who knows
about C code. I agree that it may be frustrating for the Asturian
translator to be able to do package description translations but not
use them.

Would anyone in the i18n crowd volunteer to look at APT source code and
try providing a patch? I'm sure that APT maintainers would quickly
include it.