Re: [gentoo-dev] Using LINGUAS
Hello,

LINGUAS is a concept in gettext tooling. I do not understand why we overload it in package management in the first place. It is an environment variable that we set up in make.conf, because that's an easy way to get it into the build environment to have the standard way of limiting translations work.

By overloading it for IUSE_EXPAND we effectively make it pretty much impossible to have the choice of ALL translation files, except when it means extra packages; without conditional LINGUAS setting, that is.

The standard LINGUAS variable acts as follows:

If unset: Build all translations
If set to an UNORDERED listing of language codes: Include translations for listed languages (or dialects)
If set to an empty string or similar: Don't include any translations

We currently have wrong behaviour for when it's unset, as far as IUSE_EXPAND is concerned - we don't have a default that includes all available linguas as far as I know. Though in the real world, I don't think it matters much, and it's convenient for those that just build a gentoo machine for use within the family, with known language capabilities within.

As a side note: LINGUAS does not only control which .mo files happen to be installed (which you could get rid of later easily with localepurge) - it is also used to filter out unwanted translations in files which have all the translations in the same file; this includes, but is not limited to, .desktop files. This used to be an intltool thing, but nowadays gettext has gained such support directly as well.

Mart
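The three LINGUAS cases listed in the message above can be sketched in shell. This is only an illustration: `filter_translations` is a hypothetical helper, not part of gettext or Portage, and it matches language codes exactly, leaving out the dialect handling the message mentions.

```shell
# Hypothetical helper: print the translations that would be kept.
# $1 is a space-separated list of translations a package ships.
# LINGUAS is read from the calling environment.
filter_translations() {
    available=$1

    # Case 1: LINGUAS is unset (not merely empty): keep everything.
    if [ -z "${LINGUAS+set}" ]; then
        echo "$available"
        return 0
    fi

    # Case 2: LINGUAS lists languages; the order of the list does not matter.
    # Case 3: LINGUAS is empty, so the inner loop never runs and nothing is kept.
    kept=
    for lang in $available; do
        for wanted in $LINGUAS; do
            if [ "$lang" = "$wanted" ]; then
                kept="${kept:+$kept }$lang"
                break
            fi
        done
    done
    echo "$kept"
}
```

The distinction between "unset" and "empty" is the part that a USE-expanded LINGUAS cannot express directly, which is the point the message raises.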
Re: [gentoo-dev] Using LINGUAS
On Mon, 21 Jul 2014 13:23:46 +0900
Thomas Kahle to...@gentoo.org wrote:

> the OCR software tesseract has many different plugins for language packs used for OCR for different languages. The ebuild uses the LINGUAS variable to pass the choice of which packages to install to the user.

Every ebuild uses LINGUAS implicitly. What you mean is that you expand LINGUAS to USE Flags...

> A reverse dependency is app-text/pdfsandwich which roughly puts OCR'ed text in a scanned pdf. Since it uses tesseract it supports exactly those languages that tesseract supports. Should its ebuild have LINGUAS use flags and then depend on tesseract with at least those flags set?

... in which case you can simply use USE dependencies since you have USE-expanded flags matching LINGUAS. It should be easy to match one ebuild's IUSE with another's.

linguas_tlh? ( app-text/tesseract[linguas_tlh] )

jer
Re: [gentoo-dev] Using LINGUAS
On 21/07/14 21:03, Jeroen Roovers wrote:
> On Mon, 21 Jul 2014 13:23:46 +0900
> Thomas Kahle to...@gentoo.org wrote:
>
>> the OCR software tesseract has many different plugins for language packs used for OCR for different languages. The ebuild uses the LINGUAS variable to pass the choice of which packages to install to the user.
>
> Every ebuild uses LINGUAS implicitly. What you mean is that you expand LINGUAS to USE Flags...

Yes, I did not use the correct terminology.

>> A reverse dependency is app-text/pdfsandwich which roughly puts OCR'ed text in a scanned pdf. Since it uses tesseract it supports exactly those languages that tesseract supports. Should its ebuild have LINGUAS use flags and then depend on tesseract with at least those flags set?
>
> ... in which case you can simply use USE dependencies since you have USE-expanded flags matching LINGUAS. It should be easy to match one ebuild's IUSE with another's.
>
> linguas_tlh? ( app-text/tesseract[linguas_tlh] )

I know how to specify USE dependencies. Since you deleted it, let me ask my question again: If I follow this method I will have 37 dependencies all of this form. This is pointless because
a) Every time tesseract gains or loses support for a language (it does happen!) the pdfsandwich ebuild needs to be updated,
b) Nobody wants to set different linguas on tesseract and pdfsandwich anyway.

Cheers,
Thomas

--
Thomas Kahle
http://dev.gentoo.org/~tomka/
Re: [gentoo-dev] Using LINGUAS
On Mon, 21 Jul 2014 21:26:09 +0900
Thomas Kahle to...@gentoo.org wrote:

> Since you deleted it

sorry

> , let me ask my question again: If I follow this method I will have 37 dependencies all of this form. This is pointless because
> a) Every time tesseract gains or loses support for a language (it does happen!) the pdfsandwich ebuild needs to be updated,
> b) Nobody wants to set different linguas on tesseract and pdfsandwich anyway.

Sounds like it's useful, not pointless at all. And you can have the ebuilds automatically generate those dependencies. It's a lot less work if you keep all the linguas in a variable and loop over it to generate IUSE+= and a dependency list.

jer
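A minimal sketch of the loop suggested above, written as it might appear in the global scope of an ebuild. The variable name `MY_LINGUAS` and the three language codes are assumptions for illustration; the thread does not show the real list of 37 languages, nor the actual pdfsandwich or tesseract ebuilds.

```shell
# Hypothetical language list; a real ebuild would list every supported language.
MY_LINGUAS="de fr ja"

IUSE=""
RDEPEND=""
for lang in ${MY_LINGUAS}; do
    # Same effect as IUSE+=" linguas_${lang}" in bash, without a leading space.
    IUSE="${IUSE:+${IUSE} }linguas_${lang}"
    # One conditional USE dependency per language, in the form quoted earlier in the thread.
    RDEPEND="${RDEPEND:+${RDEPEND} }linguas_${lang}? ( app-text/tesseract[linguas_${lang}] )"
done
unset lang
```

With this layout, a change in the supported languages means editing one variable rather than 37 hand-written dependency lines, though the list still has to be kept in step with tesseract.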
Re: [gentoo-dev] Using LINGUAS
On 2014-07-21, at 13:23:46, Thomas Kahle to...@gentoo.org wrote:

> the OCR software tesseract has many different plugins for language packs used for OCR for different languages. The ebuild uses the LINGUAS variable to pass the choice of which packages to install to the user. A reverse dependency is app-text/pdfsandwich which roughly puts OCR'ed text in a scanned pdf. Since it uses tesseract it supports exactly those languages that tesseract supports.

Do I understand correctly that pdfsandwich doesn't have any explicit switches for language support? In other words, adding support for another language requires rebuilding tesseract and not pdfsandwich?

> Should its ebuild have LINGUAS use flags and then depend on tesseract with at least those flags set? While it seems consistent to put the LINGUAS choice in the most user facing package, in this case I would actually not put it in here. It would introduce a point of failure and maintenance work for each tesseract upgrade (since the language set slightly changes from time to time). A typical user would set LINGUAS in her make.conf anyway. In this case the same choice applies to both packages anyway. Maybe an einfo is sufficient to inform the user about it?

I have no idea where you got the 'most user facing' idea from, but this is not really true or useful. The whole idea of libraries like imagemagick is about hiding unnecessary dependencies under a single interface -- now imagine every package using imagemagick declaring flags for all the formats supported by it...

If pdfsandwich itself doesn't do anything with LINGUAS, don't declare it. The rule about USE flags not doing anything applies here. Moreover, LINGUAS are usually set globally so scope is not really an issue here.

--
Best regards,
Michał Górny
Re: [gentoo-dev] Using LINGUAS
Hi,

On 21/07/14 21:42, Michał Górny wrote:
> On 2014-07-21, at 13:23:46, Thomas Kahle to...@gentoo.org wrote:
>
>> the OCR software tesseract has many different plugins for language packs used for OCR for different languages. The ebuild uses the LINGUAS variable to pass the choice of which packages to install to the user. A reverse dependency is app-text/pdfsandwich which roughly puts OCR'ed text in a scanned pdf. Since it uses tesseract it supports exactly those languages that tesseract supports.
>
> Do I understand correctly that pdfsandwich doesn't have any explicit switches for language support? In other words, adding support for another language requires rebuilding tesseract and not pdfsandwich?

Exactly, pdfsandwich combines tesseract with some postprocessing that is not language specific.

>> Should its ebuild have LINGUAS use flags and then depend on tesseract with at least those flags set? While it seems consistent to put the LINGUAS choice in the most user facing package, in this case I would actually not put it in here. It would introduce a point of failure and maintenance work for each tesseract upgrade (since the language set slightly changes from time to time). A typical user would set LINGUAS in her make.conf anyway. In this case the same choice applies to both packages anyway. Maybe an einfo is sufficient to inform the user about it?
>
> I have no idea where you got the 'most user facing' idea from, but this is not really true or useful. The whole idea of libraries like imagemagick is about hiding unnecessary dependencies under a single interface -- now imagine every package using imagemagick declaring flags for all the formats supported by it...

If I don't know anything about tesseract but only install pdfsandwich and then try to scan Japanese, it won't work out of the box. How should the user know that she has to put Japanese in her LINGUAS variable and rebuild tesseract afterwards? Probably a simple einfo in pdfsandwich should do it.

> If pdfsandwich itself doesn't do anything with LINGUAS, don't declare it. The rule about USE flags not doing anything applies here. Moreover, LINGUAS are usually set globally so scope is not really an issue here.

I agree.

Cheers,
Thomas

--
Thomas Kahle
http://dev.gentoo.org/~tomka/
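The einfo agreed on above could look roughly like the following in the pdfsandwich ebuild. The message wording is an assumption, not text from any actual ebuild. The first line is only a stand-in so the sketch also runs outside Portage, where `einfo` is not defined.

```shell
# Fallback for running this sketch outside Portage; inside an ebuild
# the package manager provides einfo and this line does nothing.
command -v einfo >/dev/null 2>&1 || einfo() { echo " * $*"; }

pkg_postinst() {
    # Hypothetical wording: point the user at tesseract for language support.
    einfo "pdfsandwich relies on app-text/tesseract for language support."
    einfo "To recognise another language, add it to LINGUAS in make.conf"
    einfo "and rebuild app-text/tesseract."
}
```

This keeps language flags out of pdfsandwich entirely, so nothing needs updating when tesseract's language set changes.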
Re: [gentoo-dev] Using LINGUAS
On 21/07/14 12:23 AM, Thomas Kahle wrote:
> Hi,
>
> the OCR software tesseract has many different plugins for language packs used for OCR for different languages. The ebuild uses the LINGUAS variable to pass the choice of which packages to install to the user. A reverse dependency is app-text/pdfsandwich which roughly puts OCR'ed text in a scanned pdf. Since it uses tesseract it supports exactly those languages that tesseract supports.
>
> Should its ebuild have LINGUAS use flags and then depend on tesseract with at least those flags set? While it seems consistent to put the LINGUAS choice in the most user facing package, in this case I would actually not put it in here. It would introduce a point of failure and maintenance work for each tesseract upgrade (since the language set slightly changes from time to time). A typical user would set LINGUAS in her make.conf anyway. In this case the same choice applies to both packages anyway. Maybe an einfo is sufficient to inform the user about it?
>
> Cheers,
> Thomas

there are two possible scenarios here.

1. the dependency is COMPILE TIME (ABI, API, whatever). in this scenario, the depender *must* have appropriate LINGUAS, even if that means copying and pasting from the dependee. this is necessary for correct rebuilding, and everything else associated with automagic deps.

2. the dependency is RUN TIME. in this scenario, the case is the same with all other runtime USE dependencies; that is to say, the correct solution is USE_RUNTIME or something along those lines. [0] here, I would say that einfo is superior to copying IUSE, since these flags should be set globally anyways to make sense.

[0] please no bikeshedding on whether to call it RUNTIME_USE or ǝsn‾ǝɯıʇunɹ.
[gentoo-dev] Using LINGUAS
Hi,

the OCR software tesseract has many different plugins for language packs used for OCR for different languages. The ebuild uses the LINGUAS variable to pass the choice of which packages to install to the user. A reverse dependency is app-text/pdfsandwich which roughly puts OCR'ed text in a scanned pdf. Since it uses tesseract it supports exactly those languages that tesseract supports.

Should its ebuild have LINGUAS use flags and then depend on tesseract with at least those flags set? While it seems consistent to put the LINGUAS choice in the most user facing package, in this case I would actually not put it in here. It would introduce a point of failure and maintenance work for each tesseract upgrade (since the language set slightly changes from time to time). A typical user would set LINGUAS in her make.conf anyway. In this case the same choice applies to both packages anyway. Maybe an einfo is sufficient to inform the user about it?

Cheers,
Thomas

--
Thomas Kahle
http://dev.gentoo.org/~tomka/