Re: [gentoo-dev] Using LINGUAS

2014-07-22 Thread Mart Raudsepp
Hello,

LINGUAS is a concept in gettext tooling. I do not understand why we
overload it in package management in the first place.
It is an environment variable that we set up in make.conf, because
that's an easy way to get it into the build environment to have the
standard way of limiting translations work.

By overloading it for IUSE_EXPAND we effectively make it pretty much
impossible to have the choice of ALL translation files, except when it
means extra packages; without conditional LINGUAS setting, that is.


The standard LINGUAS variable acts as follows:

If unset: Build all translations
If set to an UNORDERED listing of language codes: Include translations
for listed languages (or dialects)
If set to an empty string or similar: Don't include any translations
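
The three cases above can be sketched in shell. This is only an illustration, not gettext's actual implementation: filter_linguas is a hypothetical helper, the language codes are placeholders, and it matches exact codes only, leaving out any dialect handling.

```shell
# Hypothetical helper: print which of a package's translations would be kept.
# Arguments: the language codes the package ships.
filter_linguas() {
    local lang wanted
    # LINGUAS unset: keep every available translation.
    if [ -z "${LINGUAS+set}" ]; then
        printf '%s\n' "$@"
        return 0
    fi
    # LINGUAS set (possibly empty): keep only the listed codes.
    # The order of the codes in LINGUAS does not matter.
    for lang in "$@"; do
        for wanted in $LINGUAS; do
            if [ "$lang" = "$wanted" ]; then
                printf '%s\n' "$lang"
                break
            fi
        done
    done
}
```

With LINGUAS="" the inner loop never runs, so nothing is printed, which is how the sketch tells the "empty" case from the "unset" one.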


We currently have the wrong behaviour when it is unset, as far as
IUSE_EXPAND is concerned: we don't have a default that includes all
available linguas, as far as I know.


Though in the real world, I don't think it matters much, and it's
convenient for those that just build a gentoo machine for use within the
family, with known language capabilities within.


As a side note: LINGUAS does not only control which .mo files happen to
be installed (which you could get rid of later easily with localepurge)
- it also is used to filter out unwanted translations in files which
have all the translations in the same file; this includes, but is not
limited to .desktop files.
This used to be an intltool thing, but nowadays gettext has gained such
support directly as well.


Mart




Re: [gentoo-dev] Using LINGUAS

2014-07-21 Thread Jeroen Roovers
On Mon, 21 Jul 2014 13:23:46 +0900
Thomas Kahle to...@gentoo.org wrote:

 the OCR software tesseract has many different plugins for
 language packs used for OCR for different languages.  The ebuild
 uses the LINGUAS variable to pass the choice of which packages to
 install to the user.

Every ebuild uses LINGUAS implicitly. What you mean is that you
expand LINGUAS to USE Flags...

 A reverse dependency is app-text/pdfsandwich which roughly puts
 OCR'ed text in a scanned pdf.  Since it uses tesseract it
 supports exactly those languages that tesseract supports.
 
 Should its ebuild have LINGUAS use flags and then depend on
 tesseract with at least those flags set?

... in which case you can simply use USE dependencies since you have
USE-expanded flags matching LINGUAS. It should be easy to match one
ebuild's IUSE with another's.

linguas_tlh? ( app-text/tesseract[linguas_tlh] )


 jer



Re: [gentoo-dev] Using LINGUAS

2014-07-21 Thread Thomas Kahle
On 21/07/14 21:03, Jeroen Roovers wrote:
 On Mon, 21 Jul 2014 13:23:46 +0900
 Thomas Kahle to...@gentoo.org wrote:
 
 the OCR software tesseract has many different plugins for
 language packs used for OCR for different languages.  The ebuild
 uses the LINGUAS variable to pass the choice of which packages to
 install to the user.
 
 Every ebuild uses LINGUAS implicitly. What you mean is that you
 expand LINGUAS to USE Flags...

Yes, I did not use the correct terminology.

 A reverse dependency is app-text/pdfsandwich which roughly puts
 OCR'ed text in a scanned pdf.  Since it uses tesseract it
 supports exactly those languages that tesseract supports.

 Should its ebuild have LINGUAS use flags and then depend on
 tesseract with at least those flags set?
 
 ... in which case you can simply use USE dependencies since you have
 USE-expanded flags matching LINGUAS. It should be easy to match one
 ebuild's IUSE with another's.
 
 linguas_tlh? ( app-text/tesseract[linguas_tlh] )

I know how to specify USE dependencies.

Since you deleted it, let me ask my question again: If I follow
this method I will have 37 dependencies all of this form.  This
is pointless because

a) Every time tesseract gains or loses support for a language (it does
happen!) the pdfsandwich ebuild needs to be updated,

b) Nobody wants to set different linguas on tesseract and
pdfsandwich anyway.

Cheers,
Thomas


-- 
Thomas Kahle
http://dev.gentoo.org/~tomka/





Re: [gentoo-dev] Using LINGUAS

2014-07-21 Thread Jeroen Roovers
On Mon, 21 Jul 2014 21:26:09 +0900
Thomas Kahle to...@gentoo.org wrote:

 Since you deleted it

sorry

, let me ask my question again: If I follow
 this method I will have 37 dependencies all of this form.  This
 is pointless because
 
 a) Every time tesseract gains or loses support for a language (it does
 happen!) the pdfsandwich ebuild needs to be updated,
 
 b) Nobody wants to set different linguas on tesseract and
 pdfsandwich anyway.

Sounds like it's useful, not pointless at all. And you can have the
ebuilds automatically generate those dependencies. It's a lot less work
if you keep all the linguas in a variable and loop over it to generate
IUSE+= and a dependency list.
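
A minimal sketch of such a loop, assuming a made-up language list: MY_LANGS and its three codes are hypothetical, and the real list would have to mirror what tesseract actually supports. The linguas_ flag names and the app-text/tesseract atom follow the example earlier in this thread.

```shell
# Hypothetical language list, for illustration only.
MY_LANGS="de fr ja"

IUSE=""
RDEPEND="app-text/tesseract"

for lang in ${MY_LANGS}; do
    # In an ebuild (bash) these two lines would usually be written with +=.
    IUSE="${IUSE} linguas_${lang}"
    RDEPEND="${RDEPEND} linguas_${lang}? ( app-text/tesseract[linguas_${lang}] )"
done
unset lang

# Show the generated values.
printf 'IUSE=%s\n' "${IUSE}"
printf 'RDEPEND=%s\n' "${RDEPEND}"
```

The list is still kept by hand, so it has to be updated whenever tesseract's language set changes; the loop only removes the repetition.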


 jer



Re: [gentoo-dev] Using LINGUAS

2014-07-21 Thread Michał Górny
Dnia 2014-07-21, o godz. 13:23:46
Thomas Kahle to...@gentoo.org napisał(a):

 the OCR software tesseract has many different plugins for
 language packs used for OCR for different languages.  The ebuild
 uses the LINGUAS variable to pass the choice of which packages to
 install to the user.
 
 A reverse dependency is app-text/pdfsandwich which roughly puts
 OCR'ed text in a scanned pdf.  Since it uses tesseract it
 supports exactly those languages that tesseract supports.

Do I understand correctly that pdfsandwich doesn't have any explicit
switches for language support? In other words, adding support for
another language requires rebuilding tesseract and not pdfsandwich?

 Should its ebuild have LINGUAS use flags and then depend on
 tesseract with at least those flags set?
 
 While it seems consistent to put the LINGUAS choice in the most
 user facing package, in this case I would actually not put it in
 here.  It would introduce a point of failure and maintenance
 work for each tesseract upgrade (since the language set
 slightly changes from time to time).  A typical user would set
 LINGUAS in her make.conf anyway.  In this case the same choice
 applies to both packages anyway.  Maybe an einfo is sufficient to
 inform the user about it?

I have no idea where you got the 'most user facing' idea from, but
this is not really true or useful. The whole idea of libraries like
imagemagick is about hiding unnecessary dependencies under single
interface -- now imagine every package using imagemagick declaring
flags for all the formats supported by it...

If pdfsandwich itself doesn't do anything with LINGUAS, don't declare
it. The rule about USE flags not doing anything applies here. Moreover,
LINGUAS are usually set globally so scope is not really an issue here.

-- 
Best regards,
Michał Górny




Re: [gentoo-dev] Using LINGUAS

2014-07-21 Thread Thomas Kahle
Hi,

On 21/07/14 21:42, Michał Górny wrote:
 Dnia 2014-07-21, o godz. 13:23:46
 Thomas Kahle to...@gentoo.org napisał(a):
 
 the OCR software tesseract has many different plugins for
 language packs used for OCR for different languages.  The ebuild
 uses the LINGUAS variable to pass the choice of which packages to
 install to the user.

 A reverse dependency is app-text/pdfsandwich which roughly puts
 OCR'ed text in a scanned pdf.  Since it uses tesseract it
 supports exactly those languages that tesseract supports.
 
 Do I understand correctly that pdfsandwich doesn't have any explicit
 switches for language support? In other words, adding support for
 another language requires rebuilding tesseract and not pdfsandwich?

Exactly, pdfsandwich combines tesseract with some postprocessing
that is not language specific.

 Should its ebuild have LINGUAS use flags and then depend on
 tesseract with at least those flags set?

 While it seems consistent to put the LINGUAS choice in the most
 user facing package, in this case I would actually not put it in
 here.  It would introduce a point of failure and maintenance
 work for each tesseract upgrade (since the language set
 slightly changes from time to time).  A typical user would set
 LINGUAS in her make.conf anyway.  In this case the same choice
 applies to both packages anyway.  Maybe an einfo is sufficient to
 inform the user about it?
 
 I have no idea where you got the 'most user facing' idea from, but
 this is not really true or useful. The whole idea of libraries like
 imagemagick is about hiding unnecessary dependencies under single
 interface -- now imagine every package using imagemagick declaring
 flags for all the formats supported by it...

If I don't know anything about tesseract but only install
pdfsandwich and then try to scan Japanese, it won't work out of
the box.  How should the user know that she has to put Japanese
in her LINGUAS variable and rebuild tesseract afterwards?

Probably a simple einfo in pdfsandwich should do it.
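
Something along these lines, as a rough sketch: the message wording is only a suggestion, and the einfo stand-in on the first line exists solely so the snippet also runs outside Portage (inside an ebuild, einfo is already provided).

```shell
# Stand-in for testing outside Portage; skipped when einfo already exists.
command -v einfo >/dev/null 2>&1 || einfo() { printf ' * %s\n' "$*"; }

pkg_postinst() {
    einfo "pdfsandwich relies on app-text/tesseract for OCR language support."
    einfo "To OCR another language, add it to LINGUAS and rebuild"
    einfo "app-text/tesseract."
}
```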

 If pdfsandwich itself doesn't do anything with LINGUAS, don't declare
 it. The rule about USE flags not doing anything applies here.
 Moreover, LINGUAS are usually set globally so scope is not
 really an issue here.

I agree.

Cheers,
Thomas



-- 
Thomas Kahle
http://dev.gentoo.org/~tomka/





Re: [gentoo-dev] Using LINGUAS

2014-07-21 Thread Alex Xu
On 21/07/14 12:23 AM, Thomas Kahle wrote:
 Hi,
 
 the OCR software tesseract has many different plugins for
 language packs used for OCR for different languages.  The ebuild
 uses the LINGUAS variable to pass the choice of which packages to
 install to the user.
 
 A reverse dependency is app-text/pdfsandwich which roughly puts
 OCR'ed text in a scanned pdf.  Since it uses tesseract it
 supports exactly those languages that tesseract supports.
 
 Should its ebuild have LINGUAS use flags and then depend on
 tesseract with at least those flags set?
 
 While it seems consistent to put the LINGUAS choice in the most
 user facing package, in this case I would actually not put it in
 here.  It would introduce a point of failure and maintenance
 work for each tesseract upgrade (since the language set
 slightly changes from time to time).  A typical user would set
 LINGUAS in her make.conf anyway.  In this case the same choice
 applies to both packages anyway.  Maybe an einfo is sufficient to
 inform the user about it?
 
 Cheers,
 Thomas
 

there are two possible scenarios here.

1. the dependency is COMPILE TIME (ABI, API, whatever). in this
scenario, the depender *must* have appropriate LINGUAS, even if that
means copying and pasting from the dependee. this is necessary for
correct rebuilding, and everything else associated with automagic deps.

2. the dependency is RUN TIME. in this scenario, the case is the same
with all other runtime USE dependencies; that is to say, the correct
solution is USE_RUNTIME or something along those lines. [0] here, I
would say that einfo is superior to copying IUSE, since these flags
should be set globally anyways to make sense.


[0] please no bikeshedding on whether to call it RUNTIME_USE or ǝsn‾ǝɯıʇunɹ.





[gentoo-dev] Using LINGUAS

2014-07-20 Thread Thomas Kahle
Hi,

the OCR software tesseract has many different plugins for
language packs used for OCR for different languages.  The ebuild
uses the LINGUAS variable to pass the choice of which packages to
install to the user.

A reverse dependency is app-text/pdfsandwich which roughly puts
OCR'ed text in a scanned pdf.  Since it uses tesseract it
supports exactly those languages that tesseract supports.

Should its ebuild have LINGUAS use flags and then depend on
tesseract with at least those flags set?

While it seems consistent to put the LINGUAS choice in the most
user facing package, in this case I would actually not put it in
here.  It would introduce a point of failure and maintenance
work for each tesseract upgrade (since the language set
slightly changes from time to time).  A typical user would set
LINGUAS in her make.conf anyway.  In this case the same choice
applies to both packages anyway.  Maybe an einfo is sufficient to
inform the user about it?

Cheers,
Thomas



-- 
Thomas Kahle
http://dev.gentoo.org/~tomka/


