On Wed, Apr 15, 2026 at 06:35:02PM +0200, Patrice Dumas wrote:
> > I think it's okay to add a command for specifying language sub-variant,
> > as long as the complexity of referencing the language is limited to that
> > one language.  As before:
> > 
> > @documentlanguage sr
> > @documentlanguagevariant ekavsk
> > @documentscript latin
> > 
> > Only the "@documentlanguagevariant" command would need to be defined in
> > terms of BCP 47 or the IANA language subtag registry.
> 
> Not in term of BCP 47, but in term of IANA language variants subtag registry.
> 
> We also use the IANA registry for @documentlanguage as it is
> consolidated here and up to date, the ISO standards are not open
> standards and harder to get updated and in a format easy to parse.

...

> > If there was a need for more than one variant, this could be given as
> > a comma-separated list on the @documentlanguagevariant line: like the
> > argument to @example.
> 
> There is such a need.  What you propose seems good to me, quite Texinfo-ish.

As I understand it, the IANA language subtag registry defines which
"variant" subtags may be used in combination with other "variant" subtags.

For instance, the "grclass" subtag you used as an example is more
specific than the other variants of Occitan and occurs after them in the
BCP 47 language identifier.  This is shown by the "Prefix:" entries in
the registry:

Type: variant
Subtag: grclass
Description: Classical Occitan orthography
Added: 2018-04-22
Prefix: oc
Prefix: oc-aranes
Prefix: oc-auvern
Prefix: oc-cisaup
Prefix: oc-creiss
Prefix: oc-gascon
Prefix: oc-lemosin
Prefix: oc-lengadoc
Prefix: oc-nicard
Prefix: oc-provenc
Prefix: oc-vivaraup
Comments: Classical written standard for Occitan developed in 1935 by
  Alibèrt

"oc-lengadoc-grclass" would be denoted in a Texinfo input file thus:

@documentlanguage oc
@documentlanguagevariant lengadoc, grclass
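
To make the mapping concrete, here is a rough sketch (in Python, purely
for illustration) of how the two command arguments above could be joined
into a language identifier.  The function name "language_tag" is made up
for this example and is not anything in texi2any; script and region
subtags are deliberately left out.

```python
def language_tag(language, variant_argument=""):
    """Join a language subtag and a comma-separated variant list.

    Hypothetical helper: 'language' stands for the @documentlanguage
    argument and 'variant_argument' for the raw text of the
    @documentlanguagevariant line.
    """
    # Split on commas and drop surrounding whitespace and empty items.
    variants = [v.strip() for v in variant_argument.split(",") if v.strip()]
    # Variants keep the order in which they were written.
    return "-".join([language.strip()] + variants)


print(language_tag("oc", "lengadoc, grclass"))
```

The order of the items in the list is preserved as written, which is why
the order on the @documentlanguagevariant line matters.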

The other order, "@documentlanguagevariant grclass, lengadoc" would have
to be considered incorrect, as "lengadoc" is only allowed under the "oc-"
prefix, not under "oc-grclass-":

Type: variant
Subtag: lengadoc
Description: Languedocien
Added: 2018-04-22
Prefix: oc
Comments: Occitan variant spoken in Languedoc

(It is uncertain, though, whether texi2any should do the work of
validating such usage.)
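
If such validation were wanted, it might look roughly like the following
sketch (Python, for illustration only).  The table holds just the
"Prefix:" values from the two records quoted above; the names
"VARIANT_PREFIXES" and "check_variants" are invented here.  It assumes
that the language plus the variants seen so far must match one of the
listed prefixes exactly, which is a simplification of the matching rules
in BCP 47.

```python
# Prefix values copied from the two registry records quoted above.
VARIANT_PREFIXES = {
    "grclass": [
        "oc", "oc-aranes", "oc-auvern", "oc-cisaup", "oc-creiss",
        "oc-gascon", "oc-lemosin", "oc-lengadoc", "oc-nicard",
        "oc-provenc", "oc-vivaraup",
    ],
    "lengadoc": ["oc"],
}


def check_variants(language, variants):
    """Return a list of problems found; an empty list means no complaint."""
    problems = []
    tag = language
    for variant in variants:
        prefixes = VARIANT_PREFIXES.get(variant)
        if prefixes is None:
            problems.append(f"unknown variant: {variant}")
        elif tag not in prefixes:
            # Assumption: exact match of the tag built so far.
            problems.append(f"{variant} is not registered under {tag}")
        # Extend the tag so the next variant is checked against it.
        tag = f"{tag}-{variant}"
    return problems


print(check_variants("oc", ["lengadoc", "grclass"]))
print(check_variants("oc", ["grclass", "lengadoc"]))
```

With this table the first call reports nothing, and the second reports
that "lengadoc" is not registered under "oc-grclass".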

I prefer @documentlanguagevariant to @documentlanguagevariants: the
singular reads fine with a list as its argument, whereas the plural with
a single item as its argument would look strange.

If I understand correctly, the argument to @documentlanguagevariant
only accepts subtags that are entered as "Type: variant" in the IANA
registry.  We do not accept region codes, scripts, or the "extension"
subtags of BCP 47.  Notably, we do not accept "-u-sd-" extensions
used to denote regional variants.  (Wikipedia gives "gsw-u-sd-chzh"
as an example denoting "Swiss German as used in the Canton of Zurich".)

(This -u- extension is governed by yet another entity, the Unicode
Consortium.  There is information on it here:
https://www.unicode.org/reports/tr35/#u_Extension.)
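
The restriction to "Type: variant" entries could be checked against the
registry text itself.  Below is a small sketch (Python, for illustration
only) that reads a single record in the "Field: value" layout shown
above, folding indented continuation lines into the preceding field.
The names "parse_record" and "is_variant" are hypothetical, and splitting
the full registry file into records is not covered.

```python
LENGADOC_RECORD = """\
Type: variant
Subtag: lengadoc
Description: Languedocien
Added: 2018-04-22
Prefix: oc
Comments: Occitan variant spoken in Languedoc
"""


def parse_record(text):
    """Parse one registry record into a dict mapping field -> list of values."""
    record = {}
    last = None
    for line in text.splitlines():
        if not line.strip():
            continue
        if line[0].isspace() and last is not None:
            # Continuation line: append to the most recent value.
            record[last][-1] += " " + line.strip()
            continue
        name, _, value = line.partition(":")
        last = name.strip()
        # Fields such as Prefix may repeat, so every field holds a list.
        record.setdefault(last, []).append(value.strip())
    return record


def is_variant(record):
    """True if the record is registered as 'Type: variant'."""
    return record.get("Type") == ["variant"]


record = parse_record(LENGADOC_RECORD)
print(record["Subtag"], record["Prefix"], is_variant(record))
```

A region, script or extension entry would fail the "is_variant" test and
could be rejected as an argument to @documentlanguagevariant.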

So we can reference the IANA subtag registry in a simple way, without
accepting the full complexity and scope of possibilities of BCP 47.



