[
   * starting a new thread,
   * summing up the current state of German patterns,
   * long
]

On 03.05.2010 20:06, Manuel Pégourié-Gonnard wrote:
On 03/05/2010 11:08, Mojca Miklavec wrote:

The down side is, when german-x is updated, hyph-utf8 needs to
be updated too

When german-x is updated, they'll probably want to update patterns
in hyph-utf8 anyway.

The actual patterns are in hyph-utf8? I was under the impression
that dehyph-exptl was a separate package, both on CTAN and in TL.

Everything said is correct.

In theory, dehyph-exptl is a separate package intended for people eager to play with the experimental patterns. Up until now, though, the package (that is, its pattern providers) was aimed only at 8-bit TeX.

In practice, the patterns are already part of hyph-utf8. During the great pattern encoding normalisation efforts, Mojca decided to make our patterns the default ones for the German language in 16-bit engines. While the idea of conflating two backwards-compatibility-breaking events (modified paragraph breaking in LuaTeX and new German patterns) into a single event is great, this is a bit ahead of our original intention. I, at least, just haven't put much thought into 16-bit engines so far.


For TeX Live 2009, the current state is this:

   * Werner has converted the patterns to UTF-8.  He also provided a
     .tex pattern wrapper that converts the patterns back into T1
     encoding if an 8-bit engine is recognized.  8-bit TeX engines
     require the dehyph-exptl package with its time-stamped patterns
     to be installed for this.  This package is already part of TL2009
     and the patterns are enabled in language.dat by default, making
     them available as languages 'german-x-2009-06-19' etc.

   * XeTeX loads the unmodified UTF-8 patterns provided by hyph-utf8,
     which are the same as in dehyph-exptl, but it uses hyph-utf8's
     own pattern wrapper.  As long as the patterns for XeTeX and LuaTeX
     aren't frozen, this means hyph-utf8 has to be updated whenever we
     provide new patterns.

   * As for LuaTeX, I have no idea what patterns LuaTeX from TL2009
     actually loads.

Note that the pattern wrappers provided by dehyph-exptl and hyph-utf8 use quite similar code but do different things:

   hyph-utf8:  used for languages 'german' and 'ngerman'

     if (engine == 8bit) then load traditional patterns
     else load experimental patterns

     No re-encoding in either case.

   dehyph-exptl:  used for languages 'german-x-<date>' etc.

     if (engine == 8bit) then re-encode patterns to T1 encoding
     else load patterns unmodified

I have just one question about this procedure. Is the code for the 8/16-bit engine switch OK, or are there better alternatives (\ifxetex etc.)? This is from loadhyph-de-1996.tex:

\begingroup
% Test whether we received one or two arguments
\def\testengine#1#2!{\def\secondarg{#2}}
% That's Tau (as in Taco or ΤΕΧ, Tau-Epsilon-Chi), a 2-byte UTF-8 character
\testengine Τ!\relax
% Unicode-aware engine (such as XeTeX or LuaTeX) only sees a single (2-byte)
% argument
\ifx\secondarg\empty
    \message{UTF-8 German Hyphenation Patterns (Reformed Orthography)}
    \input hyph-de-1996.tex
\else
    \message{German Hyphenation Patterns (Reformed Orthography)}
    % Kept for the sake of backward compatibility, but newer and better
    % patterns by WL are available.
    \input dehyphn.tex
\fi
\endgroup
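
For comparison, one conceivable alternative to the byte-counting test above is to check for engine-specific primitives. The following is only a sketch under assumptions: it presumes that \XeTeXrevision is defined only by XeTeX and \directlua only by LuaTeX, and the helper name \unicodeengine is made up for illustration. It is not what hyph-utf8 actually does, and it would miss any other Unicode-aware engine, which the Tau test handles without naming engines.

```tex
\begingroup
% Hypothetical helper: 0 = 8-bit engine assumed, 1 = Unicode-aware engine.
\def\unicodeengine{0}
% \csname...\endcsname makes an undefined name equal to \relax; the
% side effect stays local because of the surrounding group.
% Assumption: only XeTeX defines \XeTeXrevision.
\expandafter\ifx\csname XeTeXrevision\endcsname\relax\else
    \def\unicodeengine{1}
\fi
% Assumption: only LuaTeX defines \directlua.
\expandafter\ifx\csname directlua\endcsname\relax\else
    \def\unicodeengine{1}
\fi
\ifnum\unicodeengine=1
    \message{UTF-8 German Hyphenation Patterns (Reformed Orthography)}
    \input hyph-de-1996.tex
\else
    \message{German Hyphenation Patterns (Reformed Orthography)}
    \input dehyphn.tex
\fi
\endgroup
```

The trade-off is that a primitive test identifies engines by name, whereas the Tau test checks how the engine actually reads a multi-byte character, which is arguably the property that matters for choosing between UTF-8 and 8-bit patterns.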


For TeX Live 2010, I hope we can agree on the following goals:

   * 8-bit TeX
       No change (ever).  Load traditional patterns in the format.
       Experimental patterns are provided by package dehyph-exptl.

   * XeTeX
       No change.  Load experimental patterns, by default.  Make them
       available as traditional languages 'german' and 'ngerman'.

   * LuaTeX
       Load experimental patterns from new language.dat.lua, by default.
       Make them available as traditional languages 'german' and
       'ngerman'.


What does that mean for German patterns?  Not much, fortunately:

   * For 8-bit TeX, language.dat has correct entries for the following
     languages:

        german,
        ngerman,

        and

        german-x-2009-06-19,
        german-x-latest,
        ngerman-x-2009-06-19,
        ngerman-x-latest.

   * Apart from my question about the 8/16-bit switch above, all is
     well with XeTeX, too.

   * LuaTeX loads languages from language.dat.lua.  Both a proper
     entry there and a plain-text version of the pattern files are
     required.  Since our patterns are already in hyph-utf8, Mojca has
     already done all the necessary work.

     The question is whether there should be an entry for languages
     'german-x-<date>' etc.  I'd say no, and I'll emphasize in our
     documentation that package dehyph-exptl is not required for
     LuaTeX (and XeTeX).

     The only thing for us to do is to remember to provide plain-text
     versions of the patterns, too, in future releases.  Did I miss
     something?


I'm sorry for the confusion about the state of German patterns. There must have been some. At least, I have learnt much about pattern loading during the last weeks. Comments and corrections are welcome (hence the lengthy mail)!

Best regards,
Stephan Hennig
