Hi,

Now that the format and meaning of language.dat.lua seem stable, it's probably time to decide how to handle it. Here is a brief summary of the situation:

1. For maximal safety, we use a plain text version of patterns and exceptions for dynamic loading; for this we need information not contained in language.{dat,def}, namely the names of the files containing those versions.

2. This information is looked for in a file language.dat.lua.
   (a) Languages without an entry in this file are dumped in the format the plain old way (Knuthian usenglish is always dumped as \language0).
   (b) It is possible to disable dynamic loading (hence loading at all) of a particular language via entries in this file.

3. Currently, the loadable languages are (a subset of) those declared in language.{dat,def}.

4. But in the future, one can imagine languages having an entry in language.dat.lua only, hence being only dynamically loadable in LuaTeX (macro support is yet to be written, but I have ideas for that; it should not be difficult now) without being dumped in other (non-LuaTeX-based) formats.

A special property of language.dat.lua compared to the other files is that it never hurts to have more languages in this file than the user wants activated, due to point 3 (and the new dynamic mechanism). An incomplete language.dat.lua doesn't hurt too much either, due to 2a.

Now, to the best of my knowledge, entries in language.{dat,def} basically come from three sources:
   (a) package hyphen-base
   (b) tex-hyphen (hyph-utf8)
   (c) german-x

Since the first one is basically frozen, and (b) and (c) are very cooperative, it is probably possible to use a monolithic language.dat.lua, shipped with hyph-utf8, using information from their repository and from german-x. The upside is that there is no need to change anything on the TL side. The downside is that when german-x is updated, hyph-utf8 needs to be updated too, and it will become more complicated if there are ever more actors.

Another possibility is to handle language.dat.lua in the same way we handle language.{dat,def} in TL currently.
It would only require new (optional) attributes for the AddHyphen postaction, and the code to handle them of course. Pro: more modular and scalable. Con: needs coding. The new attributes would be:
   patterns=<file with plain text patterns>
   hyphenation=<file with plain text exceptions>
   special=<code for special languages> (optional)
and something to determine whether the language should go to language.{dat,def} only, language.dat.lua only, or both.

An intermediate possibility is to use a monolithic language.dat.lua for now, since it is readily available, and implement the more modular option later. (Pro: nothing to do now. Con: now would be the best moment for me to implement it, since later I'll have to remember things first.)

Wdyt?

Manuel.