> I have been working on an international standard for hyphenation pattern
> definitions. The draft can be found here:
> https://github.com/OpenTaal/hyphenation-definitions For that I have been
> collecting many real-world examples for which patgen is not suitable.
> This will form the specifications for a next-generation patgen. A
> community on whose work this RFC is based is already working on better
> hyphenation. If you need more info, please contact me.

Thank you very much, this is really interesting! A few remarks:

I don't believe it's very accurate to say that only Babel and Polyglossia will be affected by new types of patterns. The main change will be in the engine, and it will certainly be only in XeTeX and LuaTeX. Babel and Polyglossia just tell the engine which set of patterns to use; they don't really know anything about hyphenation, and it's even possible that they won't have to change anything. But that's not really important.

Also, the current problem (hyphenating "co-mi-da" but not "come") cannot really be solved at the hyphenation definition level.

Something I didn't understand is how many different weights you can have in a word in a hyphenation definition. Two is clearly possible in your document, but it's not clear whether more are allowed, or whether there is an upper limit. The document would be clearer if this were specified.

Is there a pattern definition corresponding to this new hyphenation definition? Or could it be translated to the current pattern format?

If I understand correctly, from TeX's point of view this would mean adding the following penalties:

- \hyphenpenaltyweighttwo
- \hyphenpenaltyweightthree
- \hyphenpenaltyweightfour
- \hyphenpenaltyweightfive
- \hyphenpenaltysuffix
- \hyphenpenaltyprefix
- \hyphenpenaltycompound
- \hyphenpenaltycompoundsuffix
- \hyphenpenaltycompoundprefix
- \hyphenpenaltycompoundinterfix
- \hyphenpenaltyunfavorable

Is that right? Of course, this would only make sense if there were a pattern definition that TeX could read.

Thank you,
-- Elie
