I have just uploaded new experimental hyphenation patterns for FOP. See http://sourceforge.net/projects/offo: select the Files tab, then either the newest files or the files in offo-hyphenation-utf8/0.1.

From the readme file (index.html in the downloaded zip files):

Recently the TeX community has converted its hyphenation pattern files to utf-8 format. Most of these pattern files can be trivially converted to pattern files in the XML format used by FOP. Therefore the OFFO maintainer joined the maintainers of the TeX hyphenation patterns, and in the future the hyphenation patterns offered by OFFO will be simple conversions from the TeX patterns.

This is the first release of the TeX utf-8 patterns for FOP. There are a few unsolved problems:

Naming: FOP uses the POSIX naming convention ll_CC for language and country. There are a couple of patterns that do not fit into this scheme. When a language uses various alternative scripts, the script name is appended to the file name, e.g. sh_Cyrl and sh_Latn. The user will have to rename the pattern file of his preferred script in the jar file by removing the script suffix. The final solution is probably to merge the patterns for the different scripts into one pattern file. When a language uses various alternative spelling rules, a descriptive suffix is appended to the file name, e.g. de_1901; users who prefer these pattern files over the default ones will have to rename the pattern files in the jar file.

Licenses: No overview of the licenses has been made yet. To find information about the license, one has to look at the comments in the XML or TeX pattern files.

Comments: The conversion from TeX to XML is done by a program. Comments pose a problem, because in TeX the trailing newline is part of the comment. In comment sections in XML this is less desirable, and we have done our best to format comments in a legible way. However, at the moment the formatting is spoiled by text data between comments (usually blank lines), after which all following comments end up on a single line.

Classes: The TeX patterns, and therefore also the XML patterns, do not contain classes, i.e. a list of the characters used in words (Unicode class Letter).
Since 3 September 2009 these classes are built into FOP. Therefore these patterns can only be used with FOP versions created after that date. No release has been made since that date, so for now these patterns only work with code from the Subversion repository.

Not included: There are no separate hyphenation patterns for Norwegian Nynorsk and Norwegian Bokmål. Instead, there is a single pattern file for Norwegian. There are no patterns for Esperanto, because the TeX pattern file is not in a format that can be converted to XML. There are no patterns for Hungarian, because the TeX pattern file contains too many patterns for my machine to compile (stack overflow).

I would appreciate your comments on the usability of these hyphenation patterns.

Regards, Simon

--
Simon Pepping
home page: http://www.leverkruid.eu