Hello,

I work for a religious organization that produces publications in several 
languages spoken by our members throughout the world.  Over the years, we have 
developed UTF-8 encoded hyphenated word lists for 91 different languages.  We 
use these word lists to create proprietary hyphenation software.  We would like 
to use these lists to create hyphenation pattern files that can be used with 
more traditional software such as TeX and OpenOffice applications.

It appears that hyphenation pattern files are currently created by running 
patgen on tokenized word lists and then converting the final output to UTF-8.  
Unfortunately, some of the languages we work with are complex enough to exceed 
patgen's 256-character limit.
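
To make the tokenization workflow mentioned above concrete, here is a minimal 
Python sketch.  It is only an illustration under stated assumptions: the token 
range (0x80-0xFF) and the helper names build_token_map, tokenize and detokenize 
are hypothetical and are not part of patgen or any existing tool.  Each distinct 
letter in a word list gets a single-byte stand-in, and the mapping is reversed 
on the generated patterns afterwards.

```python
# Illustrative sketch only: the token range and helper names are assumptions,
# not part of patgen or any existing tool.
HYPHEN = "-"

# Hypothetical pool of single-byte codes that may stand in for letters.
TOKEN_POOL = [chr(code) for code in range(0x80, 0x100)]


def build_token_map(words):
    """Assign one single-byte token to each distinct character in the list."""
    letters = sorted({ch for word in words for ch in word if ch != HYPHEN})
    if len(letters) > len(TOKEN_POOL):
        # This is where a large script runs into the single-byte limit.
        raise ValueError(
            f"{len(letters)} distinct characters do not fit "
            f"in {len(TOKEN_POOL)} tokens"
        )
    return dict(zip(letters, TOKEN_POOL))


def tokenize(word, token_map):
    """Replace every character except the hyphen marker with its token."""
    return "".join(ch if ch == HYPHEN else token_map[ch] for ch in word)


def detokenize(text, token_map):
    """Map tokens back to the original characters; leave digits, dots and
    hyphens (as found in pattern output) untouched."""
    reverse = {token: ch for ch, token in token_map.items()}
    return "".join(reverse.get(ch, ch) for ch in text)
```

A tokenized list could then be written out with an 8-bit encoding such as 
Latin-1, and the resulting patterns passed through detokenize before being 
saved as UTF-8.  Note that this sketch counts each Unicode code point 
(including combining marks) as a separate character, so the ValueError branch 
is exactly the limit described above.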

Like others, I have tried unsuccessfully to build opatgen with the current 
version of gcc.  Hunting for gcc version 2.96 in the hope that it will work 
doesn't make sense, especially when there are reports that opatgen has some 
serious reliability and performance issues.  I applaud David Antos for his 
research and development of opatgen, and I find it fascinating that his work 
has not been adopted and enhanced by the open source community.

Is using patgen with tokenized word lists and converting the output to UTF-8 
really the only viable way to create pattern files?

Steve Dickson
The Church of Jesus Christ of Latter-day Saints
Publishing Services Department
50 East North Temple Street
Salt Lake City, Utah  84150
Email: [email protected]


