Hi,

2008/3/24 Rene Engelhard <[EMAIL PROTECTED]>:
> [ more or less fullquoting for upstream ]
>
> Hi,
>
> Steve Wolter wrote:
> > after some not-so-pleasant tries to use and then to patch libhyphen
> > (back then it was called libhnj) early last year, I've reimplemented
> > the algorithm in C++ in a library called libhyphenate.
> >
> > I deem libhyphenate considerably easier to use. In addition to all
> > libhyphen features, it supports system-central storage of hyphenation
> > pattern files, hyphenation at all possible hyphenation points,
> > hyphenation of text such that it optimally fits a given width (in
> > characters) and hyphenation using the libhyphen-style hyphens array.
> >
> > In addition, it fixes the libhyphen TODO of handling UTF-8 characters
> > and the not yet filed bug that, for some languages, a hyphenation-free
> > zone at the start and end of each word is needed to hyphenate correctly.
> >
> > If you want to have a look yourself and test this (bold) claim, the
> > source code can be found at:
> > http://swolter.sdf1.org/libhyphenate_1-current.tar.gz
> >
> > In order to avoid having two libraries around doing essentially the
> > same thing, I've reimplemented the public libhnj/libhyphen interface
> > for libhyphenate. You can find the implementation at:
> > http://swolter.sdf1.org/libhyphen-hyphenate-1.0.tar.gz
> >
> > What do you think of the work?
>
> I don't think we (as the Debian maintainers) are the persons to decide
> whether/when upstream will switch to libhyphenate.
>
> I think we should involve upstream (and the author of libhyphen) in this
> (Cced)
> Lazlo/tl, what do you think of this?

If I remember correctly, the original LibHnj library also has algorithms for justification, including for variable-width characters. You can check this in the LibHnj package of Debian. The aim of the hyphenation development at the Lingucomponent project is to develop a competitive hyphenation algorithm and library for OpenOffice.org and other applications. Justification (typesetting paragraphs) is a different task (maybe with more complex problems; for example, see this illustrated paper about mathematical typesetting in TeX: http://www.tug.org/TUGboat/Articles/tb27-1/tb86jackowski.pdf).

I have checked your code with non-standard hyphenation (only the current distribution), but it doesn't work for me. Hyphen 2.4 (http://downloads.sourceforge.net/hunspell/hyphen-2.4.tar.gz) has internal hyphenmin support (the main reason for your development). Moreover, it has other new features for better German hyphenation: compound word hyphenation and compound hyphenmins. I believe this is a significant improvement in pattern-based hyphenation for languages with an arbitrary number of compounds.

There is a related NLP library for centralizing spelling dictionaries and spelling engines: Enchant from the Abiword project, see http://www.abisource.com/projects/enchant/. I believe that extending its API and code with hyphenation is the better route to system-central storage, since it reduces the cost of the common tasks (portability issues, dependency problems, registration and listing of the available dictionaries, handling private dictionaries, character encoding issues etc.). It would be great if you could help with this task.

Adding hyphenation to the Mozilla code base and its applications would also be a good task. The basic requirements are the license (GPL/LGPL/MPL tri-license) and the CSS standard for hyphenation. (See http://www.w3.org/TR/css3-text/#hyphenate for the upcoming hyphenation support in CSS.) The Hyphen library would have a big advantage in the Mozilla integration: it is only 37 kB.
(Small code size is crucial for Mozilla development.)

Regards,
László

> > If you find it workable, I'd love to try and test whether it works
> > properly in the current OpenOffice environment. If not, however,
>
> Well, it's OpenOffice.org, but yes, you can try. If it's indeed a 1:1
> replacement and you can make sure it works as intended we can think of
> trying it. But I don't see the need for a hurry for it now.
>
> > I'd like to point out that libhyphenate needs a Debian sponsor ;-).
>
> This should be doable :)
>
> Regards.
>
> Rene
