Jamis Buck <[EMAIL PROTECTED]> writes:

> I know TeX has an algorithm for hyphenating words, and fop
> (Formatting Objects to PDF, on the Apache site) uses the same
> algorithm. Seeing how it's all open-source, mightn't it be possible
> to port that algorithm to Python and use the same pattern files that
> TeX and fop use? If that turns out to be possible, then you've
> already got multiple languages taken care of, since I know fop, at
> least, has many different language hyphenation files.

TeX applies pattern matching to find candidates for hyphenation, and
its line breaker then weighs each candidate against the badness of the
resulting lines. This means that it will sometimes stretch the text
slightly to avoid a poor hyphenation.

Short excerpt from the English patterns: "." means word beginning or
word end. The digits mark positions between letters: an odd number
permits a hyphen at that position, an even number forbids one, and
where patterns overlap the highest number wins. % introduces my
comments.

.co3e     % co-ercion
.co4r     % no break after "co" before "r" (so not co-rduroy)
.cor5ner  % cor-nerstone
.de4moi   % no break after "de" before "moi"

You see that cornerstone gets its break at cor-nerstone from the 5 in
.cor5ner, while the 4 in .co4r rules out co-rnerstone.

There are rule sets like this for many languages. The English set
consists of 4400 patterns, the Norwegian set of 1500. Clearly not
something to put in the viewer...

Kjetil T.
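
P.S. To make the pattern lookup concrete, here is a minimal Python sketch of it. It is not a port of TeX's code. It assumes the usual reading of the patterns (odd digit permits a break, even digit forbids one, highest digit wins at each position) and uses only the four patterns from the excerpt. The names parse_pattern and hyphenate are made up for illustration, and the left_min/right_min defaults mirror TeX's default \lefthyphenmin=2 and \righthyphenmin=3.

```python
import re

# Only the four patterns from the excerpt; a real pattern file has thousands.
PATTERNS = [".co3e", ".co4r", ".cor5ner", ".de4moi"]


def parse_pattern(pattern):
    """Split a pattern such as '.cor5ner' into its letters and gap digits.

    points[i] is the digit standing in front of letters[i]; the final
    entry is the digit after the last letter. Missing digits count as 0.
    """
    letters = re.sub(r"\d", "", pattern)
    points = [0] * (len(letters) + 1)
    pos = 0
    for ch in pattern:
        if ch.isdigit():
            points[pos] = int(ch)
        else:
            pos += 1
    return letters, points


def hyphenate(word, patterns=PATTERNS, left_min=2, right_min=3):
    """Return the pieces of `word`, split at the permitted break points."""
    text = "." + word.lower() + "."
    # levels[j] is the highest digit seen for the gap in front of text[j].
    levels = [0] * (len(text) + 1)
    for letters, points in map(parse_pattern, patterns):
        start = text.find(letters)
        while start != -1:
            for offset, value in enumerate(points):
                levels[start + offset] = max(levels[start + offset], value)
            start = text.find(letters, start + 1)

    pieces, last = [], 0
    # Keep at least left_min letters before and right_min letters after a break.
    for i in range(left_min, len(word) - right_min + 1):
        # The gap in front of word[i] is levels[i + 1] because of the leading ".".
        if levels[i + 1] % 2 == 1:
            pieces.append(word[last:i])
            last = i
    pieces.append(word[last:])
    return pieces


if __name__ == "__main__":
    for w in ("coercion", "cornerstone", "corduroy"):
        print("-".join(hyphenate(w)))
```

With only these four patterns, corduroy comes out unbroken: .co4r blocks the break after "co", and nothing in this tiny set proposes another one. A full pattern file would be loaded into PATTERNS the same way, though a trie or dict lookup would be a better fit than the linear scan used here.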
