For some weeks now, I have been preparing to create new hyphenation patterns for Dutch, including support for words whose spelling changes when they are hyphenated.

Normally, one generates patterns from a hyphenation file, which is in the rather simple format of hyphenated words (ex-am-ple).

This format is clearly not sufficient to express the hyphenation of these changing words. From here on I'll use mostly Dutch examples, though the same applies to German and Greek (at least).

For an input dictionary, I see two alternatives:
1) Just list all possible hyphenations:
ex=ample, exam=ple
omaatje, oma=tje
ruïne, ru=ine
tv-=special,
2) Make special notes for the changes, marking the hyphenation points with special characters such as brackets that contain the optional alteration:
ex[]am[]ple
oma[a=]tje   (remove one a when hyphenating)
ru[ï=i]ne    (change ï into i when hyphenating)
tv-[-=]special (remove the -; the hyphenation process inserts another one)
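
To make the bracket notation concrete, here is a minimal Python sketch of how one entry could be expanded into its unhyphenated word and its possible break points. It is only an illustration under assumptions that go beyond the examples above: the text before "=" is read as what appears when the word is not broken, the text after "=" as what starts the next line when it is broken at that point, and a bracket without "=" as a plain break with no change. The names POINT and parse_entry are hypothetical.

```python
import re

# Assumed delimiters, taken from the examples above: "[...]" marks a
# hyphenation point and "=" separates the plain text from its replacement.
POINT = re.compile(r"\[([^\[\]=]*)(?:=([^\[\]=]*))?\]")


def parse_entry(entry):
    """Return (plain_word, variants) for one bracket-notation entry.

    variants holds one (left, right) pair per hyphenation point: the text
    before the line break (without the hyphen) and the text after it.
    """
    # Split the entry into literal text (str) and hyphenation points
    # (tuples of plain text and text used when breaking here).
    parts = []
    pos = 0
    for m in POINT.finditer(entry):
        parts.append(entry[pos:m.start()])
        plain_text = m.group(1)
        # Assumption: a bracket without "=" means "no change when breaking".
        broken_text = m.group(2) if m.group(2) is not None else plain_text
        parts.append((plain_text, broken_text))
        pos = m.end()
    parts.append(entry[pos:])

    def plain(items):
        # Join the pieces as they appear when no break happens.
        return "".join(p if isinstance(p, str) else p[0] for p in items)

    variants = []
    for i, part in enumerate(parts):
        if isinstance(part, tuple):
            left = plain(parts[:i])
            right = part[1] + plain(parts[i + 1:])
            variants.append((left, right))
    return plain(parts), variants


for entry in ("ex[]am[]ple", "oma[a=]tje", "ru[ï=i]ne"):
    print(entry, "->", parse_entry(entry))
```

Each pair gives the two halves of the word around one break, so a generator could derive both the plain word and every changed form from a single entry.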

For the first and most common case, a plain hyphenation point without any change, a shorthand consisting of a single character could be introduced, saving one character each time (is that worth it?). The characters for the brackets and the hyphenation mark could be 'declared' in the file header, leading to a format like:

[]   #hyphenation area
=   #hyphenation char
ru[ï=i]ne   #example comment
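
A reader for such a file could look roughly like the Python sketch below. It assumes details the example does not settle: that the first two non-empty lines are always the bracket pair and the hyphenation character, in that order, and that "#" starts a comment running to the end of the line. The function names strip_comment and read_dictionary are hypothetical.

```python
def strip_comment(line):
    """Drop a trailing '#' comment and the surrounding whitespace."""
    return line.split("#", 1)[0].strip()


def read_dictionary(lines):
    """Return (declarations, entries) from the lines of a dictionary file.

    Assumed layout: line 1 holds the two bracket characters, line 2 the
    hyphenation character, and every later line holds one entry.
    """
    content = [s for s in map(strip_comment, lines) if s]
    if len(content) < 2 or len(content[0]) != 2 or len(content[1]) != 1:
        raise ValueError("expected a bracket pair line and a hyphenation char line")
    open_ch, close_ch = content[0]
    declarations = {"open": open_ch, "close": close_ch, "sep": content[1]}
    return declarations, content[2:]


sample = [
    "[]   #hyphenation area",
    "=   #hyphenation char",
    "ru[ï=i]ne   #example comment",
]
print(read_dictionary(sample))
```

Declaring the characters up front lets a language pick delimiters that do not occur in its own words; one consequence of this sketch is that "#" itself could not be used as a declared character.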

* Would more languages than Dutch have use for a format like this?
* Would it be feasible to base a pattern generator on this format?

* What are the general thoughts about trying to set a standard for recording hyphenation?

Please feel free to comment on this.


Ruud Baars






