On Jul 9, 2007, at 22:30, J.Pietschmann wrote:
> a_l.delmelle wrote in a bugzilla entry:
>> Hyphenation is, in fact, only applicable to pure alphabetical
>
> Well, no. The pattern based hyphenator can deal with any Unicode
> characters (apart from digits, whitespace and the dot, which have
> a special meaning in the pattern definitions). If the word parser
> used the character classes from the active pattern file for
> parsing words, basically anything could be used. This would only
> need a proper interface for retrieving the character classes. The
> class canonicalization could even be folded into the parsing process
> for better performance.
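
To make the suggestion above concrete, here is a rough sketch of a word parser driven by character classes, with canonicalization folded into the same pass. It is not FOP code: the function names, the class list and the convention that the first character of each class is its canonical form are all assumptions made for illustration.

```python
# Hypothetical sketch, not FOP's API: every name and the sample
# class list below are assumptions made for illustration.

def build_class_map(classes):
    """Map each character of a class to the class's first character,
    which is assumed here to be the canonical form."""
    canon = {}
    for group in classes:
        for ch in group:
            canon[ch] = group[0]
    return canon


def parse_words(text, canon):
    """Split text into words in a single pass.

    A word is a maximal run of characters that appear in the class
    map. Each character is canonicalized as it is read, so no second
    pass over the word is needed.
    Returns a list of (start_offset, canonical_word) pairs.
    """
    words = []
    start = None
    buf = []
    for i, ch in enumerate(text):
        if ch in canon:
            if start is None:
                start = i
            buf.append(canon[ch])
        elif start is not None:
            # A character outside every class ends the current word.
            words.append((start, "".join(buf)))
            start, buf = None, []
    if start is not None:
        words.append((start, "".join(buf)))
    return words


# Made-up classes: three letters plus a hyphen that counts as a
# word character because the (imaginary) pattern file lists it.
sample_classes = ["aA", "bB", "cC", "-"]
sample_map = build_class_map(sample_classes)
sample_words = parse_words("Ab-c, BA", sample_map)
```

With these made-up classes the hyphen is part of a word while the comma and the space are not, so what counts as a word is decided entirely by the pattern file rather than by a fixed notion of "letter".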
OK, I see the possibilities. The fact that digits have this special
meaning in the patterns does have its reasons, though.
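
For reference, a minimal sketch of why digits are special, assuming Liang-style patterns as used by TeX: a digit in a pattern is a weight for the gap between two characters, the dot marks a word boundary, and an odd final weight allows a break. The patterns below are made up and the function names are hypothetical; this is not FOP's implementation.

```python
# Hypothetical sketch of Liang-style pattern matching. The patterns
# used in the examples are invented, not taken from a real file.

DIGITS = "0123456789"


def parse_pattern(pattern):
    """Split a pattern into its characters and its gap weights.

    weights[k] is the weight of the gap before character k, so there
    is one more weight than there are characters.
    """
    letters = []
    weights = [0]
    for ch in pattern:
        if ch in DIGITS:
            weights[-1] = int(ch)  # a digit is never a pattern character
        else:
            letters.append(ch)
            weights.append(0)
    return "".join(letters), weights


def hyphenation_points(word, patterns):
    """Return every index i such that a break is allowed before word[i]."""
    dotted = "." + word + "."  # the dot marks the word boundaries
    scores = [0] * (len(dotted) + 1)
    for pattern in patterns:
        letters, weights = parse_pattern(pattern)
        start = dotted.find(letters)
        while start != -1:
            for offset, weight in enumerate(weights):
                pos = start + offset
                scores[pos] = max(scores[pos], weight)
            start = dotted.find(letters, start + 1)
    # scores[k] belongs to the gap before dotted[k], i.e. before
    # word[k - 1]. Only gaps strictly inside the word are candidates.
    return [k - 1 for k in range(2, len(dotted) - 1) if scores[k] % 2 == 1]


points_plain = hyphenation_points("abcab", ["a1b", "b2c"])
points_overridden = hyphenation_points("abcab", ["a1b", "b2c", "a2bc"])
```

Since every digit is consumed as a weight, a pattern such as `12` contains no characters at all, which is why a run of digits cannot be described in this notation without changing the pattern format.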
I have yet to encounter a text in which anything was hyphenated but
words. Dates or timestamps? Digits? Serial numbers? E-mail addresses?
I have never seen any of those hyphenated. Wrapped, sometimes, but never
hyphenated.
I looked around a bit, and combining 'hyphenation' and 'numbers' only
pointed me toward hyphenation /of/ numbers when they are spelled out
completely, as words.
So what I meant by that statement was: Hyphenation makes sense only
in the context of written text, that is, in relation to a dictionary.
It seems to me the reporter is wrong to expect that sequence of 80+
digits to be hyphenated under any circumstances, and even the comma-
case... It is easy enough to come up with such oddities, but when would you
ever really need that? And more importantly: Is it really hyphenation
you would need then?