Luca, I have noticed this problem too, and it is nice to see that you are taking it on. I think it would be better to submit this item as a patch of its own, since it really is a new issue.
Simon

On Wed, Apr 14, 2004 at 07:00:10PM -0000, [EMAIL PROTECTED] wrote:
> ------- Additional Comments From [EMAIL PROTECTED] 2004-04-14 19:00 -------
> Yes, I was quite surprised to see that all the information stored in the
> BreakPoss was "thrown away" before adding areas; I chose to duplicate the
> needed values because this involved fewer and less radical changes.
>
> I have found another small bug concerning hyphenation in the
> HyphenationTree.hyphenate() method.
> Before checking the exception list or using the algorithm, the
> function "normalizes" the word: during this phase, if a non-letter character
> is found, null is returned.
>
>     // normalize word
>     char[] c = new char[2];
>     for (i = 1; i <= len; i++) {
>         c[0] = w[offset + i - 1];
>         int nc = classmap.find(c, 0);
>         if (nc < 0) {    // found a non-letter character, abort
>             return null;
>         }
>         word[i] = (char)nc;
>     }
>
> I think the condition (nc < 0) is too strong: at the moment words followed by
> punctuation marks, or in parentheses, are not hyphenated.
>
> This is how I tried to fix this problem:
> - non-letter characters at the beginning are not copied into word[]
> - if a non-letter character is found which is not at the beginning, it is not
>   copied into word[] and a boolean variable becomes true
> - if a letter-character is found when the variable is true, null is returned;
>   otherwise, word[] is used to find hyphenation points
>
> I have also added a little optimization: if, after the normalization and the
> non-letter character removal, the word size is less than (remainCharCount +
> pushCharCount), null is returned, without checking the exception list and
> performing the algorithm.
>
> I'm going to attach the proposed patch and a test fo file which shows a few
> examples.
>
> Regards
>
>     Luca

-- 
Simon Pepping
home page: http://www.leverkruid.nl
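
For illustration, the relaxed normalization described in the quoted comment might look roughly like the sketch below. It is not the attached patch. The class name NormalizeSketch, the classOf() helper (a stand-in for classmap.find(), built on Character.isLetter) and the StringBuilder used in place of the word[] array are all assumptions, made only to keep the example self-contained.

```java
class NormalizeSketch {

    // Hypothetical stand-in for classmap.find(): returns the normalized
    // character code, or -1 for a non-letter character.
    static int classOf(char ch) {
        return Character.isLetter(ch) ? Character.toLowerCase(ch) : -1;
    }

    // Returns the normalized letters of the word, or null if the word
    // should not be hyphenated.
    static String normalize(char[] w, int offset, int len,
                            int remainCharCount, int pushCharCount) {
        StringBuilder word = new StringBuilder(len);
        boolean nonLetterSeen = false; // non-letter found after the first letter
        for (int i = 0; i < len; i++) {
            int nc = classOf(w[offset + i]);
            if (nc < 0) {
                // Leading non-letters are skipped silently; later ones
                // are skipped too, but remembered.
                if (word.length() > 0) {
                    nonLetterSeen = true;
                }
                continue;
            }
            if (nonLetterSeen) {
                // A letter after an inner non-letter: give up, as before.
                return null;
            }
            word.append((char) nc);
        }
        // Too short to hyphenate: skip the exception list and the algorithm.
        if (word.length() < remainCharCount + pushCharCount) {
            return null;
        }
        return word.toString();
    }

    public static void main(String[] args) {
        char[] text = "(hyphenation),".toCharArray();
        System.out.println(normalize(text, 0, text.length, 2, 2)); // prints "hyphenation"
    }
}
```

With this shape, a word wrapped in parentheses or followed by punctuation still reaches the hyphenation step, while a word with a non-letter in the middle (for example "well-known") is rejected exactly as it is today.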