Luca,
I have noticed this problem, and it is nice to see that you are taking
it on. I think it would be better to submit this item as a patch of
its own. It really is a new issue.
Simon
On Wed, Apr 14, 2004 at 07:00:10PM -, [EMAIL PROTECTED] wrote:
> --- Additional Comments From [EMAIL PROTECTED] 2004-04-14 19:00 ---
> Yes, I was quite surprised to see that all the information stored in the
> BreakPoss was "thrown away" before adding areas; I chose to duplicate the
> needed values because this involved fewer and less radical changes.
>
> I have found another small bug concerning hyphenation in the
> HyphenationTree.hyphenate() method.
> Before checking the exception list or using the algorithm, the
> function "normalizes" the word: during this phase, if a non-letter character
> is found, null is returned.
> // normalize word
> char[] c = new char[2];
> for (i = 1; i <= len; i++) {
>     c[0] = w[offset + i - 1];
>     int nc = classmap.find(c, 0);
>     if (nc < 0) { // found a non-letter character, abort
>         return null;
>     }
>     word[i] = (char) nc;
> }
> I think the condition (nc < 0) is too strict: at the moment, words followed by
> punctuation marks, or in parentheses, are not hyphenated.
>
> This is how I tried to fix this problem:
> - non-letter characters at the beginning are not copied into word[]
> - if a non-letter character is found which is not at the beginning, it is not
> copied into word[] and a boolean variable becomes true
> - if a letter-character is found when the variable is true, null is returned;
> otherwise, word[] is used to find hyphenation points
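The three steps above could be sketched roughly as follows. This is only an illustration, not the attached patch: the class and method names are hypothetical, and Character.isLetter() and Character.toLowerCase() stand in for the classmap.find() lookup, which is an assumption about what that lookup does.

```java
class NormalizeSketch {
    /**
     * Returns the letters of the word in lower case, with leading and
     * trailing non-letters dropped, or null if a non-letter occurs
     * between two letters.
     * NOTE: Character.isLetter()/toLowerCase() are stand-ins for the
     * classmap lookup in HyphenationTree (an assumption).
     */
    static String normalize(char[] w, int offset, int len) {
        StringBuilder word = new StringBuilder();
        // becomes true once a non-letter is seen after at least one letter
        boolean nonLetterSeen = false;
        for (int i = 0; i < len; i++) {
            char ch = w[offset + i];
            if (Character.isLetter(ch)) {
                if (nonLetterSeen) {
                    return null; // letter after an inner non-letter: abort
                }
                word.append(Character.toLowerCase(ch));
            } else if (word.length() > 0) {
                nonLetterSeen = true; // non-letter that is not at the beginning
            }
            // non-letters at the beginning are simply not copied
        }
        return word.toString();
    }
}
```

With this sketch, "(word)" and "word," both normalize to "word", while "don't" still yields null because a letter follows the apostrophe.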
>
> I have also added a little optimization: if, after the normalization and the
> non-letter character removal, the word size is less than (remainCharCount +
> pushCharCount), null is returned, without checking the exception list and
> performing the algorithm.
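The early exit described above amounts to a single comparison. A minimal sketch, assuming remainCharCount and pushCharCount are the minimum numbers of characters before and after a hyphenation point; the class and method names are hypothetical:

```java
class MinLengthCheck {
    /**
     * True if a word with letterCount letters (counted after normalization
     * and non-letter removal) can contain a hyphenation point at all.
     * Parameter meanings are an assumption; see the note above.
     */
    static boolean longEnoughToHyphenate(int letterCount,
                                         int remainCharCount,
                                         int pushCharCount) {
        return letterCount >= remainCharCount + pushCharCount;
    }
}
```

For example, with remainCharCount = 2 and pushCharCount = 3, a four-letter word is rejected before the exception list is consulted, while a five-letter word goes on to the normal lookup.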
>
> I'm going to attach the proposed patch and a test FO file which shows a few
> examples.
>
> Regards
>
> Luca
--
Simon Pepping
home page: http://www.leverkruid.nl