DO NOT REPLY [Bug 27773] - [PATCH] Hyphenation
DO NOT REPLY TO THIS EMAIL, BUT PLEASE POST YOUR BUG RELATED COMMENTS THROUGH THE WEB INTERFACE AVAILABLE AT http://issues.apache.org/bugzilla/show_bug.cgi?id=27773. ANY REPLY MADE TO THIS MESSAGE WILL NOT BE COLLECTED AND INSERTED IN THE BUG DATABASE. http://issues.apache.org/bugzilla/show_bug.cgi?id=27773 [PATCH] Hyphenation [EMAIL PROTECTED] changed: What|Removed |Added Status|NEW |RESOLVED Resolution||FIXED --- Additional Comments From [EMAIL PROTECTED] 2004-05-29 08:56 --- Applied. Thanks. Simon
DO NOT REPLY [Bug 27773] - [PATCH] Hyphenation
DO NOT REPLY TO THIS EMAIL, BUT PLEASE POST YOUR BUG RELATED COMMENTS THROUGH THE WEB INTERFACE AVAILABLE AT http://issues.apache.org/bugzilla/show_bug.cgi?id=27773. ANY REPLY MADE TO THIS MESSAGE WILL NOT BE COLLECTED AND INSERTED IN THE BUG DATABASE. http://issues.apache.org/bugzilla/show_bug.cgi?id=27773 [PATCH] Hyphenation --- Additional Comments From [EMAIL PROTECTED] 2004-04-14 19:00 --- Yes, I was quite surprised to see that all the information stored in the BreakPoss was thrown away before adding areas; I chose to duplicate the needed values because this involved fewer and less radical changes. I have found another small bug concerning hyphenation in the HyphenationTree.hyphenate() method. Before checking the exception list or using the algorithm, the function normalizes the word: during this phase, if a non-letter character is found null is returned. // normalize word char[] c = new char[2]; for (i = 1; i = len; i++) { c[0] = w[offset + i - 1]; int nc = classmap.find(c, 0); if (nc 0) {// found a non-letter character, abort return null; } word[i] = (char)nc; } I think the condition (nc 0) is too strong: at the moment words followed by punctuation marks, or in parenthesis, are not hyphenated. This is how I tried to fix this problem: - non-letter characters at the beginning are not copied into word[] - if a non-letter character is found which is not at the beginning, it is not copied into word[] and a boolean variable becomes true - if a letter-character is found when the variable is true, null is returned; otherwise, word[] is used to find hyphenation points I have also added a little optimization: if, after the normalization and the non-letter character removal, the word size is less than (remainCharCount + pushCharCount), null is returned, without checking the exception list and performing the algorithm. I'm going to attach the proposed patch and a test fo file which shows a few examples. Regards Luca
DO NOT REPLY [Bug 27773] - [PATCH] Hyphenation
DO NOT REPLY TO THIS EMAIL, BUT PLEASE POST YOUR BUG RELATED COMMENTS THROUGH THE WEB INTERFACE AVAILABLE AT http://issues.apache.org/bugzilla/show_bug.cgi?id=27773. ANY REPLY MADE TO THIS MESSAGE WILL NOT BE COLLECTED AND INSERTED IN THE BUG DATABASE. http://issues.apache.org/bugzilla/show_bug.cgi?id=27773 [PATCH] Hyphenation --- Additional Comments From [EMAIL PROTECTED] 2004-04-14 19:03 --- Created an attachment (id=11238) test fo file: words with punctuation marks and parenthesis
DO NOT REPLY [Bug 27773] - [PATCH] Hyphenation
DO NOT REPLY TO THIS EMAIL, BUT PLEASE POST YOUR BUG RELATED COMMENTS THROUGH THE WEB INTERFACE AVAILABLE AT http://issues.apache.org/bugzilla/show_bug.cgi?id=27773. ANY REPLY MADE TO THIS MESSAGE WILL NOT BE COLLECTED AND INSERTED IN THE BUG DATABASE. http://issues.apache.org/bugzilla/show_bug.cgi?id=27773 [PATCH] Hyphenation --- Additional Comments From [EMAIL PROTECTED] 2004-04-09 09:28 --- Hi Luca, Your patch looks good. It works fine on your test fo, and it also works fine on a test fo I had lying around. There are a few things to note: AreaInfo.bHyphenated is identical to (flags BreakPoss.HYPHENATED); ai.ipdArea = MinOptMax.add(ai.ipdArea, new MinOptMax(hyphIPD)) is equivalent to bp.setStackingSize(MinOptMax.add(ipd, new MinOptMax(hyphIPD))). In other words, the same information is present in the BreakPoss, which is held by the LineLM. Unfortunately, this information is not accessible in the addAreas method of TextLM. This is so because posIter.next() returns the position member of the BreakPoss, not the BreakPoss itself. I believe that is an error in PositionIterator which eventually needs correction. For now your solution is a good one. Your proposed TextInfo.hypChar is also known as LineLM.hyphProps.hyphenationChar. This information is not passed on to childLMs and thus not known to TextLM. Storing it in TextInfo seems a good idea; perhaps it is not even needed in hyphProps. It is not completely clear that +if (ai.bHyphenated) { +str += foText.textInfo.hyphChar; +ai.ipdArea = MinOptMax.add(ai.ipdArea, new MinOptMax(hyphIPD)); +} is always correct. This happens on the last AI returned by posIter.next(), which is not necessarily the last one in the line. But it is hard to conceive of an AI which is last in its LM and also hyphenated; therefore it will work. Ideally, the LineLM indicates which BP is the one ending the line; perhaps this needs correction at some time. Regards, Simon
DO NOT REPLY [Bug 27773] - [PATCH] Hyphenation
DO NOT REPLY TO THIS EMAIL, BUT PLEASE POST YOUR BUG RELATED COMMENTS THROUGH THE WEB INTERFACE AVAILABLE AT http://issues.apache.org/bugzilla/show_bug.cgi?id=27773. ANY REPLY MADE TO THIS MESSAGE WILL NOT BE COLLECTED AND INSERTED IN THE BUG DATABASE. http://issues.apache.org/bugzilla/show_bug.cgi?id=27773 [PATCH] Hyphenation --- Additional Comments From [EMAIL PROTECTED] 2004-04-05 09:22 --- Hi all I apologize for having posted some proposed changes without a line explaining what I was changing and why. Really sorry, I'm going to try and do it now. First of all, the condition now tested in the LineLayoutManager.getNextBreakPoss() to decide whether or not to try hyphenation is: if (bTextAlignment == TextAlign.JUSTIFY || prevBP == null) { ... } So hyphenation is tried for *every* justified block of text; only if the language attribute is not set the getHyphenContext() method returns null and hyphenation is not actually done. My suggestion is to change it so that the fo property hyphenate (7.9.4 in the recommendation, and already implemented in the CommonHyphenation class) is tested instead: if (hyphProps.hyphenate == Constants.TRUE) { ... } In the TextInfo class there is a boolean attribute called bCanHyphenate, but it is given a default value of true, and it is no more modified. It seems to me that it is quite useless. Is is used in the TextLayoutManager.getHyphenIPD() method: if (textArray.length iStopIndex || foText.textInfo.bCanHyphenate == false) { ... } but this method is called only if context.tryHyphenate() is true; with my suggested change, context.tryHyphenate() returns true only if the fo property hyphenate is true, and so it is no more necessary to test foText.textInfo.bCanHyphenate. The second problem is that the hyphenation character is not shown: to fix this I added in the AreaInfo class (which stores information about the area that will be generated) a boolean attribute bHyphenated. Is is set according to the value of the flags of the corresponding BreakPoss, and it is tested in the TextLayoutManager.addAreas() method to decide whether to add the hyphen. Last (and least): I tried to implement the fo property hyphenation-character (7.9.5 in the recommendation). As it applies to blocks, I thought to store it as a character variable in the TextInfo class, with initial value -. It is set in the PropertyManager.getTextLayoutProps() method, together with the other TextInfo attribute. Is is used to calculate the hyphIPD in the TextLayoutManager constructor, and in the addAreas method. Bye Luca
DO NOT REPLY [Bug 27773] - [PATCH] Hyphenation
DO NOT REPLY TO THIS EMAIL, BUT PLEASE POST YOUR BUG RELATED COMMENTS THROUGH THE WEB INTERFACE AVAILABLE AT http://issues.apache.org/bugzilla/show_bug.cgi?id=27773. ANY REPLY MADE TO THIS MESSAGE WILL NOT BE COLLECTED AND INSERTED IN THE BUG DATABASE. http://issues.apache.org/bugzilla/show_bug.cgi?id=27773 [PATCH] Hyphenation --- Additional Comments From [EMAIL PROTECTED] 2004-04-05 16:40 --- Sorry, I'm here again (hope I am not making too much a mess!) :-) I'm going to attach an updated proposed patch, as I noticed there was an error in the one I posted before. In the TextLayoutManager.addAreas() method, the hyphen size must be added to the total inline progression dimension of the line *before* the adjustment calculation; this involves also moving a few lines of code. Bye Luca
DO NOT REPLY [Bug 27773] - [PATCH] Hyphenation
DO NOT REPLY TO THIS EMAIL, BUT PLEASE POST YOUR BUG RELATED COMMENTS THROUGH THE WEB INTERFACE AVAILABLE AT http://issues.apache.org/bugzilla/show_bug.cgi?id=27773. ANY REPLY MADE TO THIS MESSAGE WILL NOT BE COLLECTED AND INSERTED IN THE BUG DATABASE. http://issues.apache.org/bugzilla/show_bug.cgi?id=27773 [PATCH] Hyphenation --- Additional Comments From [EMAIL PROTECTED] 2004-04-05 16:41 --- Created an attachment (id=11138) updated diff
DO NOT REPLY [Bug 27773] - [PATCH] Hyphenation
DO NOT REPLY TO THIS EMAIL, BUT PLEASE POST YOUR BUG RELATED COMMENTS THROUGH THE WEB INTERFACE AVAILABLE AT http://issues.apache.org/bugzilla/show_bug.cgi?id=27773. ANY REPLY MADE TO THIS MESSAGE WILL NOT BE COLLECTED AND INSERTED IN THE BUG DATABASE. http://issues.apache.org/bugzilla/show_bug.cgi?id=27773 [PATCH] Hyphenation --- Additional Comments From [EMAIL PROTECTED] 2004-03-19 11:38 --- Created an attachment (id=10868) patch created using cvs diff
DO NOT REPLY [Bug 27773] - [PATCH] Hyphenation
DO NOT REPLY TO THIS EMAIL, BUT PLEASE POST YOUR BUG RELATED COMMENTS THROUGH THE WEB INTERFACE AVAILABLE AT http://issues.apache.org/bugzilla/show_bug.cgi?id=27773. ANY REPLY MADE TO THIS MESSAGE WILL NOT BE COLLECTED AND INSERTED IN THE BUG DATABASE. http://issues.apache.org/bugzilla/show_bug.cgi?id=27773 [PATCH] Hyphenation --- Additional Comments From [EMAIL PROTECTED] 2004-03-19 11:39 --- Created an attachment (id=10869) FO sample file to test the behavior of fop before and after the patch