DO NOT REPLY [Bug 27773] - [PATCH] Hyphenation

2004-05-29 Thread bugzilla
DO NOT REPLY TO THIS EMAIL, BUT PLEASE POST YOUR BUG 
RELATED COMMENTS THROUGH THE WEB INTERFACE AVAILABLE AT
http://issues.apache.org/bugzilla/show_bug.cgi?id=27773.
ANY REPLY MADE TO THIS MESSAGE WILL NOT BE COLLECTED AND 
INSERTED IN THE BUG DATABASE.

http://issues.apache.org/bugzilla/show_bug.cgi?id=27773

[PATCH] Hyphenation

[EMAIL PROTECTED] changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|NEW                         |RESOLVED
         Resolution|                            |FIXED



--- Additional Comments From [EMAIL PROTECTED]  2004-05-29 08:56 ---
Applied. Thanks.

Simon


Re: [Bug 27773] - [PATCH] Hyphenation

2004-04-16 Thread Luca Furini

 I think it would be better to report this item in a patch of
 its own. It really is a new issue.

Ok, sorry.
I'm going to do as you suggest.

Luca




Re: [Bug 27773] - [PATCH] Hyphenation

2004-04-15 Thread Simon Pepping
Luca,

I have noticed this problem, and it is nice to see that you are taking
it on. I think it would be better to report this item in a patch of
its own. It really is a new issue.

Simon

On Wed, Apr 14, 2004 at 07:00:10PM -, [EMAIL PROTECTED] wrote:
 --- Additional Comments From [EMAIL PROTECTED]  2004-04-14 19:00 ---
 Yes, I was quite surprised to see that all the information stored in the 
 BreakPoss was thrown away before adding areas; I chose to duplicate the 
 needed values because this involved fewer and less radical changes.
 
 I have found another small bug concerning hyphenation in the 
 HyphenationTree.hyphenate() method.
 Before checking the exception list or using the algorithm, the 
 function normalizes the word: during this phase, if a non-letter character 
 is found, null is returned.
     // normalize word
     char[] c = new char[2];
     for (i = 1; i <= len; i++) {
         c[0] = w[offset + i - 1];
         int nc = classmap.find(c, 0);
         if (nc < 0) {    // found a non-letter character, abort
             return null;
         }
         word[i] = (char)nc;
     }
 I think the condition (nc < 0) is too strong: at the moment words followed by 
 punctuation marks, or in parentheses, are not hyphenated.
 
 This is how I tried to fix this problem:
 - non-letter characters at the beginning are not copied into word[]
 - if a non-letter character is found which is not at the beginning, it is not 
 copied into word[] and a boolean variable becomes true
 - if a letter-character is found when the variable is true, null is returned; 
 otherwise, word[] is used to find hyphenation points
 
 I have also added a little optimization: if, after the normalization and the 
 non-letter character removal, the word size is less than (remainCharCount + 
 pushCharCount), null is returned, without checking the exception list and 
 performing the algorithm.
 
 I'm going to attach the proposed patch and a test fo file which shows a few 
 examples.
 
 Regards
 
 Luca

-- 
Simon Pepping
home page: http://www.leverkruid.nl



DO NOT REPLY [Bug 27773] - [PATCH] Hyphenation

2004-04-14 Thread bugzilla

--- Additional Comments From [EMAIL PROTECTED]  2004-04-14 19:00 ---
Yes, I was quite surprised to see that all the information stored in the 
BreakPoss was thrown away before adding areas; I chose to duplicate the 
needed values because this involved fewer and less radical changes.

I have found another small bug concerning hyphenation in the 
HyphenationTree.hyphenate() method.
Before checking the exception list or using the algorithm, the 
function normalizes the word: during this phase, if a non-letter character 
is found, null is returned.

    // normalize word
    char[] c = new char[2];
    for (i = 1; i <= len; i++) {
        c[0] = w[offset + i - 1];
        int nc = classmap.find(c, 0);
        if (nc < 0) {    // found a non-letter character, abort
            return null;
        }
        word[i] = (char)nc;
    }

I think the condition (nc < 0) is too strong: at the moment words followed by 
punctuation marks, or in parentheses, are not hyphenated.

This is how I tried to fix this problem:
- non-letter characters at the beginning are not copied into word[]
- if a non-letter character is found which is not at the beginning, it is not 
copied into word[] and a boolean variable becomes true
- if a letter-character is found when the variable is true, null is returned; 
otherwise, word[] is used to find hyphenation points

I have also added a little optimization: if, after the normalization and the 
non-letter character removal, the word size is less than (remainCharCount + 
pushCharCount), null is returned, without checking the exception list and 
performing the algorithm.
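
The three rules and the length check above can be sketched roughly as follows. This is only an illustration under stated assumptions, not the attached patch: classOf() is a hypothetical stand-in for classmap.find(), the class and method names are invented, and the result is filled from index 0 rather than from index 1 as in the original loop.

```java
import java.util.Arrays;

/** Illustrative sketch only; all names here are hypothetical. */
final class NormalizeSketch {

    /** Stand-in for classmap.find(): a lower-case letter code, or -1 for a non-letter. */
    static int classOf(char ch) {
        return Character.isLetter(ch) ? Character.toLowerCase(ch) : -1;
    }

    /**
     * Returns the normalized letters of w[offset .. offset + len - 1], ignoring
     * leading and trailing non-letters, or null if the word is not hyphenated.
     */
    static char[] normalize(char[] w, int offset, int len,
                            int remainCharCount, int pushCharCount) {
        char[] word = new char[len];
        int count = 0;                      // letters copied so far
        boolean nonLetterAfterLetters = false;
        for (int i = 0; i < len; i++) {
            int nc = classOf(w[offset + i]);
            if (nc < 0) {
                // Leading non-letters are simply skipped; later ones are remembered.
                if (count > 0) {
                    nonLetterAfterLetters = true;
                }
            } else {
                if (nonLetterAfterLetters) {
                    return null;            // a letter after an embedded non-letter
                }
                word[count++] = (char) nc;
            }
        }
        if (count < remainCharCount + pushCharCount) {
            return null;                    // too short to hyphenate at all
        }
        return Arrays.copyOf(word, count);
    }
}
```

With this sketch, "(hyphenation)," normalizes to "hyphenation", while a word with an embedded non-letter such as "don't" still yields null.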

I'm going to attach the proposed patch and a test FO file which shows a few 
examples.

Regards

Luca


DO NOT REPLY [Bug 27773] - [PATCH] Hyphenation

2004-04-14 Thread bugzilla

--- Additional Comments From [EMAIL PROTECTED]  2004-04-14 19:03 ---
Created an attachment (id=11238)
test fo file: words with punctuation marks and parenthesis


DO NOT REPLY [Bug 27773] - [PATCH] Hyphenation

2004-04-09 Thread bugzilla

--- Additional Comments From [EMAIL PROTECTED]  2004-04-09 09:28 ---
Hi Luca,

Your patch looks good. It works fine on your test fo, and it also
works fine on a test fo I had lying around.

There are a few things to note:

AreaInfo.bHyphenated is identical to (flags & BreakPoss.HYPHENATED);
ai.ipdArea = MinOptMax.add(ai.ipdArea, new MinOptMax(hyphIPD)) is
equivalent to bp.setStackingSize(MinOptMax.add(ipd, new
MinOptMax(hyphIPD))). In other words, the same information is present
in the BreakPoss, which is held by the LineLM. Unfortunately, this
information is not accessible in the addAreas method of TextLM. This
is so because posIter.next() returns the position member of the
BreakPoss, not the BreakPoss itself. I believe that is an error in
PositionIterator which eventually needs correction. For now your
solution is a good one.
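
The point about the iterator can be illustrated with a small model. This is a sketch under assumptions, not FOP code: BreakPossSketch, its integer position, and the bit value chosen for HYPHENATED are all hypothetical. It only shows why a consumer that receives positions alone cannot test the flag and has to keep its own copy.

```java
import java.util.Iterator;
import java.util.List;

/** Illustrative model only; names and the flag value are hypothetical. */
final class BreakPossSketch {
    static final int HYPHENATED = 0x02;   // assumed bit value

    final int position;   // stand-in for the Position object
    final int flags;

    BreakPossSketch(int position, int flags) {
        this.position = position;
        this.flags = flags;
    }

    /** The test that bHyphenated duplicates. */
    boolean isHyphenated() {
        return (flags & HYPHENATED) != 0;
    }

    /** Mimics an iterator that hands out only the position of each break. */
    static Iterator<Integer> positions(List<BreakPossSketch> breaks) {
        final Iterator<BreakPossSketch> it = breaks.iterator();
        return new Iterator<Integer>() {
            public boolean hasNext() {
                return it.hasNext();
            }
            public Integer next() {
                return it.next().position;   // the flags are dropped here
            }
            public void remove() {
                throw new UnsupportedOperationException();
            }
        };
    }
}
```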

Your proposed TextInfo.hypChar is also known as
LineLM.hyphProps.hyphenationChar. This information is not passed on to
childLMs and thus not known to TextLM. Storing it in TextInfo seems a
good idea; perhaps it is not even needed in hyphProps.

It is not completely clear that 

+        if (ai.bHyphenated) {
+            str += foText.textInfo.hyphChar;
+            ai.ipdArea = MinOptMax.add(ai.ipdArea, new MinOptMax(hyphIPD));
+        }

is always correct. This happens on the last AI returned by
posIter.next(), which is not necessarily the last one in the line. But
it is hard to conceive of an AI which is last in its LM and also
hyphenated; therefore it will work. Ideally, the LineLM indicates
which BP is the one ending the line; perhaps this needs correction at
some time.

Regards, Simon


DO NOT REPLY [Bug 27773] - [PATCH] Hyphenation

2004-04-05 Thread bugzilla

--- Additional Comments From [EMAIL PROTECTED]  2004-04-05 09:22 ---
Hi all

I apologize for having posted some proposed changes without a line explaining 
what I was changing and why.
Really sorry, I'm going to try and do it now.

First of all, the condition now tested in the 
LineLayoutManager.getNextBreakPoss() to decide whether or not to try 
hyphenation is:
  if (bTextAlignment == TextAlign.JUSTIFY || prevBP == null) {
  ...
  }
So hyphenation is tried for *every* justified block of text; only if 
the language attribute is not set does the getHyphenContext() method return 
null, so that hyphenation is not actually done.
My suggestion is to change it so that the fo property hyphenate (7.9.4 in the 
recommendation, and already implemented in the CommonHyphenation class) is 
tested instead:
  if (hyphProps.hyphenate == Constants.TRUE) {
  ...
  }
In the TextInfo class there is a boolean attribute called bCanHyphenate, but 
it is given a default value of true and is never modified afterwards. It seems 
to me that it is quite useless.
It is used in the TextLayoutManager.getHyphenIPD() method:
  if (textArray.length < iStopIndex || foText.textInfo.bCanHyphenate == false) {
  ...
  }
but this method is called only if context.tryHyphenate() is true; with my 
suggested change, context.tryHyphenate() returns true only if the fo property 
hyphenate is true, and so it is no longer necessary to test 
foText.textInfo.bCanHyphenate.
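
The difference between the two conditions can be seen in a small stand-alone sketch. The constants below are hypothetical stand-ins for TextAlign.JUSTIFY and Constants.TRUE, and the class is invented for illustration; only the two boolean expressions come from the message above.

```java
/** Illustrative sketch only; constants are hypothetical stand-ins. */
final class TryHyphenateSketch {
    static final int JUSTIFY = 1;   // stand-in for TextAlign.JUSTIFY
    static final int START = 2;     // some other alignment
    static final int TRUE = 1;      // stand-in for Constants.TRUE
    static final int FALSE = 0;

    /** Condition currently tested before trying hyphenation. */
    static boolean oldCondition(int textAlignment, Object prevBP) {
        return textAlignment == JUSTIFY || prevBP == null;
    }

    /** Proposed condition: follow the fo property hyphenate. */
    static boolean newCondition(int hyphenate) {
        return hyphenate == TRUE;
    }
}
```

A justified block with hyphenate set to false passes the old test but not the new one; a start-aligned block with hyphenate set to true and a non-null prevBP fails the old test but passes the new one.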

The second problem is that the hyphenation character is not shown: to fix this 
I added in the AreaInfo class (which stores information about the area that 
will be generated) a boolean attribute bHyphenated.
It is set according to the value of the flags of the corresponding BreakPoss, 
and it is tested in the TextLayoutManager.addAreas() method to decide whether 
to add the hyphen.

Last (and least): I tried to implement the fo property hyphenation-character 
(7.9.5 in the recommendation).
As it applies to blocks, I decided to store it as a character variable in the 
TextInfo class, with initial value '-'.
It is set in the PropertyManager.getTextLayoutProps() method, together with 
the other TextInfo attributes.
It is used to calculate the hyphIPD in the TextLayoutManager constructor, and 
in the addAreas method.
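
How the flag and the hyphenation character combine when an area is built can be sketched as below. This is a simplified illustration, not the patch itself: the class, the method names, and the fixed width of 500 units per character are assumptions standing in for the real font metrics and MinOptMax arithmetic.

```java
/** Illustrative sketch only; names and the fixed character width are hypothetical. */
final class HyphenAreaSketch {

    /** Stand-in for a font metric: every character is 500 units wide. */
    static int charWidth(char c) {
        return 500;
    }

    /** Text of the area, with the hyphenation character appended when needed. */
    static String areaText(String text, boolean hyphenated, char hyphChar) {
        return hyphenated ? text + hyphChar : text;
    }

    /** Width of the area, including the hyphen width when needed. */
    static int areaWidth(String text, boolean hyphenated, char hyphChar) {
        int ipd = 0;
        for (int i = 0; i < text.length(); i++) {
            ipd += charWidth(text.charAt(i));
        }
        if (hyphenated) {
            ipd += charWidth(hyphChar);   // plays the role of hyphIPD
        }
        return ipd;
    }
}
```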

Bye
Luca


DO NOT REPLY [Bug 27773] - [PATCH] Hyphenation

2004-04-05 Thread bugzilla

--- Additional Comments From [EMAIL PROTECTED]  2004-04-05 16:40 ---
Sorry, I'm here again (hope I am not making too much of a mess!) :-)

I'm going to attach an updated proposed patch, as I noticed there was an error 
in the one I posted before.
In the TextLayoutManager.addAreas() method, the hyphen size must be added to 
the total inline progression dimension of the line *before* the adjustment 
calculation; this involves also moving a few lines of code.
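
Why the order matters can be shown with a deliberately simplified model of justification, in which the free space of a line is shared equally among its word spaces. The real adjustment works on MinOptMax ranges, so the class, method names, and numbers below are assumptions chosen only to make the arithmetic visible.

```java
/** Illustrative arithmetic only; a simplified model with hypothetical names. */
final class AdjustSketch {

    /** Extra width given to each word space so that the content fills the line. */
    static double spaceAdjust(int lineWidth, int contentWidth, int spaces) {
        return spaces == 0 ? 0.0 : (double) (lineWidth - contentWidth) / spaces;
    }

    /** Final width of the line once every space has been stretched. */
    static double finalWidth(int contentWidth, int spaces, double adjust) {
        return contentWidth + spaces * adjust;
    }
}
```

Take a line of 10000 units with 9000 units of text, a 300-unit hyphen and 4 spaces. If the hyphen is counted first, each space grows by 175 units and the line is exactly 10000 wide. If the adjustment is computed from the 9000 units alone, each space grows by 250 units, and adding the hyphen afterwards gives 10300 units, which overflows the line.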

Bye
Luca


DO NOT REPLY [Bug 27773] - [PATCH] Hyphenation

2004-04-05 Thread bugzilla

--- Additional Comments From [EMAIL PROTECTED]  2004-04-05 16:41 ---
Created an attachment (id=11138)
updated diff


DO NOT REPLY [Bug 27773] - [PATCH] Hyphenation

2004-03-19 Thread bugzilla

--- Additional Comments From [EMAIL PROTECTED]  2004-03-19 11:38 ---
Created an attachment (id=10868)
patch created using cvs diff


DO NOT REPLY [Bug 27773] - [PATCH] Hyphenation

2004-03-19 Thread bugzilla

--- Additional Comments From [EMAIL PROTECTED]  2004-03-19 11:39 ---
Created an attachment (id=10869)
FO sample file to test the behavior of fop before and after the patch