Re: DO NOT REPLY [Bug 29124] - New line breaking algorithm

2004-09-13 Thread Luca Furini
Simon Pepping wrote:

> Still to be done:
>
> - Resolve the regressions mentioned above.

As for leaders with leader-pattern = use-content, I have created a patch and
tested it successfully.
The ContentLM calls getNextKnuthElements on its child InlineStackingLM, uses
the returned elements to calculate the pattern width and returns them to the
LeaderLM. The LeaderLM uses them when calling addAreas.

I also found a bug affecting leaders with leader-pattern = dots: the
TextArea with the dot (created in LeaderLM.getLeaderInlineArea) had width =
0; calling setWidth() fixes this problem.
There is still a small difference between a leader with leader-pattern =
dots and one with use-content and a single dot as content: the former is
placed slightly above the baseline, but I couldn't find the reason.

Note that if you use the fo file xml-fop/examples/fo/basic/leader.fo to test
the patch, you won't see the leaders with leader-pattern = use-content, as
they don't have a width property and the default .opt value (12pt) is less
than the pattern width. Setting a larger width, or text-align-last = justify,
makes the leaders visible.

> - I support the idea to create an InlineLayoutManager interface, which
>   extends LayoutManager.

Done, same patch (or maybe I should create a different one?). I also
removed the getWordSpaceIPD() method, as I found out that a constant value
works better: the LineLM and its children must use the same value, or the
result is not always correct.

> 1.
> Can we be sure that U+A is always alone or the first item in a
> textArray; does this not depend on the Parser, how it calls the SAX
> characters method?

Right, it's better to handle the most general case.
The patch will fix this too.

I will try to fix the other points reported by Simon as soon as possible.

Regards
Luca





DO NOT REPLY [Bug 29124] - New line breaking algorithm

2004-09-05 Thread bugzilla
DO NOT REPLY TO THIS EMAIL, BUT PLEASE POST YOUR BUG 
RELATED COMMENTS THROUGH THE WEB INTERFACE AVAILABLE AT
http://issues.apache.org/bugzilla/show_bug.cgi?id=29124.
ANY REPLY MADE TO THIS MESSAGE WILL NOT BE COLLECTED AND 
INSERTED IN THE BUG DATABASE.

http://issues.apache.org/bugzilla/show_bug.cgi?id=29124

New line breaking algorithm

[EMAIL PROTECTED] changed:

   What|Removed |Added

 Status|NEW |RESOLVED
 Resolution||FIXED



--- Additional Comments From [EMAIL PROTECTED]  2004-09-05 18:21 ---
Luca,

Patch applied. Thanks for this innovative and extensive contribution.

Still to be done:

- Resolve the regressions mentioned above.
- I support the idea to create an InlineLayoutManager interface, which
  extends LayoutManager.
- This patch has made a lot of existing code redundant. Much of that
  code is still present. To keep the code clean and intelligible, the
  redundant pieces should be removed at some time by somebody.

I have added a space after casts. See the style guidelines in the file
dev/conventions.html of the FOP web site.

I have a few remarks about the code. I leave it to you to follow these
up or not, but I would like to see point 1 addressed:

1. In TextLM:

  // linefeed; this can happen when linefeed-treatment=preserve
  // the linefeed character is the first one in textArray,
  // so we can just return a list with a penalty item

In LineLM:

  if (returnedList.size() > 1
      || !(thisElement.isPenalty()
           && ((KnuthPenalty)thisElement).getP() == -KnuthElement.INFINITE)) {
  } else {
      // a list with a single penalty item whose value is -inf
      // represents a preserved linefeed, which forces a line break

Can we be sure that U+A is always alone or the first item in a
textArray; does this not depend on the Parser, how it calls the SAX
characters method?

2. In InlineStackingLM.applyChanges: Falling over the end of
oldListIterator can be done better: treat only currLM != prevLM in the
loop, treat !oldListIterator.hasNext() after the loop. Same for
getChangedKnuthElements? :

  while (oldListIterator.hasNext()) {
      oldElement = (KnuthElement)oldListIterator.next();
      currLM = oldElement.getLayoutManager();
      // initialize prevLM
      if (prevLM == null) {
          prevLM = currLM;
      }

      if (currLM != prevLM) {
          bSomethingChanged = prevLM.applyChanges(oldList.subList(fromIndex,
                  oldListIterator.previousIndex()))
              || bSomethingChanged;
          prevLM = currLM;
          fromIndex = oldListIterator.previousIndex();
      }
  }
  bSomethingChanged = currLM.applyChanges(oldList.subList(fromIndex, oldList.size()))
      || bSomethingChanged;

Possible cases, after the loop:
xxyy or yy, xx done
prevLM = currLM = y
fromIndex = last done (2 and 0)
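
The loop suggested in point 2 can be tried out in isolation. What follows is only a minimal sketch under stated assumptions, not FOP code: Owner, Element and RunSplitter are hypothetical stand-ins for a layout manager, a KnuthElement and InlineStackingLM, and the stand-in applyChanges merely records the size of the sublist it receives. The null check after the loop is an addition for the empty-list case.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.ListIterator;

// Hypothetical stand-in for a layout manager: it records the size of every
// sublist it is handed and reports a change when the sublist is non-empty.
class Owner {
    final String name;
    final List<Integer> received = new ArrayList<>();

    Owner(String name) {
        this.name = name;
    }

    boolean applyChanges(List<Element> elements) {
        received.add(elements.size());
        return !elements.isEmpty();
    }
}

// Hypothetical stand-in for a KnuthElement that knows which LM created it.
class Element {
    final Owner owner;

    Element(Owner owner) {
        this.owner = owner;
    }
}

class RunSplitter {
    // Hands each maximal run of consecutive elements to the owner that created it.
    static boolean applyChanges(List<Element> oldList) {
        boolean somethingChanged = false;
        Owner prev = null;
        Owner curr = null;
        int fromIndex = 0;
        ListIterator<Element> it = oldList.listIterator();
        while (it.hasNext()) {
            curr = it.next().owner;
            if (prev == null) {
                prev = curr;
            }
            if (curr != prev) {
                // previousIndex() is the index of the element just returned,
                // which is the first element of the new run.
                somethingChanged =
                        prev.applyChanges(oldList.subList(fromIndex, it.previousIndex()))
                        || somethingChanged;
                prev = curr;
                fromIndex = it.previousIndex();
            }
        }
        // The last run is handled once, after the loop (skipped for an empty list).
        if (curr != null) {
            somethingChanged =
                    curr.applyChanges(oldList.subList(fromIndex, oldList.size()))
                    || somethingChanged;
        }
        return somethingChanged;
    }
}
```

For the "xxyy" case, the x owner receives one sublist of two elements inside the loop and the y owner receives its two elements after the loop; for "yy" only the call after the loop runs.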

3. In InlineStackingLM: Unnecessary differences between treatment of
returnedList and returnList in getNextKnuthElements and
getChangedKnuthElements. In getChangedKnuthElements it is not
necessary to have a separate returnedList and returnList.

4. Break up long methods in LineLM: findHyphenationPoints,
getNextBreakPoss, considerLegalBreak (?), findBreakingPoints (?).

Regards, Simon


[Fwd: DO NOT REPLY [Bug 29124] - New line breaking algorithm]

2004-09-01 Thread Chris Bowditch
> --- Additional Comments From [EMAIL PROTECTED]  2004-08-31 18:44 ---
> Thanks for the new patch. I could apply it without problems, and
> testing it goes well.
> You mention that you have not implemented the Knuth algorithm for
> ContentLM. Would it be difficult to do that?
> FOP team,
> If I would apply this patch, we would get the following regressions:
> - ContentLM does not show its content. A leader with
>   leader-pattern=use-content results in a blank area of the right
>   size.

Doesn't sound like it will be difficult to fix after the patch is applied.

> - When for an exceptionally difficult paragraph no set of breaking
>   points can be found, the whole paragraph is printed on a single
>   line. This occurs, for example, when in a narrow typesetting width
>   only a single word or a part of it fits in a line.

I would think that strange effects like this are possible today. Can you see
what the output would look like in such a scenario with the current code?

> I am working towards applying this patch despite these
> regressions, for these reasons:
> - This patch is a good piece of work, and a step forward for FOP's
>   layout.

Agreed.

> - It becomes increasingly hard to maintain this patch outside of CVS.

I know how you feel. I found it hard work before when I examined Luca's
earlier Knuth patch.

Chris




DO NOT REPLY [Bug 29124] - New line breaking algorithm

2004-09-01 Thread bugzilla

--- Additional Comments From [EMAIL PROTECTED]  2004-09-02 04:23 ---
> I am working towards applying this patch despite these
> regressions, for these reasons:
>
> - This patch is a good piece of work, and a step forward for FOP's
>   layout.
>
> - It becomes increasingly hard to maintain this patch outside of CVS.
>
> Please, speak up if you are against this.

Simon, I have not had the time to be following this issue much so will be
deferring to your judgment.

Thanks,
Glen


DO NOT REPLY [Bug 29124] - New line breaking algorithm

2004-08-31 Thread bugzilla

--- Additional Comments From [EMAIL PROTECTED]  2004-08-31 18:44 ---
Luca,

Thanks for the new patch. I could apply it without problems, and
testing it goes well.

You mention that you have not implemented the Knuth algorithm for
ContentLM. Would it be difficult to do that?

FOP team,

If I would apply this patch, we would get the following regressions:

- ContentLM does not show its content. A leader with
  leader-pattern=use-content results in a blank area of the right
  size.

- When for an exceptionally difficult paragraph no set of breaking
  points can be found, the whole paragraph is printed on a single
  line. This occurs, for example, when in a narrow typesetting width
  only a single word or a part of it fits in a line.

I am working towards applying this patch despite these
regressions, for these reasons:

- This patch is a good piece of work, and a step forward for FOP's
  layout.

- It becomes increasingly hard to maintain this patch outside of CVS.

Please, speak up if you are against this.

Simon


DO NOT REPLY [Bug 29124] - New line breaking algorithm

2004-08-28 Thread bugzilla

--- Additional Comments From [EMAIL PROTECTED]  2004-08-28 10:06 ---
Sorry again, I didn't notice that cvs error.

I'm going to attach the right (?) patch :-)
(for some strange reason wincvs' diff shows Simon's latest changes too, I
hope this is not a problem ...)

Luca


DO NOT REPLY [Bug 29124] - New line breaking algorithm

2004-08-28 Thread bugzilla

--- Additional Comments From [EMAIL PROTECTED]  2004-08-28 10:09 ---
Created an attachment (id=12560)
patch to existing files (version 7.1)


DO NOT REPLY [Bug 29124] - New line breaking algorithm

2004-08-27 Thread bugzilla

--- Additional Comments From [EMAIL PROTECTED]  2004-08-27 19:29 ---
Luca,

The build failed. That is probably due to this line in the patch:

cvs server: I know nothing about layoutmgr/LeafNodeLaoyutManager.java

Simon


Re: DO NOT REPLY [Bug 29124] - New line breaking algorithm

2004-08-24 Thread Luca Furini

Simon Pepping wrote:

> Nested inline and other LMs: The output contains errors, see the
> comments in the text. The errors occur when hyphenation is set to
> true.

Fixed: there were errors in the method addALetterSpaceTo of LeafNodeLM and
InlineStackingLM.

I also found a bug in the LeafNodeLM.addAreas method, affecting HEAD too:
the area is added to the area tree (with parentLM.addChild(curArea))
*before* widthAdjustArea is called, so its width is not correctly added to
the inline parent width and the output sometimes shows overlapped text
(when there is another child of inline parent after the leader area).
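
The ordering problem can be illustrated with a small sketch. This is not the FOP code: Area, InlineParent and AddAreasOrder are hypothetical names, and the sketch assumes that the parent accumulates a child's width at the moment the child is added.

```java
// Hypothetical, simplified area with a mutable width.
class Area {
    int width;

    Area(int width) {
        this.width = width;
    }
}

// Hypothetical inline parent that sums child widths when a child is added.
class InlineParent {
    int width;

    void addChild(Area child) {
        width += child.width;
    }
}

class AddAreasOrder {
    // Order described as buggy: the child is added before its width is adjusted,
    // so the parent never sees the adjustment.
    static int addThenAdjust(int width, int adjustment) {
        InlineParent parent = new InlineParent();
        Area area = new Area(width);
        parent.addChild(area);
        area.width += adjustment;
        return parent.width;
    }

    // Corrected order: adjust first, then add.
    static int adjustThenAdd(int width, int adjustment) {
        InlineParent parent = new InlineParent();
        Area area = new Area(width);
        area.width += adjustment;
        parent.addChild(area);
        return parent.width;
    }
}
```

With a width of 100 and an adjustment of 20, the first order leaves the parent at 100 while the second yields 120; a following sibling placed at the parent's recorded width would overlap in the first case.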

> Justification: This is a test fo you submitted earlier. According to
> the text in the file the second block should be hyphenated; it is
> not. Should it still be hyphenated, or can this not be enforced with
> the Knuth algorithm and text-align=start?

I cannot find a hyphenate property in the fo file you attached, so I'm not
sure whether I understand what you mean.
Anyway, hyphenate = true means, according to the recommendation (7.9.4),
that hyphenation may be used in the line-breaking algorithm, not that it
*must* be used.
As hyphenation is time-consuming and makes the output look worse, I think it
should be used only if necessary.

> No breakpoints: An exception is thrown, at
> LineLayoutManager.getNextBreakPoss(LineLayoutManager.java:495). It
> occurs because breakpoints has size 0; the third call to
> findBreakingPoints also returned 0. This should not be possible; the
> algorithm should always return a breakpoint.

Right, I completely forgot to provide a fallback in case the algorithm
doesn't find a good set of breaking points.
I added a boolean argument called force to findBreakingPoints: if it is
true, and after the main loop there are no active nodes, the last
deactivated node is used to create LineBreakPositions.
There will be zero or more good lines followed by a single line including
all the remaining content (this line will obviously run past the right
margin).

The method findBreakingPoints will be called no more than three times:
I) no hyphenation, adjustment ratios must be <= 1
II) hyphenation (if allowed), or ratios up to 5
III) ratios up to 20, and if necessary force the creation of LineBreakPositions
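
The three attempts can be sketched as follows. This is a hedged illustration, not the code in the patch: BreakAttempt and ThreePassBreaker are hypothetical names, the signature of findBreakingPoints is simplified, and step II is read as "enable hyphenation if it is allowed, otherwise raise the ratio limit to 5", which is an interpretation of the list above.

```java
// Hypothetical abstraction of a single attempt of the line breaker. It returns
// the number of lines found, or 0 when no set of breaking points was found.
interface BreakAttempt {
    int findBreakingPoints(double maxAdjustment, boolean hyphenated, boolean force);
}

class ThreePassBreaker {
    // Runs at most three attempts and stops at the first one that succeeds.
    static int breakParagraph(BreakAttempt attempt, boolean hyphenationAllowed) {
        double maxAdjustment = 1.0;
        boolean hyphenated = false;

        // I) no hyphenation, adjustment ratios up to 1
        int lines = attempt.findBreakingPoints(maxAdjustment, hyphenated, false);
        if (lines == 0) {
            // II) hyphenation if allowed, otherwise ratios up to 5
            if (hyphenationAllowed) {
                hyphenated = true;
            } else {
                maxAdjustment = 5.0;
            }
            lines = attempt.findBreakingPoints(maxAdjustment, hyphenated, false);
        }
        if (lines == 0) {
            // III) ratios up to 20, forcing a result if necessary
            maxAdjustment = 20.0;
            lines = attempt.findBreakingPoints(maxAdjustment, hyphenated, true);
        }
        return lines;
    }
}
```

Because only the last attempt passes force = true, a paragraph that can be broken normally is never affected by the fallback.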

> A few small remarks:
>
> Can you move the following log messages to trace log level:
> [DEBUG] AbstractLayoutManager - - Word to hyphenate: We

Done

> In TextLM, returning null for a forced LF is not an idea that I like,
> because it overloads the null return value. Can you not return a
> special Knuth element for LF? Alternatively, you could return null and
> process the paragraph. The second paragraph would then be produced and
> processed later.

A preserved linefeed can be represented by a penalty item whose value is
-infinite: +inf means that there can't be a break here, -inf means that
there must be a break (as there can't be a better breakpoint).
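
As an illustration of this convention, here is a minimal sketch, not the Knuth*.java classes from the patch: Elem and PreservedLinefeeds are hypothetical, boxes simply carry their text, and the value 1000 for INFINITE is an assumption made for the example.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical, much simplified element: either a box holding text or a penalty.
class Elem {
    static final int INFINITE = 1000; // assumed sentinel value

    final String text; // non-null for a box, null for a penalty
    final int penalty; // meaningful only for a penalty

    private Elem(String text, int penalty) {
        this.text = text;
        this.penalty = penalty;
    }

    static Elem box(String text) {
        return new Elem(text, 0);
    }

    static Elem penalty(int value) {
        return new Elem(null, value);
    }

    // -INFINITE means a break must happen here; +INFINITE would forbid one.
    boolean isForcedBreak() {
        return text == null && penalty <= -INFINITE;
    }
}

class PreservedLinefeeds {
    // Turns text into elements; a linefeed becomes a -INFINITE penalty
    // wherever it occurs in the text, not only at the first position.
    static List<Elem> toElements(String s) {
        List<Elem> out = new ArrayList<>();
        StringBuilder run = new StringBuilder();
        for (char c : s.toCharArray()) {
            if (c == '\n') {
                if (run.length() > 0) {
                    out.add(Elem.box(run.toString()));
                    run.setLength(0);
                }
                out.add(Elem.penalty(-Elem.INFINITE));
            } else {
                run.append(c);
            }
        }
        if (run.length() > 0) {
            out.add(Elem.box(run.toString()));
        }
        return out;
    }

    // Splits the element list into lines at every forced break.
    static List<String> lines(List<Elem> elems) {
        List<String> lines = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        for (Elem e : elems) {
            if (e.isForcedBreak()) {
                lines.add(current.toString());
                current.setLength(0);
            } else if (e.text != null) {
                current.append(e.text);
            }
        }
        lines.add(current.toString());
        return lines;
    }
}
```

Using an ordinary element for the linefeed keeps null free to mean only "no more elements", which is the concern raised in the quoted remark.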

Preserved linefeeds inside inlines are much more problematic than I
first thought, but they should work now: I had to add a List argument to
the applyChanges() and getChangedKnuthElements() methods, to tell an ISLM
which children it has to consider.

> InlineStackingLM.getNextKnuthElements: 'if (lc.startsNewArea())' no
> longer used?

I tried to preserve the existing code as much as possible, so I didn't
touch that if statement.
Maybe I removed some lines in the LineLM so that lc.startsNewArea is never
true?

Regards,
Luca




DO NOT REPLY [Bug 29124] - New line breaking algorithm

2004-08-24 Thread bugzilla

--- Additional Comments From [EMAIL PROTECTED]  2004-08-24 15:24 ---
I'm going to attach still another version of the patch :-), corrected 
according to Simon's comments.

The new files (Knuth*.java) are unchanged, so I don't attach them.

Regards, 
Luca


DO NOT REPLY [Bug 29124] - New line breaking algorithm

2004-08-24 Thread bugzilla

--- Additional Comments From [EMAIL PROTECTED]  2004-08-24 15:25 ---
Created an attachment (id=12518)
patch to existing files (sixth edition)


DO NOT REPLY [Bug 29124] - New line breaking algorithm

2004-08-19 Thread bugzilla

--- Additional Comments From [EMAIL PROTECTED]  2004-08-19 16:36 ---
Created an attachment (id=12488)
patch - existing files (fifth edition)


DO NOT REPLY [Bug 29124] - New line breaking algorithm

2004-08-19 Thread bugzilla

--- Additional Comments From [EMAIL PROTECTED]  2004-08-19 19:28 ---
The last patches do not apply without errors:
- The last lines of the patch files do not end with a newline.
- Diff of area/inline/TextArea.java is incomplete.
- Compile error in render/xml/XMLRenderer.java.
I did manage to fix all problems.


DO NOT REPLY [Bug 29124] - New line breaking algorithm

2004-08-19 Thread bugzilla

--- Additional Comments From [EMAIL PROTECTED]  2004-08-19 19:32 ---
Created an attachment (id=12492)
test fo: Nested inline and other LMs


DO NOT REPLY [Bug 29124] - New line breaking algorithm

2004-08-19 Thread bugzilla

--- Additional Comments From [EMAIL PROTECTED]  2004-08-19 19:32 ---
Created an attachment (id=12493)
test fo: Justification


DO NOT REPLY [Bug 29124] - New line breaking algorithm

2004-08-19 Thread bugzilla

--- Additional Comments From [EMAIL PROTECTED]  2004-08-19 19:33 ---
Created an attachment (id=12494)
test fo: No breakpoints


DO NOT REPLY [Bug 29124] - New line breaking algorithm

2004-08-17 Thread bugzilla

--- Additional Comments From [EMAIL PROTECTED]  2004-08-17 13:51 ---
Oops, sorry again, Simon!

In the code I used when creating the last patch there was an error affecting
the TextLayoutManager.getChangedKnuthElements() method: two missing break
statements inside a switch.
Due to this error, the sequence of elements generated for each space (when
text-align is center, start or end) was wrong, and some text disappeared (I
even got IndexOutOfBounds exceptions).
Inserting these breaks is enough to make everything work:

...
// ai refers to a space
switch (alignment) {
case CENTER : ...
  iReturnedIndex ++;
  break; /* this was missing */
case START  : // fall through
case END: ...
  iReturnedIndex ++;
  break; /* this was missing */
case JUSTIFY: ...
}
...
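
The effect of the missing breaks can be reproduced with a small standalone sketch. It is only an illustration of Java switch fall-through: FallThroughDemo and its constants are hypothetical, and a simple counter stands in for the elements generated per space.

```java
class FallThroughDemo {
    // Hypothetical alignment constants used only for this illustration.
    static final int CENTER = 0;
    static final int START = 1;
    static final int END = 2;
    static final int JUSTIFY = 3;

    // Buggy shape: without break statements, control falls through into
    // the following cases and the counter is incremented more than once.
    static int withoutBreaks(int alignment) {
        int count = 0;
        switch (alignment) {
        case CENTER:
            count++;
        case START: // fall through
        case END:
            count++;
        case JUSTIFY:
            count++;
        default:
        }
        return count;
    }

    // Fixed shape: each case ends with a break, so exactly one branch runs.
    static int withBreaks(int alignment) {
        int count = 0;
        switch (alignment) {
        case CENTER:
            count++;
            break;
        case START: // fall through
        case END:
            count++;
            break;
        case JUSTIFY:
            count++;
            break;
        default:
            break;
        }
        return count;
    }
}
```

In the buggy variant CENTER runs three branches and START or END run two, while JUSTIFY is unaffected, which matches the report that only center, start and end alignment misbehaved.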

As you can see, in the last patch I changed the return type of
getNextKnuthElements() and getChangedKnuthElements(), so they now return a
sequence of elements instead of a single one.
This may reduce the similarities between getNextKnuthElements() and
getNextBreakPoss(), but I think it makes the code simpler and easier to
understand. Maybe it would be even better to make them return the whole
sequence, so that these methods are called once per LM.

Now I'm working on the newly-created LMs, so the next patch (which I think
will be ready tomorrow) will apply to the latest code version and will
include Finn Bock's changes.

Regards,
Luca


DO NOT REPLY [Bug 29124] - New line breaking algorithm

2004-08-16 Thread bugzilla

--- Additional Comments From [EMAIL PROTECTED]  2004-08-16 19:13 ---
Hi Luca,

I have problems with the results for text-align=start, end and
center in your test FO file. The lines are too long, and the first
line ends with a space. For start and end a paragraph is
dropped. Herewith the area tree output for start/end and center:

  <block width="288000" ipd="288000" height="30800"
      props="border-start:(87,#00,1000);break-after:8;border-end:(87,#00,1000);border-after:(87,#00,1000);border-before:(87,#00,1000);break-before:8;">
    <lineArea height="14400">
      <text twsadjust="0" tlsadjust="0"
          props="font-size:12000;font-family:F1;color:#00;">Poche corte parole molto corte, in modo che tutto vada bene e non ci siano guai. Qualche </text>
    </lineArea>
    <lineArea height="14400">
      <text twsadjust="0" tlsadjust="0"
          props="font-size:12000;font-family:F1;color:#00;">tra parola per fare tre righe.</text>
    </lineArea>
  </block>

  <block width="288000" ipd="288000" height="30800"
      props="border-start:(87,#ff,1000);break-after:8;border-end:(87,#ff,1000);border-after:(87,#ff,1000);border-before:(87,#ff,1000);break-before:8;">
    <lineArea height="14400">
      <text twsadjust="0" tlsadjust="0"
          props="font-size:12000;font-family:F1;color:#00;">Poche corte parole molto corte, in modo che tutto vada bene e non </text>
    </lineArea>
    <lineArea height="14400">
      <text twsadjust="0" tlsadjust="0"
          props="font-size:12000;font-family:F1;color:#00;">ci siano guai. Qualche altra paro</text>
      <text twsadjust="0" tlsadjust="0"
          props="font-size:12000;font-family:F1;color:#00;">la per fare tre righe.</text>
    </lineArea>
  </block>

Regards, Simon


DO NOT REPLY [Bug 29124] - New line breaking algorithm

2004-08-12 Thread bugzilla

--- Additional Comments From [EMAIL PROTECTED]  2004-08-12 08:23 ---
I'm going to attach an updated patch, including HyphContext (which I forgot to
include in the previous versions, sorry) and a few changes to fix a couple of
bugs.

I used Linux's diff between the modified files and the original ones (updated
yesterday, 11 August); for some reason (maybe I used the wrong options)
wincvs's diff did not include new files and did not use the latest version of
the original files, so it reported lots of differences due to recent cvs
commits.

I hope I did not forget anything this time! :-)

Luca


DO NOT REPLY [Bug 29124] - New line breaking algorithm

2004-08-12 Thread bugzilla

--- Additional Comments From [EMAIL PROTECTED]  2004-08-12 08:25 ---
Created an attachment (id=12400)
patch file (fourth edition, including HyphContext and bug fixes)


DO NOT REPLY [Bug 29124] - New line breaking algorithm

2004-08-10 Thread bugzilla

--- Additional Comments From [EMAIL PROTECTED]  2004-08-10 19:28 ---
Luca,

Your patch applied cleanly to a checkout from CVS dated 1 August 2004,
but I get build errors. Could it be that differences to HyphContext
are not included in the patch?

[javac] src/java/org/apache/fop/layoutmgr/LineLayoutManager.java:1081: cannot resolve symbol
[javac] symbol  : constructor HyphContext (int[],int)
[javac] location: class org.apache.fop.layoutmgr.HyphContext
[javac] return new HyphContext(hyph.getHyphenationPoints(), sbChars.length());
[javac] ^
[javac] src/java/org/apache/fop/layoutmgr/TextLayoutManager.java:425: cannot resolve symbol
[javac] symbol  : method isWordEnd ()
[javac] location: class org.apache.fop.layoutmgr.HyphContext
[javac] newIPD.add(MinOptMax.multiply(letterSpaceIPD, (hc.isWordEnd()?
[javac] ^
[javac] src/java/org/apache/fop/layoutmgr/TextLayoutManager.java:437: cannot resolve symbol
[javac] symbol  : method isWordEnd ()
[javac] location: class org.apache.fop.layoutmgr.HyphContext
[javac] (short)0, (short)(hc.isWordEnd()? (iStopIndex - iStartIndex - 1): (iStopIndex - iStartIndex)),
[javac] ^
[javac] 3 errors

Simon


DO NOT REPLY [Bug 29124] - New line breaking algorithm

2004-08-05 Thread bugzilla

--- Additional Comments From [EMAIL PROTECTED]  2004-08-05 16:09 ---
I'm going to attach the corrected patch, sorry again!

Knuth's algorithm is described in the essay
  D. E. Knuth and M. F. Plass, Breaking paragraphs into lines
and I found it in the book
  D. E. Knuth, Digital typography, published by CSLI Publications
Unfortunately, I couldn't find any link to an on-line version of this essay.

As regards the names of the classes, they were chosen mainly so that the new
files can be spotted quickly among the others! So, it's not a problem for me
to change them :-)

Luca


DO NOT REPLY [Bug 29124] - New line breaking algorithm

2004-08-05 Thread bugzilla

--- Additional Comments From [EMAIL PROTECTED]  2004-08-05 16:11 ---
Created an attachment (id=12345)
patch file (third edition, minor oversight fixed)


DO NOT REPLY [Bug 29124] - New line breaking algorithm

2004-08-03 Thread bugzilla

--- Additional Comments From [EMAIL PROTECTED]  2004-08-03 12:41 ---

Hi all

At long last, I have finished the patch implementing Knuth's line breaking
algorithm; it took me longer than I expected, mainly because of a long
sequence of hardware and software troubles ... Murphy's laws are not
something to laugh at! :-)

I have worked on [Line, Text, InlineStacking, LeafNode]LM, so the algorithm 
should work well with any fo file containing text, leaders, characters, 
inlines and the other formatting objects handled by a LeafNodeLM (external 
graphics, pagenumbers and citations).

The general idea of Knuth's algorithm is:

  try to find breaking points without hyphenating words
  if this fails
    hyphenate all words
    try again

The "hyphenate all words" phase could be time-consuming, so this step reuses
as much of the information already known as possible and minimizes the
changes to the existing sequence of elements.
The old sequence is used to collect word fragments, and elements are replaced
only if the LM which created them has something to change.
So, "hyphenate all words" means:

  scan the old sequence once:
    collect word fragments
    hyphenate word
  scan the old sequence once more:
    if the LM which returned this element has changed something
      replace all elements returned by this LM
 
These are the new methods; at the moment, I added them to the
LayoutManager interface, but maybe it would be better to create a new
interface implemented only by LMs returning inline areas.

+ getNextKnuthElement()
This is used instead of getNextBreakPoss().
The next step (I have already started working on it) would be to make all
LMs use the same method.

+ addALetterSpaceTo()
The low-level LMs (TLMs, LNLMs) have only a partial view of the text, and
therefore cannot know the exact number of letter spaces, while the LineLM has
a full view.
If a TLM's text is "Tex", it can only assume it has 2 letter spaces; if the
following formatting object is a character "t", the LineLM tells the TLM to
add a letter space, as the "x" is not the last letter of the word.
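
The letter-space bookkeeping can be sketched as follows. This is a hypothetical illustration rather than the patch's implementation: TextFragment stands in for a low-level LM holding part of a word, and WordAssembler plays the role of the LineLM, which sees the whole word.

```java
import java.util.List;

// Hypothetical stand-in for a low-level LM that holds one fragment of a word.
class TextFragment {
    final String text;
    int letterSpaces;

    TextFragment(String text) {
        this.text = text;
        // With only a partial view, a fragment can count just the gaps
        // between its own letters: "Tex" has 2.
        this.letterSpaces = Math.max(0, text.length() - 1);
    }

    void addALetterSpace() {
        letterSpaces++;
    }
}

// Hypothetical stand-in for the LineLM, which knows how fragments join up.
class WordAssembler {
    // Every fragment except the last is followed by another letter of the
    // same word, so it gets one extra letter space. Returns the word total.
    static int joinWord(List<TextFragment> fragments) {
        int total = 0;
        for (int i = 0; i < fragments.size(); i++) {
            TextFragment fragment = fragments.get(i);
            if (i + 1 < fragments.size()) {
                fragment.addALetterSpace();
            }
            total += fragment.letterSpaces;
        }
        return total;
    }
}
```

For the fragments "Tex" and "t" the total is 3, the same number of letter spaces as in the unsplit word "Text".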

+ getWordChars()
This is not a new method, it just has different parameters; text is collected 
from fo:characters too.

+ hyphenate()
The TLM does not apply the changes to vecAreaInfo immediately, otherwise the
existing Position objects stored in the old sequence couldn't be used any
more. The LeafNodeLM returns a single area, so it can apply changes
immediately.

+ applyChanges()
This method tells the TLM to apply the changes to vecAreaInfo; every LM
returns true if something has changed and false otherwise, so the LLM knows
whether it has to replace the old elements or not.

+ getChangedKnuthElement()
This is used by the LLM to obtain the new elements.

+ getWordSpaceIPD()
This is used by the LLM to ask for the word space dimension; the LLM needs it 
to center text.

A few details to fix:

- word spacing and letter spacing are now fully implemented, they can both 
have MinOptMax values; but I am still thinking about how to differentiate a 
user-defined zero value from a default zero value ...

- Leaders with leader-pattern = rule or space work well; with dots the
space reserved is correct, but the dots don't fill it properly. Leaders with
leader-pattern=use-content don't work, as the ContentLayoutManager has at the
moment only a null implementation of the method getNextKnuthElement.
There is also a minor bug concerning (IMO) white space handling: if there is
white space both before and after the leader, the latter is removed, so
instead of
  word __ word
the output shows
  word __word

- with the other fo elements (fo:external-graphic, fo:page-number and
fo:page-number-citation) the LeafNodeLM behaves exactly the same way as with
the old code, i.e. a fo:page-number-citation generates a ?  .

- text-align-last is partially implemented; text-align-last = justify works 
only if text-align = justify too; this is because Knuth's algorithm doesn't 
provide for a different alignment for the last line.

I'm going to attach:
- the patch to existing files and new files
- a test fo file


DO NOT REPLY [Bug 29124] - New line breaking algorithm

2004-08-03 Thread bugzilla

--- Additional Comments From [EMAIL PROTECTED]  2004-08-03 12:45 ---
Created an attachment (id=12308)
patch (second edition)


DO NOT REPLY [Bug 29124] - New line breaking algorithm

2004-08-03 Thread bugzilla

--- Additional Comments From [EMAIL PROTECTED]  2004-08-03 12:45 ---
Created an attachment (id=12309)
test fo file


DO NOT REPLY [Bug 29124] - New line breaking algorithm

2004-08-03 Thread bugzilla

--- Additional Comments From [EMAIL PROTECTED]  2004-08-03 13:15 ---
I just realized that in the patch there is a line commented out, while it 
shouldn't be. 

It's in the LineLayoutManager, in the method getNextBreakPoss();
the code at the moment is:

  ...
  if (true) {
  //if ((iBPcount = findBreakingPoints(currPar, context.getStackLimit().opt, maxAdjustment)) == 0) {
  ...

The commented line should be uncommented, and the if (true) { removed.
Anyway, the program works all the same.

Sorry for the oversight!
Luca


DO NOT REPLY [Bug 29124] - New line breaking algorithm

2004-08-03 Thread bugzilla

--- Additional Comments From [EMAIL PROTECTED]  2004-08-03 13:32 ---
Thanks. You will need to add the comments in this patch to the *code*, as people 
will not always have the benefit of this Bugzilla entry when looking at it.  
This is extremely important, as layout is very complex.

Also, is this Knuth algorithm copyrighted?  Where did you get it from?  It is 
rare that we have classes named after authors.

Glen


DO NOT REPLY [Bug 29124] - New line breaking algorithm

2004-08-03 Thread bugzilla

--- Additional Comments From [EMAIL PROTECTED]  2004-08-03 14:42 ---
Thanks again for your work--we have very few who can work on layout.

Apparently, Dr. Knuth wouldn't seem to mind our using his algorithms:
http://lpf.ai.mit.edu/Patents/knuth-to-pto.txt


DO NOT REPLY [Bug 29124] - New line breaking algorithm

2004-08-03 Thread bugzilla

--- Additional Comments From [EMAIL PROTECTED]  2004-08-03 15:00 ---
Luca, I also thank you for your time and commitment. Grazie mille^H^H^H^H^H un 
miliardo! I'm sure it is very much appreciated by the other COMMITTERs, as well 
as the throngs who will benefit from your time and energy in the future. This is 
a very exciting addition to FOP, and I'm hoping it will help to simplify the 
code in other ways as well. It's really nice to have a multitude of people who 
'capish' (grok) the inner workings of FOP.

Web Maestro Clay


DO NOT REPLY [Bug 29124] - New line breaking algorithm

2004-05-21 Thread bugzilla

--- Additional Comments From [EMAIL PROTECTED]  2004-05-21 17:21 ---
Created an attachment (id=11625)
LineLayoutManager.java


DO NOT REPLY [Bug 29124] - New line breaking algorithm

2004-05-21 Thread bugzilla

--- Additional Comments From [EMAIL PROTECTED]  2004-05-21 17:22 ---
Created an attachment (id=11626)
TextLayoutManager.java


DO NOT REPLY [Bug 29124] - New line breaking algorithm

2004-05-20 Thread bugzilla

--- Additional Comments From [EMAIL PROTECTED]  2004-05-20 15:31 ---
Created an attachment (id=11602)
patch to existing files


DO NOT REPLY [Bug 29124] - New line breaking algorithm

2004-05-20 Thread bugzilla

--- Additional Comments From [EMAIL PROTECTED]  2004-05-20 15:31 ---
Created an attachment (id=11603)
KnuthElement.java


DO NOT REPLY [Bug 29124] - New line breaking algorithm

2004-05-20 Thread bugzilla

--- Additional Comments From [EMAIL PROTECTED]  2004-05-20 15:32 ---
Created an attachment (id=11604)
KnuthBox.java


DO NOT REPLY [Bug 29124] - New line breaking algorithm

2004-05-20 Thread bugzilla

--- Additional Comments From [EMAIL PROTECTED]  2004-05-20 15:32 ---
Created an attachment (id=11605)
KnuthGlue.java


DO NOT REPLY [Bug 29124] - New line breaking algorithm

2004-05-20 Thread bugzilla

--- Additional Comments From [EMAIL PROTECTED]  2004-05-20 15:32 ---
Created an attachment (id=11606)
KnuthPenalty.java


DO NOT REPLY [Bug 29124] - New line breaking algorithm

2004-05-20 Thread bugzilla

--- Additional Comments From [EMAIL PROTECTED]  2004-05-20 15:33 ---
Created an attachment (id=11607)
KnuthPossPosIter.java


DO NOT REPLY [Bug 29124] - New line breaking algorithm

2004-05-20 Thread bugzilla

--- Additional Comments From [EMAIL PROTECTED]  2004-05-20 15:34 ---
Created an attachment (id=11608)
fo test file (text-indent)


DO NOT REPLY [Bug 29124] - New line breaking algorithm

2004-05-20 Thread bugzilla

--- Additional Comments From [EMAIL PROTECTED]  2004-05-20 15:34 ---
Created an attachment (id=11609)
fo test file (text-align-last)


DO NOT REPLY [Bug 29124] - New line breaking algorithm

2004-05-20 Thread bugzilla

--- Additional Comments From [EMAIL PROTECTED]  2004-05-20 15:35 ---
Created an attachment (id=11610)
fo test file (word-spacing and letter-spacing)


DO NOT REPLY [Bug 29124] - New line breaking algorithm

2004-05-20 Thread bugzilla

--- Additional Comments From [EMAIL PROTECTED]  2004-05-20 15:36 ---
Created an attachment (id=11611)
fo test file (long paragraphs of text)