Re: Q: Line-layout - separating element-list creation / line-breaking

2006-08-24 Thread Andreas L Delmelle

On Aug 23, 2006, at 21:59, Andreas L Delmelle wrote:


<snip />
The other consideration, but that would be for a more distant  
future, is to be able to have three different threads:

- fo creation (a)
- base layoutengine initialization (b)
- actual breaking/layout (c)


A bit more elaboration:
This could actually be done in a single thread, chopping the process  
up into discrete parts, and bouncing control from (a) to (b) to (c)  
and back.
Implementing this with multiple threads would be much trickier, since  
the access to all the lists needs to be synchronized.
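
A minimal sketch of the single-threaded variant, for illustration only. The class SteppedPipeline, its string "nodes" and the chunk parameter are hypothetical and not FOP code; the point is merely that control passes from (a) to (b) to (c) and back, so the shared lists need no synchronization.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

/** Hypothetical three-stage pipeline run cooperatively in one thread. */
class SteppedPipeline {
    private final Deque<String> foNodes = new ArrayDeque<>();   // filled by (a)
    private final Deque<String> elements = new ArrayDeque<>();  // filled by (b)
    final List<String> log = new ArrayList<>();                 // output of (c)

    /** Processes the input chunk by chunk: (a), then (b), then (c), and back to (a). */
    void run(List<String> input, int chunk) {
        if (chunk <= 0) {
            throw new IllegalArgumentException("chunk must be positive");
        }
        int next = 0;
        while (next < input.size()) {
            // (a) fo creation: consume at most 'chunk' input items
            for (int i = 0; i < chunk && next < input.size(); i++) {
                foNodes.add(input.get(next++));
            }
            // (b) layout initialization: turn pending FO nodes into elements
            while (!foNodes.isEmpty()) {
                elements.add("el(" + foNodes.poll() + ")");
            }
            // (c) breaking/layout: consume the elements gathered so far
            while (!elements.isEmpty()) {
                log.add("laid out " + elements.poll());
            }
        }
    }
}
```

With a chunk size of 2 and three input items, stage (a) runs twice; each time the other two stages drain what it produced before control returns.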


The main difference between total-fit and first-fit would be as to  
how far all the lists ultimately grow. Even then, suppose that we  
have computed the possible breaks for the first ten pages, what is  
the probability that even a total-fit algorithm would need to revisit  
the scores?


The direction I'm thinking in lies somewhere between first- and
total-fit. For relatively short sequences --say, 20 pages at most-- FOP's
default behavior would offer the same result as a real total-fit.
What we also have to keep in mind, though, is the possibility of
arbitrarily large sequences. An unconditional total-fit might become
next to impossible to deal with on systems with average memory specs.
I'm thinking this could be controlled by a configuration option: the
maximum number of pages that is considered before the finishing part
of the layout phase is triggered.
A threshold of zero would mean that the areas are added immediately
upon reaching the first break --literal first-fit-- and there is
neither a need nor a possibility to revisit the breaks on earlier
pages. Setting this threshold to, say, 500 would make sure that one
always achieves the best possible layout for 500 subsequent pages,
provided that memory constraints allow it.
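
To illustrate how such a threshold might behave, here is a rough sketch. The class PageWindow and the maxPages parameter are hypothetical names, not an existing FOP option, and pages are represented by plain strings: at most maxPages pages stay open for revision, and older ones are finalized.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

/** Hypothetical sketch: keeps at most maxPages pages open before finalizing the oldest. */
class PageWindow {
    private final int maxPages;                                 // assumed configuration option
    private final Deque<String> pending = new ArrayDeque<>();   // breaks may still be revisited
    private final List<String> finished = new ArrayList<>();    // areas already added

    PageWindow(int maxPages) {
        this.maxPages = maxPages;
    }

    /** Registers a newly computed page; pages falling outside the window are finalized. */
    void addPage(String page) {
        pending.addLast(page);
        while (pending.size() > maxPages) {
            finished.add(pending.removeFirst());
        }
    }

    /** End of the page-sequence: everything still pending is finalized. */
    void finish() {
        while (!pending.isEmpty()) {
            finished.add(pending.removeFirst());
        }
    }

    int pendingCount() {
        return pending.size();
    }

    List<String> finishedPages() {
        return finished;
    }
}
```

With a threshold of zero every page is finalized as soon as it is added, which corresponds to the literal first-fit case; a large threshold keeps that many pages available for reconsideration.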


A total-fit algorithm should not be confused with performing
everything in one big loop, though. While the needed context is much
larger, it can still be gathered piece by piece. Only certain decisions
need to be deferred until the complete/total context is known.


The important part is that the opportunity would be opened for any  
type of algorithm to release parts of the FOTree and the layout tree  
earlier, by constraining the window to a maximum number of pages.


The question is whether the base element list creation could be  
considered a part of the layout-initialization process, or whether it  
belongs in the actual layout phase. Auto-layout for tables is one  
particular area where it may prove interesting to consider the base  
element list creation as a preparatory step rather than as a part of  
the layout process itself.


Maybe there are others?


Later,

Andreas



DO NOT REPLY [Bug 40308] New: - RFE: FOP throws a validation exception when it finds duplicate IDs in the XSL-FO

2006-08-24 Thread bugzilla
DO NOT REPLY TO THIS EMAIL, BUT PLEASE POST YOUR BUG
RELATED COMMENTS THROUGH THE WEB INTERFACE AVAILABLE AT
http://issues.apache.org/bugzilla/show_bug.cgi?id=40308.
ANY REPLY MADE TO THIS MESSAGE WILL NOT BE COLLECTED AND
INSERTED IN THE BUG DATABASE.

http://issues.apache.org/bugzilla/show_bug.cgi?id=40308

   Summary: RFE: FOP throws a validation exception when it finds
duplicate IDs in the XSL-FO
   Product: Fop
   Version: 0.92
  Platform: Other
OS/Version: other
Status: NEW
  Severity: enhancement
  Priority: P2
 Component: fo tree
AssignedTo: fop-dev@xmlgraphics.apache.org
ReportedBy: [EMAIL PROTECTED]
CC: [EMAIL PROTECTED]


FOP 0.92beta throws a ValidationException if it finds that several XSL-FOs
  have the same ID, even when the strict-validation parameter has been
  set to false.

  Modular documents built using XInclude often contain duplicate IDs. For
  example, a modular document may contain three instances of the same table
  found at three different places in the document. There is no error that
  could be fixed in the source XML in such a case.

  In all cases, IMHO, it does not make sense for an XSL-FO formatter such as
  FOP to completely stop working when it finds a validity error as benign as a
  duplicate ID.

Changes made in FOP 0.92beta's src/java/org/apache/fop/fo/FObj.java: 
added test if (getUserAgent().validateStrictly()).

===
protected void checkId(String id) throws ValidationException {
    if (!id.equals("")) {
        Set idrefs = getFOEventHandler().getIDReferences();
        if (!idrefs.contains(id)) {
            idrefs.add(id);
        } else {
            // Added test: only reject a duplicate ID under strict validation.
            if (getUserAgent().validateStrictly()) {
                throw new ValidationException(
                    "Property id \"" + id
                    + "\" previously used; id values must be unique"
                    + " in document.", locator);
            }
        }
    }
}
===

-- 
Configure bugmail: http://issues.apache.org/bugzilla/userprefs.cgi?tab=email
--- You are receiving this mail because: ---
You are the assignee for the bug, or are watching the assignee.


DO NOT REPLY [Bug 40271] - auto table layout -- dirty draft

2006-08-24 Thread bugzilla

http://issues.apache.org/bugzilla/show_bug.cgi?id=40271


[EMAIL PROTECTED] changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
  Attachment #18739|0                           |1
        is obsolete|                            |




--- Additional Comments From [EMAIL PROTECTED]  2006-08-24 16:02 ---
Created an attachment (id=18749)
 -- (http://issues.apache.org/bugzilla/attachment.cgi?id=18749&action=view)
New patch now avoiding static variables for min and max widths

This patch doesn't make use of the static variables in table-helper anymore.
Unfortunately it still uses the TableHelper.calculateMode static variable but I
will be removing that soon.

I'd rather update frequently even if there are upcoming changes.

Cheers,

Patrick



Re: svn commit: r433810 - in /xmlgraphics/fop/trunk/src/java/org/apache/fop/fo: FOText.java FObjMixed.java

2006-08-24 Thread Andreas L Delmelle

On Aug 24, 2006, at 15:08, Chris Bowditch wrote:

Hi Chris,


<snip />
Well I'm not sure if it's one of your recent changes that's to blame,
but I've just noticed that simple tables (test file attached)
appear to be issuing a SEVERE message to the log, i.e.


24-Aug-2006 14:04:41 org.apache.fop.datatypes.LengthBase getBaseLength
SEVERE: getBaseLength called without context


Just checked this, and it seems that this has to do with the  
percentage ipd. Somehow, propertyList.get().getLengthRange() tries to  
resolve the percentage at that point, but does not have any context  
yet to do so...


I'll look a bit deeper into it. See if we can avoid the error message  
somehow.
This could have been there for quite some time. The percentage  
resolution is attempted again during layout, and won't give an error  
message there, because the context is available. So the tests would  
pass.



Later,

Andreas



Re: Q: Line-layout - separating element-list creation / line-breaking

2006-08-24 Thread Simon Pepping
On Wed, Aug 23, 2006 at 09:59:14PM +0200, Andreas L Delmelle wrote:
> On Aug 23, 2006, at 21:16, Patrick Paul wrote:
>
> Simon's correct. I'll clarify a bit more. From what I could tell,
> Patrick's patch has added code at the right places, it only seems a
> bit awkward to me to import stuff related to table-layout into the
> LineLayoutManager (even if it is nothing more than a TableHelper).

I had not realized you were talking about tables. I thought it was
connected with the need to find a layout solution for page sequences
with different page widths.

> (Note: this part was not wrong per se in the patch. It just seems
> there currently is no other way, precisely because list-creation and
> breaking are performed in the same method. The only place where it
> can be entered is somewhere between those two statements --
> collectInlineKnuthElements() and createLineBreaks().)
>
> A more correct approach --a matter of taste?-- would be for the
> TableContentLM to 'collect' the accumulated list of its descendant
> LMs, perform the min/max-width calculation, update the LayoutContext,
> and send it back down to the LineBreaker.
>
> Maybe someone sees another approach that I'm overlooking?

The LineLM is the master of the inline content, so to say, so it
governs the line breaking process. Therefore it may be the right
object to be requested to calculate the width of the content. But
maybe it would be better to add a getTotalWidth method to the
Paragraph.
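
As a rough illustration of what such a width calculation over an element list could look like: the classes Elem and WidthCalc below are hypothetical, heavily simplified stand-ins for Knuth elements and not FOP's actual classes. The sketch treats only glue as a break opportunity and ignores penalty widths, stretch and shrink.

```java
import java.util.List;

/** Hypothetical, simplified Knuth element; FOP's real element classes differ. */
class Elem {
    enum Kind { BOX, GLUE, PENALTY }

    final Kind kind;
    final int width;

    Elem(Kind kind, int width) {
        this.kind = kind;
        this.width = width;
    }
}

class WidthCalc {
    /** Max width: natural width of all boxes and glues laid out on a single line. */
    static int totalWidth(List<Elem> elems) {
        int sum = 0;
        for (Elem e : elems) {
            if (e.kind != Elem.Kind.PENALTY) {
                sum += e.width;
            }
        }
        return sum;
    }

    /** Min width: widest run of boxes between break opportunities (only glue, here). */
    static int minWidth(List<Elem> elems) {
        int widest = 0;
        int run = 0;
        for (Elem e : elems) {
            if (e.kind == Elem.Kind.BOX) {
                run += e.width;
                widest = Math.max(widest, run);
            } else if (e.kind == Elem.Kind.GLUE) {
                run = 0;  // a break is possible here, so the run ends
            }
        }
        return widest;
    }
}
```

Whichever object ends up owning such a method, the two values are what an auto table layout would need per cell: the narrowest width that still avoids overflow, and the width at which no line break is needed.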

Regards, Simon

-- 
Simon Pepping
home page: http://www.leverkruid.eu


Re: Q: Line-layout - separating element-list creation / line-breaking

2006-08-24 Thread Simon Pepping
On Thu, Aug 24, 2006 at 12:40:25PM +0200, Andreas L Delmelle wrote:
> On Aug 23, 2006, at 21:59, Andreas L Delmelle wrote:
>
> <snip />
> The other consideration, but that would be for a more distant
> future, is to be able to have three different threads:
> - fo creation (a)
> - base layoutengine initialization (b)
> - actual breaking/layout (c)
>
> A bit more elaboration:
> This could actually be done in a single thread, chopping the process
> up into discrete parts, and bouncing control from (a) to (b) to (c)
> and back.
> Implementing this with multiple threads would be much trickier, since
> the access to all the lists needs to be synchronized.

An interesting side effect of the total-fit solution to page breaking
is that no interaction is required between line and page breaking. The
two are independent and one is performed before the other. That
changes when the line layout depends on the page layout, e.g. due to
different page widths or due to side floats. This makes the
programming more complex.

The same problem would be introduced by your above process. Bouncing
control between line and page layout introduces an interaction between
the two.

Regards, Simon

-- 
Simon Pepping
home page: http://www.leverkruid.eu


DO NOT REPLY [Bug 40308] - RFE: FOP throws a validation exception when it finds duplicate IDs in the XSL-FO

2006-08-24 Thread bugzilla

http://issues.apache.org/bugzilla/show_bug.cgi?id=40308





--- Additional Comments From [EMAIL PROTECTED]  2006-08-24 19:10 ---
I always thought that uniqueness of ids is a constraint enforced by the XML
Specification. XSL-FO being subject to the constraints defined in the XML Rec.,
it should report this as an error...?

No, wait: the type of the id property should be <id>, which means, if you look
at the Rec (5.11 Property Datatypes), that it should be an NCName that is
unique... within the _stylesheet_?

Other opinions?



DO NOT REPLY [Bug 40308] - RFE: FOP throws a validation exception when it finds duplicate IDs in the XSL-FO

2006-08-24 Thread bugzilla

http://issues.apache.org/bugzilla/show_bug.cgi?id=40308





--- Additional Comments From [EMAIL PROTECTED]  2006-08-24 19:30 ---
I agree with Chris and Andreas. The XSL-FO spec is very clear about the
requirement of uniqueness of the values of the id attribute. You could obtain a
valid FO file by not propagating id attributes from your XML file to the FO
file. OTOH, I agree that it is a feature that we may choose to suppress under
lax validation, because the FO file is still quite renderable, apart from
linking to internal destinations. I definitely do not agree with the idea that
one should cope with illegal XSL-FOs due to multiply defined IDs. But if users
choose to do so, they should be given that freedom, I guess.




DO NOT REPLY [Bug 40271] - auto table layout -- dirty draft

2006-08-24 Thread bugzilla

http://issues.apache.org/bugzilla/show_bug.cgi?id=40271





--- Additional Comments From [EMAIL PROTECTED]  2006-08-24 21:34 ---
Hi Patrick,

Thanks for the updates.
Just to let you know: I'll look further into your patch ASAP. (If you've been
following the commit list, you'll notice I'm getting a bit tired ;)) Will be
back tomorrow.

Cheers,

Andreas



Re: svn commit: r433810 - in /xmlgraphics/fop/trunk/src/java/org/apache/fop/fo: FOText.java FObjMixed.java

2006-08-24 Thread Andreas L Delmelle

On Aug 24, 2006, at 19:45, Andreas L Delmelle wrote:


On Aug 24, 2006, at 15:08, Chris Bowditch wrote:

Hi Chris,


<snip />
Well I'm not sure if it's one of your recent changes that's to
blame, but I've just noticed that simple tables (test file
attached) appear to be issuing a SEVERE message to the log, i.e.


24-Aug-2006 14:04:41 org.apache.fop.datatypes.LengthBase getBaseLength
SEVERE: getBaseLength called without context


Just checked this, and it seems that this has to do with the  
percentage ipd. Somehow, propertyList.get().getLengthRange() tries  
to resolve the percentage at that point, but does not have any  
context yet to do so...


Got it. Was introduced with checking bpd/ipd for negative values  
(r433385).


Should be fixed now.

Cheers,

Andreas


DO NOT REPLY [Bug 40308] - RFE: FOP throws a validation exception when it finds duplicate IDs in the XSL-FO

2006-08-24 Thread bugzilla

http://issues.apache.org/bugzilla/show_bug.cgi?id=40308


[EMAIL PROTECTED] changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|RESOLVED                    |CLOSED




--- Additional Comments From [EMAIL PROTECTED]  2006-08-24 20:35 ---
OK. I added this to the codebase. If strict validation is turned off, FOP will
issue a warning to the logger and continue processing.
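
For readers following along, the described behaviour can be sketched roughly as follows. This is not the committed code: IdChecker is a hypothetical, self-contained stand-in that collects warnings in a list and throws IllegalStateException, where FOP itself uses its logger and ValidationException.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

/** Hypothetical stand-in for the ID check; not the code from the commit. */
class IdChecker {
    private final Set<String> seen = new HashSet<>();
    private final List<String> warnings = new ArrayList<>();
    private final boolean strict;

    IdChecker(boolean strict) {
        this.strict = strict;
    }

    /** Records the id; a duplicate is an error when strict, a warning otherwise. */
    void checkId(String id) {
        if (id == null || id.isEmpty()) {
            return;                       // no id set on this object
        }
        if (seen.add(id)) {
            return;                       // first use of this id
        }
        String msg = "Property id \"" + id
                + "\" previously used; id values must be unique in document.";
        if (strict) {
            throw new IllegalStateException(msg);
        }
        warnings.add(msg);                // lax validation: warn and continue
    }

    List<String> warnings() {
        return warnings;
    }
}
```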

included in http://svn.apache.org/viewvc?rev=434513&view=rev



Re: Q: Line-layout - separating element-list creation / line-breaking

2006-08-24 Thread Andreas L Delmelle

On Aug 24, 2006, at 21:01, Simon Pepping wrote:


On Thu, Aug 24, 2006 at 12:40:25PM +0200, Andreas L Delmelle wrote:

This could actually be done in a single thread, chopping the process
up into discrete parts, and bouncing control from (a) to (b) to (c)
and back.
Implementing this with multiple threads would be much trickier, since
the access to all the lists needs to be synchronized.


An interesting side effect of the total-fit solution to page breaking
is that no interaction is required between line and page breaking. The
two are independent and one is performed before the other. That
changes when the line layout depends on the page layout, e.g. due to
different page widths or due to side floats. This makes the
programming more complex.


Indeed it would, but without necessarily increasing the computational
complexity...


Rough sketch:
Page-breaking initializes first, and prefetches say five blank pages.
From these it constructs one long context, call it one big page --or  
better: one region-body--, with ipd changes at a known set of  
coordinates.


This context is then passed to the FlowLM, and further down. If the  
LineLM is aware of its bp-coordinate, the LineBreaker will know about  
changes in ipd, but it does not need to know that it is a page-break,  
which it isn't at this point.


Coming back up means either overflow of the context, or no more content.
In the latter case control is handed over to the PageBreaker.
In the former case, for a total-fit solution the set of active nodes
is kept active, the next five pages are fetched, and the process
repeats, until the content runs out, at which point the PageBreaker
kicks in. A more memory-friendly algorithm could decide to run the
PageBreaker sooner, and continue the process with the last best node
as a starting-point.
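
The "one big region-body" idea could be sketched along these lines. LongContext and its methods are hypothetical names, not existing FOP classes: the context records at which bp-coordinate each prefetched page body starts, and a line breaker can ask for the ipd at its current bp-coordinate without knowing that a page boundary lies in between.

```java
import java.util.TreeMap;

/** Hypothetical sketch of one long context built from several prefetched pages. */
class LongContext {
    // start bp-coordinate of each page body -> ipd available from there on
    private final TreeMap<Integer, Integer> ipdByBpStart = new TreeMap<>();
    private int totalBpd = 0;

    /** Appends a page body with the given block- and inline-progression dimensions. */
    void appendPage(int bpd, int ipd) {
        if (bpd <= 0) {
            throw new IllegalArgumentException("bpd must be positive");
        }
        ipdByBpStart.put(totalBpd, ipd);
        totalBpd += bpd;
    }

    /** Returns the ipd at a bp-coordinate; signals overflow of the context otherwise. */
    int ipdAt(int bp) {
        if (bp < 0 || bp >= totalBpd) {
            throw new IllegalArgumentException("overflow of the context at bp=" + bp);
        }
        return ipdByBpStart.floorEntry(bp).getValue();
    }

    int totalBpd() {
        return totalBpd;
    }
}
```

An overflow at ipdAt would be the signal to fetch the next batch of pages and append them, or, for a more memory-friendly variant, to run the page breaker first.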



Cheers,

Andreas