Re: FOP memory consumption

2002-05-02 Thread Keiron Liddle
On 2002.05.01 16:34 Bernd Brandstetter wrote:
> So, I have two questions/suggestions:
> 1) Wouldn't it be possible to let FOP create the output in two steps like
> for instance (La)TeX does. Doing a dry run first only to calculate the
> page references, store them somewhere, and then produce the actual output
> in a second run.

As outlined on the following page, this is the approach we are heading for:
http://xml.apache.org/fop/design/optimise.html
This should make documents of any size possible.

> 2) Are there plans to port FOP to C/C++ sometime? I guess that at least
> part of the memory consumption is to be blamed on Java and IIRC the
> underlying Xerces and Xalan are already available as C++ versions, so why
> not FOP?

The difficulty in implementing FOP is not about the language. Using Java
gives us a large number of readily available services and reduces
debugging effort, so it is a logical choice (this doesn't rule out other
choices).

The real problem is dealing with the large number of elements, properties
and layout issues.
Once that problem is solved, the choice of language will be a more
relevant question.


Re: FOP memory consumption

2002-05-01 Thread J.Pietschmann
Bernd Brandstetter wrote:
> memory would have to be available on every box. From what I've read on the
> list, I'm sure this is due to excessive usage of forward references and
> large (partly nested) tables spanning multiple pages. However, this is an
> absolute requirement for our documentation.

In my experience, absolute requirements often go up in
smoke once you say how much they will cost. There are a few
cases where "page X of Y" is really required, for example for
certain legal documents, but more often than not it is required
just because people got used to it. Also, a TOC can often be put
at the end of the PDF; people can shuffle the paper themselves
if they need to. For people reading on screen, a TOC is almost
always of much less use than bookmarks anyway.
Of course, there are the hard cases of "see page X", where
there is no way to avoid forward references.

Even fairly big tables usually aren't nearly as much of a problem
as forward references (particularly those to the last page). However,
the bulk of the data necessary for rendering a table will
be freed only after the page where the table ends has been
rendered. That's the rationale behind the advice to avoid tables
spanning pages. In reports, where the rows are fairly regular, you
could try to implement your own pagination at the XSLT level.
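
A minimal sketch of what such XSLT-level pagination might look like,
assuming XSLT 1.0 and a made-up input of the form
<report><row><cell>..</cell>..</row>..</report>. The element names, the
rows-per-page parameter, the page size and the column width are all
assumptions for illustration, not anything FOP prescribes. Each chunk of
rows becomes its own table, so no single table spans pages:

```xml
<?xml version="1.0"?>
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:fo="http://www.w3.org/1999/XSL/Format">

  <!-- Assumed number of rows that fit on one page (must be 2 or more,
       because the chunk test below relies on "mod ... = 1"). -->
  <xsl:param name="rows-per-page" select="40"/>

  <!-- Hypothetical input root; page geometry is an assumption. -->
  <xsl:template match="/report">
    <fo:root>
      <fo:layout-master-set>
        <fo:simple-page-master master-name="page"
            page-height="29.7cm" page-width="21cm" margin="2cm">
          <fo:region-body/>
        </fo:simple-page-master>
      </fo:layout-master-set>
      <fo:page-sequence master-reference="page">
        <fo:flow flow-name="xsl-region-body">
          <!-- Visit only the first row of each chunk. -->
          <xsl:for-each select="row[position() mod $rows-per-page = 1]">
            <fo:table table-layout="fixed">
              <!-- Every chunk after the first starts on a new page. -->
              <xsl:if test="position() &gt; 1">
                <xsl:attribute name="break-before">page</xsl:attribute>
              </xsl:if>
              <!-- One column per cell of the chunk's first row;
                   the width is an assumed value. -->
              <xsl:for-each select="cell">
                <fo:table-column column-width="4cm"/>
              </xsl:for-each>
              <fo:table-body>
                <!-- This row plus the next rows-per-page - 1 siblings. -->
                <xsl:apply-templates select=". |
                    following-sibling::row[position() &lt; $rows-per-page]"/>
              </fo:table-body>
            </fo:table>
          </xsl:for-each>
        </fo:flow>
      </fo:page-sequence>
    </fo:root>
  </xsl:template>

  <xsl:template match="row">
    <fo:table-row>
      <xsl:for-each select="cell">
        <fo:table-cell>
          <fo:block><xsl:value-of select="."/></fo:block>
        </fo:table-cell>
      </xsl:for-each>
    </fo:table-row>
  </xsl:template>

</xsl:stylesheet>
```

This only works when the rows have a predictable height, so that a fixed
count per page is safe. An input without any rows is not handled here.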

> So, I have two questions/suggestions:
> 1) Wouldn't it be possible to let FOP create the output in two steps like
> for instance (La)TeX does. Doing a dry run first only to calculate the
> page references, store them somewhere, and then produce the actual output
> in a second run.

Yes, this has already been discussed a few times. You could
implement an extension element which writes the current page
number to an XML file, and use it wherever you have elements
that are referred to. For example, replace
  <fo:block id="img-21">...
by
  <fo:block><fox:writepage id="img-21"/>...
and then refer to this file using document() in the second
XSLT pass to put the page numbers into the result. For example,
instead of
  <xsl:template match="xref">
    <xsl:text>See page </xsl:text>
    <fo:page-number-citation ref-id="{@refid}"/>
  </xsl:template>
use
  <xsl:template match="xref">
    <xsl:text>See page </xsl:text>
    <xsl:value-of select="document('pgnum.xml')/*/page[@id=current()/@refid]"/>
  </xsl:template>
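
For illustration, the lookup in the second template would fit a
pgnum.xml of roughly the following shape. This is only an assumed
layout, since the extension element does not exist yet: the root element
name, the sample ids and the page numbers are made up, and the id is
assumed to be stored as an attribute with the page number as text.

```xml
<?xml version="1.0"?>
<!-- Hypothetical output of the first pass: one entry per referenced id. -->
<pages>
  <page id="img-21">17</page>
  <page id="tbl-4">32</page>
</pages>
```

With such a file, an xref element whose refid attribute is "img-21"
would pick up the text of the first page element in the second pass.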

> 2) Are there plans to port FOP to C/C++ sometime? I guess that at least
> part of the memory consumption is to be blamed on Java and IIRC the
> underlying Xerces and Xalan are already available as C++ versions, so why
> not FOP?

There is still a lot of potential for optimisation in
Java. Currently the effort is concentrated on making
FOP conformant to the spec and implementing all the necessary
features; streamlining will mostly be deferred until later.
Bear in mind that C/C++ is not necessarily better than Java with
regard to memory consumption, and can be significantly
harder to debug. And no amount of optimisation will fix
memory problems if you refer to the last page on the first.
Even the expensive commercial FO processors suffer from this.

Hope this clarifies some matters.

J.Pietschmann