From: "Keiron Liddle" <[EMAIL PROTECTED]>

> On 2002.03.19 14:45 Nicola Ken Barozzi wrote:
> > I would consider the possibility (configurable) of having FOP make just
> > sensible assumptions to continue processing and sacrifice some things it
> > should do later.
>
> That sounds very vague.
> So what will you do if someone has a table of contents as the second page.
> Will you pass across a whole lot of <to be resolved things> for the page
> numbers to the renderer, so the renderer needs to do the resolving from
> some further information. Will the renderer go until it reaches a <to be
> resolved thing> then need to do some awkward processing of all the
> following pages.

I would output a "?" in place of the real page numbers.
While this is not generally acceptable, in reports, for example, these
references are usually not needed.
High memory consumption is *the*killer*, and it's a burden that
is not needed in these cases.
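
To make the "?" idea concrete, here is a minimal sketch, not FOP code. The class and method names (SinglePassPageRefs, define, pageNumberFor) are made up for illustration. It assumes we only ever know the page of an id that has already been laid out, so backward references resolve and forward ones fall back to "?".

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical single-pass resolver: known ids give a page number, unknown ones give "?". */
class SinglePassPageRefs {
    private final Map<String, Integer> pageOfId = new HashMap<>();
    private int unresolved = 0;

    /** Record the page on which an id was laid out. */
    void define(String id, int page) {
        pageOfId.put(id, page);
    }

    /** Text to print for a page-number reference at this point in the stream. */
    String pageNumberFor(String id) {
        Integer page = pageOfId.get(id);
        if (page == null) {
            unresolved++;   // forward reference: not seen yet, nothing is cached
            return "?";
        }
        return page.toString();
    }

    /** How many references were emitted as "?" so far. */
    int unresolvedCount() {
        return unresolved;
    }
}
```

Nothing is held back here, so memory stays flat; the counter could be used to warn the user that some references were sacrificed.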

> It all sounds like an extra level of complexity that we really don't need.
> Rather than helping us it will simply make things more difficult.

Good point. I will keep this in mind.

> > > - embedded xml will need to be parsed twice and saxified
> >
> > Why twice?
>
> May need to read it to get the width and height and do some pre-processing
> with a DOM.

It could equally be done by a sub-pipeline with SAX. But I understand that
Batik is still DOM-based :-(

> > ATM, I don't have a clear list of all the things that need to be held
> > back
> > before resolution.
> > Is there a list somewhere?
> > It would be of great help for me.

Since property resolution is basically inherited AFAIK, it seems that the
things you list here are in fact what really breaks the nice SAX stuff.

> Page references.

Yes.

> Internal links.

Maybe. In HTML it's not needed since I can write an internal link (#myref)
before specifying it.

> Retrieve Markers?

IMHO yes, since you can make a forward reference with it.

> Extensions.

Let's bypass extensions for now.


So we absolutely need to stop output and cache events if there is a forward
reference.
If the reference is only resolved at the end of the document, all the pages
must remain in memory, and now I see that this can make FOP behave no better
than it does now.

There are three possibilities I see:

1. Start storing the SAX events as soon as a forward reference is caught,
and flush them after the resolved reference.

2. Ignore the forward references (speed property); if someone doesn't use
any of the above features, there is no difference in output anyway.

3. Do the FOP processing in 2 steps.
   1- Process everything except the references, writing the result to disk
   2- Resolve the references by rereading the file
       This resolves memory issues but not speed.
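
A rough sketch of what option 3 could look like, under loud assumptions: the "[id:x]" and "[ref:x]" marker syntax, the one-line-per-page temp file and the TwoPassResolver name are all invented here for illustration and have nothing to do with FOP's real area tree.

```java
import java.io.BufferedReader;
import java.io.BufferedWriter;
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.HashMap;
import java.util.Map;
import java.util.function.Consumer;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical two-step processing: spool pages to disk, then reread and fill in page numbers. */
class TwoPassResolver {
    // Made-up marker syntax: [id:name] defines a target, [ref:name] asks for its page number.
    private static final Pattern ID = Pattern.compile("\\[id:(\\w+)\\]");
    private static final Pattern REF = Pattern.compile("\\[ref:(\\w+)\\]");

    /** Assumes each page is a single line of text (no embedded newlines). */
    static void process(Iterable<String> pages, Consumer<String> sink) throws IOException {
        Path tmp = Files.createTempFile("pass1-", ".txt");
        Map<String, Integer> pageOfId = new HashMap<>(); // the only state kept in memory
        try {
            // Step 1: write every page to disk, remembering where each id landed.
            try (BufferedWriter w = Files.newBufferedWriter(tmp)) {
                int pageNo = 0;
                for (String page : pages) {
                    pageNo++;
                    Matcher m = ID.matcher(page);
                    while (m.find()) {
                        pageOfId.put(m.group(1), pageNo);
                    }
                    w.write(m.replaceAll("")); // drop the id markers, keep refs untouched
                    w.newLine();
                }
            }
            // Step 2: reread the file and replace each reference with its page number.
            try (BufferedReader r = Files.newBufferedReader(tmp)) {
                String line;
                while ((line = r.readLine()) != null) {
                    Matcher m = REF.matcher(line);
                    StringBuffer sb = new StringBuffer();
                    while (m.find()) {
                        Integer p = pageOfId.get(m.group(1));
                        m.appendReplacement(sb, p == null ? "?" : p.toString());
                    }
                    m.appendTail(sb);
                    sink.accept(sb.toString()); // hand the finished page to the next stage
                }
            }
        } finally {
            Files.deleteIfExists(tmp);
        }
    }
}
```

Only the id-to-page map stays in memory, but every page is written and read once more, which is the speed cost mentioned above.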

I would go for 1 and 2.
If we store the SAX events before firing them, they are smaller than DOM and
can be saved to a Store that can also be a disk in case of low memory.
This is how Cocoon caches pages, and this is how we could cache SAX
fragments that are there just to wait for a forward reference.
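
As a minimal sketch of option 1 (not Cocoon's or FOP's actual code), the handler below passes SAX events straight through until hold() is called, then queues them and replays them on release(). ForwardRefBuffer, hold() and release() are hypothetical names, and the in-memory list stands in for a Store that could just as well spill to disk. Only three event types are covered to keep it short.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import org.xml.sax.Attributes;
import org.xml.sax.ContentHandler;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.AttributesImpl;
import org.xml.sax.helpers.DefaultHandler;

/** Hypothetical sketch: forwards SAX events, or queues them while a forward reference is pending. */
class ForwardRefBuffer extends DefaultHandler {
    /** One recorded SAX event that can be replayed later. */
    interface Event {
        void replay(ContentHandler target) throws SAXException;
    }

    private final ContentHandler downstream;
    private final List<Event> pending = new ArrayList<>(); // stand-in for a memory/disk Store
    private boolean holding = false;

    ForwardRefBuffer(ContentHandler downstream) {
        this.downstream = downstream;
    }

    /** Call when an unresolved forward reference is caught. */
    void hold() {
        holding = true;
    }

    /** Call once the reference is resolved: flush the queued events in order. */
    void release() throws SAXException {
        for (Event e : pending) {
            e.replay(downstream);
        }
        pending.clear();
        holding = false;
    }

    int pendingCount() {
        return pending.size();
    }

    private void emit(Event e) throws SAXException {
        if (holding) {
            pending.add(e);
        } else {
            e.replay(downstream);
        }
    }

    @Override
    public void startElement(String uri, String local, String qName, Attributes atts)
            throws SAXException {
        Attributes copy = new AttributesImpl(atts); // parsers may reuse the Attributes object
        emit(t -> t.startElement(uri, local, qName, copy));
    }

    @Override
    public void endElement(String uri, String local, String qName) throws SAXException {
        emit(t -> t.endElement(uri, local, qName));
    }

    @Override
    public void characters(char[] ch, int start, int length) throws SAXException {
        char[] copy = Arrays.copyOfRange(ch, start, start + length); // buffer may be reused too
        emit(t -> t.characters(copy, 0, copy.length));
    }
}
```

If no forward reference ever shows up, hold() is never called and the handler adds almost nothing on top of plain SAX streaming.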

We can also give users a clear indication of how to optimize pages: no
forward references = much less memory = higher speed.

Anyway, these forward references are a pain in the ass :-/

--
Nicola Ken Barozzi                   [EMAIL PROTECTED]
            - verba volant, scripta manent -
   (discussions get forgotten, just code remains)