DO NOT REPLY [Bug 46048] Wrong images used (how to clear image cache?)

2008-10-23 Thread bugzilla
https://issues.apache.org/bugzilla/show_bug.cgi?id=46048





--- Comment #18 from Jeremias Maerki [EMAIL PROTECTED]  2008-10-23 04:48:08 PST ---
Created an attachment (id=22771)
 -- (https://issues.apache.org/bugzilla/attachment.cgi?id=22771)
Proposed patch against FOP Trunk for URI pre-resolution

As promised I've looked into it (had some precious train time yesterday). I
found two possible approaches to pre-resolve the URI relative to a base URI,
so the image cache gets more absolute URIs. My first attempt was to build that
into the image loading framework but that caused a lot of changes (even API
changes). The second attempt is less invasive but needs changes in more than
one place in FOP (ExternalGraphic and all renderers). To illustrate this I've
just patched the PDFRenderer for the moment.

The patch uses java.net.URI (since Java 1.4) to do the URI resolution (using
URI.resolve(URI), not JAXP-style URIResolver resolution!). That seems to do the
job just fine. I'm not 100% sure this is ultimately the right approach which is
why I'm just posting a proposed patch here rather than doing the change
directly. The change itself should be pretty safe because if there's a problem
parsing the URI, the original URI is simply returned. Only relative URIs should
be affected.
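
A minimal sketch of the pre-resolution described above, for illustration only. It is not the attached patch: the class and method names are hypothetical, and only the use of java.net.URI.resolve(URI) and the fall-back to the original URI on a parsing problem are taken from the comment.

```java
import java.net.URI;
import java.net.URISyntaxException;

// Hypothetical helper; the names are illustrative and not taken from the patch.
class UriPreResolver {

    /**
     * Resolves uri against baseUri with java.net.URI.resolve(URI).
     * If either string cannot be parsed as a URI, the original uri is
     * returned unchanged. Absolute URIs come back unchanged as well,
     * so only relative URIs are affected.
     */
    static String preResolve(String uri, String baseUri) {
        if (uri == null || baseUri == null) {
            return uri; // no base URI set: nothing to resolve against
        }
        try {
            URI base = new URI(baseUri);
            return base.resolve(new URI(uri)).toString();
        } catch (URISyntaxException e) {
            return uri; // parsing problem: fall back to the original URI
        }
    }
}
```

A URI containing a space (for example "my chart.svg") fails to parse and is therefore passed through as is.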

The patch requires the Base URI (FOUserAgent.setBaseURL(String)) to be set for
the document. From the command-line this will be done automatically (the source
file's directory is used).

Feedback and further ideas welcome.


-- 
Configure bugmail: https://issues.apache.org/bugzilla/userprefs.cgi?tab=email
--- You are receiving this mail because: ---
You are the assignee for the bug.


DO NOT REPLY [Bug 46048] Wrong images used (how to clear image cache?)

2008-10-23 Thread bugzilla
https://issues.apache.org/bugzilla/show_bug.cgi?id=46048


Jeremias Maerki [EMAIL PROTECTED] changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|RESOLVED                    |REOPENED
         Resolution|WORKSFORME                  |




--- Comment #19 from Jeremias Maerki [EMAIL PROTECTED]  2008-10-23 04:56:12 PST ---
BTW, just to explain what happens with this patch:

If you have src=chart.svg on your external-graphic and the base URI is
file:/C:/reports/321cb123db23/, the image loader framework receives as URI:
file:/C:/reports/321cb123db23/chart.svg. Before the patch it would only
receive chart.svg.
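
To see why this matters for the image cache, here is a small sketch. It assumes the cache is keyed by the URI string the image loader receives (a simplification, not FOP's actual cache code). The class name and the second report directory are made up for the example.

```java
import java.net.URI;
import java.util.LinkedHashSet;
import java.util.Set;

// Illustration only: models the cache keys as a set of URI strings.
class CacheKeyDemo {

    /**
     * Returns the distinct cache keys produced when the same src is
     * referenced from documents with different base URIs.
     */
    static Set<String> cacheKeys(String src, boolean preResolve, String... baseUris) {
        Set<String> keys = new LinkedHashSet<>();
        for (String base : baseUris) {
            // Without pre-resolution the loader only ever sees the relative src.
            keys.add(preResolve ? URI.create(base).resolve(src).toString() : src);
        }
        return keys;
    }

    public static void main(String[] args) {
        String a = "file:/C:/reports/321cb123db23/";
        String b = "file:/C:/reports/hypothetical-other-report/"; // made-up directory
        System.out.println(cacheKeys("chart.svg", false, a, b)); // one shared key
        System.out.println(cacheKeys("chart.svg", true, a, b));  // one key per report
    }
}
```

With pre-resolution off, both reports map to the single key chart.svg, so the second report could be served the first report's image. With it on, each report gets its own key.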

Maybe it would actually be better to delay the pre-resolution as long as
possible, i.e. to do it inside the image loader framework (my first approach).
But it would still require a change for ExternalGraphic and all renderers
because the currently applicable base URI is passed along.

If someone wanted to go even further, support for xml:base
(http://www.w3.org/TR/xmlbase/) could be added to FOP to override the base URI
for certain elements. That shouldn't be too hard to implement.
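
A rough sketch of what xml:base support would amount to, assuming the usual XML Base rule that an element's base URI is its xml:base value resolved against the inherited base URI. The class and method names are hypothetical, and nothing like this exists in the patch.

```java
import java.net.URI;

// Hypothetical sketch of xml:base handling; not part of FOP or the patch.
class XmlBaseSketch {

    /**
     * Returns the base URI in effect for an element: the inherited base
     * if the element has no xml:base attribute, otherwise the attribute
     * value resolved against the inherited base.
     */
    static URI effectiveBase(URI inheritedBase, String xmlBase) {
        return (xmlBase == null) ? inheritedBase : inheritedBase.resolve(xmlBase);
    }
}
```

For example, with a document base of file:/C:/reports/ and xml:base="321cb123db23/" on an enclosing element, src=chart.svg would resolve to file:/C:/reports/321cb123db23/chart.svg.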




Re: Active node tree pruning in FOP trunk

2008-10-23 Thread Simon Pepping
Hi Dario,

This is an interesting study. I need some more time to understand the
implications fully. At first sight you prove that for normal documents
the improvement is small. The paragraphs need to get long before your
strategy makes a difference. This is interesting, however, for long
chapters with many pages, as you mentioned in your earlier email.

It is clear why long paragraphs make a difference. Why does one- or
two-column layout make a large difference? Simply due to the twice
larger number of pages? I do not understand the left-aligned case. Is
this not just the same as a first-fit layout?

A more theoretical measurement would be the maximum number of active
nodes.

Regards, Simon

-- 
Simon Pepping
home page: http://www.leverkruid.eu


DO NOT REPLY [Bug 46048] Wrong images used (how to clear image cache?)

2008-10-23 Thread bugzilla
https://issues.apache.org/bugzilla/show_bug.cgi?id=46048





--- Comment #20 from M.H. [EMAIL PROTECTED]  2008-10-23 14:37:44 PST ---
Wow, Jeremias! Thanks for working on this! I guess I have to find out how to
get the latest development version of FOP and how to compile it. I would like
to see whether your patch fixes my specific problem ...





Re: Active node tree pruning in FOP trunk

2008-10-23 Thread Dario Laera

Hi Simon,

thanks for your reply.

On 23 Oct 2008, at 21:43, Simon Pepping wrote:


> Hi Dario,
>
> This is an interesting study. I need some more time to understand the
> implications fully. At first sight you prove that for normal documents
> the improvement is small. The paragraphs need to get long before your
> strategy makes a difference. This is interesting, however, for long
> chapters with many pages, as you mentioned in your earlier email.


At the moment I prefer to talk about paragraphs only: in the tests I ran
today I saw that for page breaking there is always just one active
node. So it's clear why formatting the XSL-FO recommendation, which is
over 400 pages long but has short paragraphs, doesn't get faster. I need
to investigate this area.



> It is clear why long paragraphs make a difference. Why does one- or
> two-column layout make a large difference? Simply due to the twice
> larger number of pages? I do not understand the left-aligned case. Is
> this not just the same as a first-fit layout?


Nice questions... I'm trying to understand this behavior too. The
first time I implemented the pruning in the prototype it was for another
reason, and I accidentally noticed the performance boost :)
About one or two columns, or better, long or short lines: again, I
don't know why; maybe it's just because of the doubled number of breaks.
One thing I noted is that, for the same number of active nodes, the gap
between startLine and endLine is wider with shorter lines than with long
lines. I don't know if this is meaningful.
About left-aligned or justified: with the latter, *sometimes*
threshold=1.0 is enough (I think because of stretchable glue), so
obviously the number of active nodes is reduced, while the former
always falls through to threshold=20.0 and force mode (talking about my
tests). Anyway, while I'm not sure short/long lines really make a
difference, it's evident that non-justified text produces a lot more
active nodes than justified text.
I hope to give you some decent answers in the next few days. Precise
answers faster than mine would also be appreciated :P



> A more theoretical measurement would be the maximum number of active
> nodes.


In stat-nopruning.txt you will find the maximum number of active nodes
for each paragraph without pruning (the max value); th is the threshold
and lines is the line count of the final layout. The last line for each
test file doesn't matter because it refers to page breaking.
Today I developed a kind of auto-activating/regulating pruning: when
the number of active nodes exceeds a threshold (I used 300), pruning
is activated and the treeDepth (TD) is chosen as the mean of
startLine and endLine. Initially I was setting TD to startLine, but
then I noticed that with short lines the pruning was activated when
startLine was 5 and endLine was 44 (!), so I decided that the mean was
a better choice. I can't explain how it's possible that the same text
can be laid out in 5 short lines (I'm talking about 2 columns on A4)
and in 44 lines...
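
The activation rule just described can be sketched as follows. This is a simplified illustration, not the code in the prototype: the class and method names are hypothetical, and rounding the mean down (integer division) is an assumption. That assumption agrees with the "Active pruning" lines in the statistics attached to this message (sl = 59, el = 93 gives td = 76; sl = 5, el = 44 gives td = 24).

```java
// Hypothetical sketch of the auto-activation rule; not the prototype's code.
class AutoPruning {

    /** Activation threshold for the number of active nodes (300 in the tests). */
    static final int ACTIVE_NODE_THRESHOLD = 300;

    /**
     * Returns the treeDepth to prune at, or -1 if the number of active
     * nodes has not exceeded the threshold and pruning stays inactive.
     * The depth is the mean of startLine and endLine, rounded down
     * (the rounding direction is an assumption).
     */
    static int chooseTreeDepth(int activeNodeCount, int startLine, int endLine) {
        if (activeNodeCount <= ACTIVE_NODE_THRESHOLD) {
            return -1;
        }
        return (startLine + endLine) / 2;
    }
}
```

The subsequent "REDUCE pruning" adjustments in the statistics follow a rule that is not described here, so the sketch does not cover them.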

You can find statistics from auto pruning in the other file attached.

I will try to produce accurate graphs that outline the trends of the
variables, hoping that will help in understanding some of these behaviors.


Dario


##
# max = max value for activeNodeCount
# sl = startLine
# el = endLine
# line = line number of the node that has exceeded the activeNodeCount
# threshold
# td = the treeDepth to be used
#
## Trasform fo/my_franklin_rep-1blk-2c-jus.fo without pruning
Active pruning    max = 301   sl = 59   el = 93    line = 66   td = 76
REDUCE pruning    max = 338   sl = 76   el = 117   line = 78   td = 50
findBreakinPoints max = 368   th = 20.0   lines = 544   forced
findBreakinPoints max = 1     th = 1.0    lines = 15    forced
   30.06 real 7.92 user 0.73 sys

## Trasform fo/my_franklin_rep-1blk-2c.fo without pruning
Active pruning    max = 301   sl = 5    el = 44   line = 24   td = 24
REDUCE pruning    max = 301   sl = 30   el = 65   line = 56   td = 16
REDUCE pruning    max = 302   sl = 30   el = 65   line = 57   td = 10
REDUCE pruning    max = 301   sl = 35   el = 67   line = 63   td = 6
REDUCE pruning    max = 302   sl = 35   el = 67   line = 64   td = 4
findBreakinPoints max = 1446  th = 20.0   lines = 561   forced
findBreakinPoints max = 1     th = 1.0    lines = 16    forced
   31.04 real 8.07 user 0.74 sys

## Trasform fo/my_franklin_rep-1blk-jus.fo without pruning
findBreakinPoints max = 61    th = 1.0    lines = 240
findBreakinPoints max = 1     th = 1.0    lines = 7     forced
   28.88 real 7.05 user 0.72 sys

## Trasform fo/my_franklin_rep-1blk.fo without pruning
Active pruning max =