AW: [RT] Proprietary extension to fo:external-graphic

2002-11-08 Thread J.U. Anderegg
Considering PDF only, I see prefabricated image XObjects as a very powerful
feature.

Extracting image XObjects from PDF files and storing them for use by the
renderer brings two advantages:

a) it saves as much CPU time and memory as possible

b) the user controls image representation/handling in PDFs.

Writing an extraction program and programming the renderer is straightforward,
and caching is already solved for PDF. What remains is to tell FOP how to handle
these external graphics.

Hansuli Anderegg

Sample image XObject to be inserted into PDF file by PDF renderer

21 0 obj                % adjust PDF object ID
<<
/Type /XObject
/Subtype /Image
/Name /Im6
/Length 89957           % original PNG file is 56 KB
/Width 256              % in pixels
/Height 256             % in pixels
/BitsPerComponent 8
/ColorSpace /DeviceRGB
/Filter [ /ASCII85Decode /FlateDecode ]
>>
stream
GarT@BuGTQ2\N




-
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, email: [EMAIL PROTECTED]




Re: [RT] Proprietary extension to fo:external-graphic

2002-11-07 Thread Oleg Tkachenko
J.Pietschmann wrote:


> I think it would be prudent to follow the same for
> fo:external-graphic and fo:color-profile, on the ground that
> FOs may be rendered out of order and, even more important, it is
> not clear whether multiple renderings of an external graphic in a
> static content, table header/footer or a marker should result in
> multiple access to the source. Unfortunately, the spec doesn't even
> mention this issue.


BTW, what about raising the issue on xsl-editors? A lot of things are definitely
implied, but the XSL spec actually says nothing about them.

> - Caching images across renderings definitely is an issue too (think of
>   the company logo in each page header in every document), but FOP shouldn't
>   solve this. I imagine a SourceResolver interface which gets an URL and
>   optional content type and returns a XMLReader/InputSource pair.
>   In case of binary image formats the default implementation returns a null
>   parser.
>   People who want to cache images across renderings can implement their
>   own resolver which can do the caching. The Cocoon crowd will certainly
>   rejoice (no more memory leaks due to FOP caching, access to Cocoon
>   caching and Cocoon internal pipelines and other advantages).


Good idea, worth adding to the feature request list so that it is not
forgotten.

> - Fine tuning: A single large image will block a lot of memory during
>   rendering. A possibility is a fox:cache=no control property. In order
>   to preserve semantics, a null image is cached for this URL, and an error
>   is generated in case it is attempted to render the image a second time.


I don't quite get it: why should an error be generated? What's wrong with
reloading an image each time it's referenced?

> - Dynamic URLs. In order to achieve this, we can extend the functions available
>   in property expressions by concat() and page-number().

This one looks dubious to me. Can we add any new functions to the core
library? Extension functions in a different namespace, like we are used to in
XPath, are certainly not allowed in XSL, as FunctionName here is an NCName in
contrast to a QName in XPath. One more fault in the spec, I think. :(
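
The NCName/QName distinction can be illustrated with a small Java check. This is only a rough sketch: the class and method names are invented for illustration, and the pattern is an ASCII-only approximation of the NCName production, not the full XML Namespaces definition.

```java
import java.util.regex.Pattern;

/** Illustrative helper (hypothetical, not part of FOP). */
class FunctionNames {
    // ASCII-only approximation of an NCName: a name without any colon.
    private static final Pattern NCNAME = Pattern.compile("[A-Za-z_][A-Za-z0-9_.-]*");

    /** True if the string is a plain, unprefixed name. */
    static boolean isNCName(String s) {
        return NCNAME.matcher(s).matches();
    }

    /** True if the string is either an NCName or prefix:local with both parts NCNames. */
    static boolean isQName(String s) {
        int colon = s.indexOf(':');
        if (colon < 0) {
            return isNCName(s);
        }
        return isNCName(s.substring(0, colon)) && isNCName(s.substring(colon + 1));
    }

    public static void main(String[] args) {
        // A prefixed function name is a QName but not an NCName.
        System.out.println("concat NCName?     " + isNCName("concat"));
        System.out.println("fox:concat NCName? " + isNCName("fox:concat"));
        System.out.println("fox:concat QName?  " + isQName("fox:concat"));
    }
}
```

A grammar that only admits NCName for function names therefore has no room for a prefixed name such as fox:concat().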

--
Oleg Tkachenko
eXperanto team
Multiconn Technologies, Israel





Re: [RT] Proprietary extension to fo:external-graphic

2002-11-07 Thread Jeremias Maerki

On Wed, 06 Nov 2002 23:06:53 +0100 J.Pietschmann wrote:
> <snip/>
> Conclusions and ideas so far:
> - FOP should cache external graphics during a rendering and by default
>   clear the cache afterwards.

ok. 

> - Caching images across renderings definitely is an issue too (think of
>   the company logo in each page header in every document), but FOP shouldn't
>   solve this. I imagine a SourceResolver interface which gets an URL and
>   optional content type and returns a XMLReader/InputSource pair.
>   In case of binary image formats the default implementation returns a null
>   parser.
>   People who want to cache images across renderings can implement their
>   own resolver which can do the caching. The Cocoon crowd will certainly
>   rejoice (no more memory leaks due to FOP caching, access to Cocoon
>   caching and Cocoon internal pipelines and other advantages).

But the SourceResolver approach will only let you cache the binary
representation of an image, quite often it still has to be decoded each
time it is used, which costs CPU power. Right?

> - Fine tuning: A single large image will block a lot of memory during
>   rendering. A possibility is a fox:cache=no control property. In order
>   to preserve semantics, a null image is cached for this URL, and an error
>   is generated in case it is attempted to render the image a second time.

So, I may not be so far off the mark after all.

> - Dynamic URLs. In order to achieve this, we can extend the functions available
>   in property expressions by concat() and page-number(). I believe both would
>   be welcome by many users for other purposes too (whether that's a good idea
>   is another matter). One of the possible concerns are usually name clashes
>   with future XSLFO extensions. Using prefixed identifiers like fox:concat()
>   would be a solution, I'm somewhat uneasy with using XML namespace mechanisms
>   within values for XML attributes. In fact, I think it's abuse, but I can't
>   offer much better ideas either.

I think you misunderstood what I meant by dynamic URLs. I call a URL
dynamic if the same URL can deliver different content with each
call. For example, something similar to your example: http://ts.com/get-time.cgi
Each time the URL gets called, it returns a different image showing a
clock giving the current time.

It's a problematic discussion, somewhat. We're talking about image
caching but there are at least two separate kinds:
- Caching of images inside one processing run (which I consider the
  renderer's duty to a certain degree. Of course, the layout engine has
  to determine the extents of the image before the renderer goes into
  action)
- Caching of images over a longer time and between processing runs. I
  agree that a specialized SourceResolver is a good thing but I still
  wonder about the decoding work.
  
I was primarily wondering about the second kind of caching, but the
discussion went strongly towards the first kind. Anyway, I'm still
somewhat unsure about all this.
  
Maybe we have to set up a new page in Bertrand's Wiki to create a little
specification for the image caching. This would also help as a
discussion base if we have to contact the XSL-FO WG as Oleg suggests.

Jeremias Maerki






Re: [RT] Proprietary extension to fo:external-graphic

2002-11-07 Thread Oleg Tkachenko
Jeremias Maerki wrote:


> But the SourceResolver approach will only let you cache the binary
> representation of an image, quite often it still has to be decoded each
> time it is used, which costs CPU power. Right?


I think so. But nevertheless that would be a cool feature. Consider this
real use case: one has an image stored in an application jar file. At the moment
I think FOP cannot handle such a case, but with a SourceResolver we can
delegate source URI resolving to the user, like URIResolver does, and one can
easily return some kind of stream, e.g.
new InputSource(getClass().getResourceAsStream("/path/to/foo.gif"))
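
A minimal sketch of such a user-supplied resolver follows. The SourceResolver interface is only proposed in this thread, so its signature here is an assumption, and the "classpath:" URI scheme is invented for the example; only org.xml.sax.InputSource and Class.getResourceAsStream are real APIs.

```java
import java.io.InputStream;
import org.xml.sax.InputSource;

/** Hypothetical interface; the name comes from the thread, the signature is assumed. */
interface SourceResolver {
    /** Returns a source for the URI, or null if this resolver cannot handle it. */
    InputSource resolve(String uri);
}

/** Resolves URIs of the invented form "classpath:/path/to/foo.gif" from the class path. */
class ClasspathSourceResolver implements SourceResolver {
    private static final String SCHEME = "classpath:";

    @Override
    public InputSource resolve(String uri) {
        if (uri == null || !uri.startsWith(SCHEME)) {
            return null; // not ours; the caller may fall back to default resolution
        }
        String path = uri.substring(SCHEME.length());
        InputStream in = ClasspathSourceResolver.class.getResourceAsStream(path);
        if (in == null) {
            return null; // no such resource in the jar or on the class path
        }
        InputSource source = new InputSource(in);
        source.setSystemId(uri); // keep the original URI for error messages
        return source;
    }
}
```

Returning null for unknown URIs mirrors the convention of javax.xml.transform.URIResolver, where null means "use the default behaviour".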

--
Oleg Tkachenko
eXperanto team
Multiconn Technologies, Israel





Re: [RT] Proprietary extension to fo:external-graphic

2002-11-07 Thread J.Pietschmann
Jeremias Maerki wrote:

>> - Caching images across renderings
>
> But the SourceResolver approach will only let you cache the binary
> representation of an image, quite often it still has to be decoded each
> time it is used, which costs CPU power. Right?


Right.
Next try: provide a layered set of interfaces:
- SourceResolver: resolves URI to XMLReader+InputSource. Used for, well,
  source resolving for graphics, color profiles, fonts, font metrics,
  perhaps config files, whatever.
  Can be hooked into for URI mapping, custom protocols, on the fly
  generation, simple caching.
  Default implementation similar to the common
  javax.xml.transform.URIResolver, with a few twists (must peek into the
  stream to check for XML unless forced by content type).
- ImageResolver: resolves a URI+some properties (which?) into a FOPImage.
  Default implementation uses SourceResolver to get a stream and whether
  it is an XML stream, detects image type (unless forced by content type).
  Can be hooked into for advanced image caching (still call the default
  implementation for doing the image creation).
- FontResolver: Same for fonts.
- FontMetricsResolver: for completeness, or fold this into the FontResolver.
- ColorProfileResolver: Just to be complete, or use SourceResolver
  directly.
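
The first two layers could be sketched roughly as follows. All types are hypothetical: the interface names follow the list above, but the signatures, the DecodedImage stand-in for FOPImage, and the magic-byte sniffing are assumptions made for illustration.

```java
import java.io.IOException;
import java.io.InputStream;
import java.util.HashMap;
import java.util.Map;

/** Lower layer (assumed signature): maps a URI to a byte stream. */
interface SourceResolver {
    InputStream open(String uri) throws IOException;
}

/** Stand-in for a decoded image; a real FOPImage would carry pixel data. */
final class DecodedImage {
    final String uri;
    final String type;
    DecodedImage(String uri, String type) { this.uri = uri; this.type = type; }
}

/** Upper layer (assumed signature): maps a URI to a decoded image. */
interface ImageResolver {
    DecodedImage resolve(String uri) throws IOException;
}

/** Default upper layer: gets bytes from the lower layer and detects the type. */
class DefaultImageResolver implements ImageResolver {
    private final SourceResolver sources;
    DefaultImageResolver(SourceResolver sources) { this.sources = sources; }

    @Override
    public DecodedImage resolve(String uri) throws IOException {
        try (InputStream in = sources.open(uri)) {
            byte[] head = new byte[4];
            int n = 0;
            while (n < head.length) { // peek at the first bytes of the stream
                int r = in.read(head, n, head.length - n);
                if (r < 0) break;
                n += r;
            }
            return new DecodedImage(uri, sniff(head, n));
        }
    }

    static String sniff(byte[] h, int n) {
        if (n >= 4 && (h[0] & 0xFF) == 0x89 && h[1] == 'P' && h[2] == 'N' && h[3] == 'G') return "image/png";
        if (n >= 3 && h[0] == 'G' && h[1] == 'I' && h[2] == 'F') return "image/gif";
        if (n >= 1 && h[0] == '<') return "text/xml";
        return "application/octet-stream";
    }
}

/** Hook at the upper layer: caches decoded images, delegating creation to another resolver. */
class CachingImageResolver implements ImageResolver {
    private final ImageResolver delegate;
    private final Map<String, DecodedImage> cache = new HashMap<>();
    CachingImageResolver(ImageResolver delegate) { this.delegate = delegate; }

    @Override
    public DecodedImage resolve(String uri) throws IOException {
        DecodedImage image = cache.get(uri);
        if (image == null) {
            image = delegate.resolve(uri); // decode once, then reuse
            cache.put(uri, image);
        }
        return image;
    }
}
```

Because the cache sits above the decoding step, it avoids the repeated decoding cost that a cache at the SourceResolver level cannot avoid.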


>> - Fine tuning: A single large image will block a lot of memory during
>>   rendering. A possibility is a fox:cache=no control property. In order
>>   to preserve semantics, a null image is cached for this URL, and an error
>>   is generated in case it is attempted to render the image a second time.
>
> So, I may not be so far off the mark after all.


Revised thoughts: Two control attributes
- tentatively: fox:cache
   + yes (default): keep the FOPImageObject (for this rendering run)
   + no: discard it immediately after rendering.
  Use this to prevent large images which occur only once from taking up
  memory indefinitely.
  Problem: how should this be handled in static content, markers, and
  table headers/footers with table-omit-header-at-break="false"? Perhaps
  discard with the FO rather than discard after rendering?
- tentatively: fox:access
   + once (default): do not access the source if it has already been
     accessed; if there is no cached FOPImage, raise an error
   + use-cached: do not access the source if there is a cached
     FOPImage, else reload
   + on-creation: access the source unconditionally while creating this FO,
     replace the cached image if there is one.
   + on-rendering: access the source each time this FO is rendered.
Don't ask me how this should work together with the resolver stuff above.
Perhaps the fox:access stuff is overengineering, don't take it too seriously.
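
One way to read these tentative semantics is the sketch below. The class, enum, and method names are invented; the image is reduced to a byte array; and on-rendering is left out because the sketch does not model the difference between FO creation and rendering.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;
import java.util.function.Function;

/** Values of the tentative fox:access attribute (on-rendering omitted in this sketch). */
enum Access { ONCE, USE_CACHED, ON_CREATION }

/** Hypothetical per-run cache combining the fox:cache and fox:access ideas. */
class RunImageCache {
    private final Map<String, byte[]> cached = new HashMap<>();
    private final Set<String> accessed = new HashSet<>();
    private final Function<String, byte[]> loader; // stands in for fetching and decoding

    RunImageCache(Function<String, byte[]> loader) { this.loader = loader; }

    /** cache == false models fox:cache="no": the image is not kept after this use. */
    byte[] get(String uri, boolean cache, Access access) {
        byte[] image = cached.get(uri);
        switch (access) {
            case ONCE:
                if (image == null) {
                    if (accessed.contains(uri)) {
                        throw new IllegalStateException("source already accessed, no cached image: " + uri);
                    }
                    image = load(uri);
                }
                break;
            case USE_CACHED:
                if (image == null) {
                    image = load(uri); // nothing cached, so reload
                }
                break;
            case ON_CREATION:
                image = load(uri); // always reload, replacing any cached image
                break;
        }
        if (cache) {
            cached.put(uri, image);
        } else {
            cached.remove(uri);
        }
        return image;
    }

    private byte[] load(String uri) {
        accessed.add(uri);
        return loader.apply(uri);
    }
}
```

With cache=no and access=once, a second use of the same URI fails instead of silently fetching possibly different content, which is the error case described earlier in the thread.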


>> - Dynamic URLs.
>
> I think you misunderstood what I meant by dynamic URLs.

I got it quite right. I should have mentioned I wanted to supply
a mechanism which allows the construction of different URIs in case
someone wants to use images for page numbers.


> Maybe we have to set up a new page in Bertrand's Wiki to create a little
> specification for the image caching. This would also help as a
> discussion base if we have to contact the XSL-FO WG as Oleg suggests.


Neat idea.

J.Pietschmann






Re: [RT] Proprietary extension to fo:external-graphic

2002-11-07 Thread J.Pietschmann
Oleg Tkachenko wrote:

>> - Fine tuning: A single large image will block a lot of memory during
>>   rendering. A possibility is a fox:cache=no control property. In order
>>   to preserve semantics, a null image is cached for this URL, and an error
>>   is generated in case it is attempted to render the image a second time.
>
> I don't quite get it: why should an error be generated? What's wrong
> with reloading an image each time it's referenced?

Because it breaks the (FOP) specified semantics that the image is
always the same in case the source chooses to supply a different
image on each access. But see the other post.


>> - Dynamic URLs. In order to achieve this, we can extend the functions available
>>   in property expressions by concat() and page-number().
>
> This one looks dubious to me. Can we add any new functions to the core
> library?
Can we? Sure.
Is it wise to do so? Oh well, get me an asbestos suit quickly!


> Extension functions in a different namespace, like we are used to in
> XPath, are certainly not allowed in XSL, as FunctionName here is an NCName in
> contrast to a QName in XPath. One more fault in the spec, I think. :(

Oops! I didn't notice this.
Didn't someone on the XML-DEV list recently mention that they are preparing for
a 2.0? Seems they have a lot of homework to do for this!
BTW, changing NCName to QName would probably be considered an incompatible
change, warranting a new release...








Re: [RT] Proprietary extension to fo:external-graphic

2002-11-07 Thread J.Pietschmann
Oleg Tkachenko wrote:

> I think so. But nevertheless that would be a cool feature. Consider this
> real use case: one has an image stored in an application jar file. At
> the moment I think FOP cannot handle such a case,

I didn't try myself, but a jar URI should work. Something like
  jar:file:///foo/bar.jar!/com/experanto/images/logo.gif
Sorry, I'm too lazy to look up details.
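
For reference, Java's jar: URLs separate the archive URL from the entry path with "!/". The sketch below builds a throwaway jar and reads one entry back through such a URL; the class name, entry path, and file content are made up for the example, and nothing here involves FOP itself.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.net.URL;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.jar.JarEntry;
import java.util.jar.JarOutputStream;

class JarUrlDemo {
    /** Builds jar:<archive-url>!/<entry>. */
    static URL entryUrl(Path jarFile, String entry) throws IOException {
        return new URL("jar:" + jarFile.toUri().toURL() + "!/" + entry);
    }

    /** Writes a jar containing a single entry with the given content. */
    static void writeJar(Path jarFile, String entry, byte[] content) throws IOException {
        try (JarOutputStream out = new JarOutputStream(Files.newOutputStream(jarFile))) {
            out.putNextEntry(new JarEntry(entry));
            out.write(content);
            out.closeEntry();
        }
    }

    /** Reads everything behind the URL into a byte array. */
    static byte[] read(URL url) throws IOException {
        try (InputStream in = url.openStream()) {
            ByteArrayOutputStream buffer = new ByteArrayOutputStream();
            byte[] chunk = new byte[4096];
            int n;
            while ((n = in.read(chunk)) != -1) {
                buffer.write(chunk, 0, n);
            }
            return buffer.toByteArray();
        }
    }

    public static void main(String[] args) throws IOException {
        Path jar = Files.createTempFile("images", ".jar");
        jar.toFile().deleteOnExit();
        writeJar(jar, "com/example/images/logo.gif", "GIF89a".getBytes(StandardCharsets.US_ASCII));
        URL url = entryUrl(jar, "com/example/images/logo.gif");
        System.out.println(url + " -> " + read(url).length + " bytes");
    }
}
```

Whether FOP's image loading accepts such a URL is not verified here; the sketch only shows that the JDK itself can open it.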

J.Pietschmann







[RT] Proprietary extension to fo:external-graphic

2002-11-06 Thread Jeremias Maerki
While investigating the multi-threading issues in the maint-branch I
came across the following:

Currently, in the context of the PDF renderer, every FopImage is closed
as soon as it's written to the target file. The next time the same
image/url is used it has to be reloaded. This is not true for the other
renderers, where the images are really being cached. The calls to
FopImage.close() in PDFXObject are effectively disabling the caching
mechanism. On the other hand, this makes URLs that deliver dynamic content work
correctly (only for PDFs), i.e. URLs that can deliver different content over
multiple invocations.

Which brings me to my idea. I don't know if we had that before. Wouldn't
it solve this problem if we defined a proprietary extension for
fo:external-graphic to specify if a given url is not to be cached? The
content-type attribute can obviously not be used for that purpose. How
about this?

<fo:external-graphic src="url(http://localhost/mydynamicimage)"
    xmlns:fop="http://xml.apache.org/fop" fop:disable-caching="true"/>

Default for disable-caching would be false.
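
In code, the effect of such a flag might look like the sketch below. The class is hypothetical and not FOP's actual image handling; the loader function stands in for fetching and decoding, and images are reduced to byte arrays.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;

/** Hypothetical per-run cache; disableCaching mirrors the proposed fop:disable-caching. */
class PerRunImageCache {
    private final Map<String, byte[]> images = new HashMap<>();
    private final Function<String, byte[]> loader; // stands in for fetching and decoding

    PerRunImageCache(Function<String, byte[]> loader) { this.loader = loader; }

    byte[] get(String url, boolean disableCaching) {
        if (disableCaching) {
            return loader.apply(url); // dynamic URL: always hit the source, never store
        }
        return images.computeIfAbsent(url, loader); // load on first use, then reuse
    }

    /** Would be called at the end of a rendering run. */
    void clear() {
        images.clear();
    }
}
```

With the default (false), every renderer would see the same image for the same URL within a run; only URLs explicitly marked as uncached would be fetched again.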

This could also be useful for the redesign, where we have the same
problem: When can and should we cache an image?

Currently, I'm wondering whether I should just delete the FopImage.close()
method, so the behaviour of image handling is the same for all renderers,
but that results in a semantic change for the PDF renderer. To still be
able to serve dynamic images, the above would be necessary.

Any thoughts?

Jeremias Maerki






Re: [RT] Proprietary extension to fo:external-graphic

2002-11-06 Thread Oleg Tkachenko
Jeremias Maerki wrote:


> Currently, in the context of the PDF renderer, every FopImage is closed
> as soon as it's written to the target file. The next time the same
> image/url is used it has to be reloaded. This is not true for the other
> renderers, where the images are really being cached. The calls to
> FopImage.close() in PDFXObject are effectively disabling the caching
> mechanism. But on the other side it enables the correct working of urls
> that deliver dynamic content (only for PDFs), when the same URL can
> deliver different content over multiple invocations.


If we are talking about the scope of one particular formatting invocation, I
don't think anybody would rely on dynamic image generation to place different
images (with the same URI) on different pages. Actually, the spec says
nothing about it, but I believe it's up to the formatter when, in which order,
and how many times to load images, as this follows the general XSL
side-effect-free policy.
It's probably worth checking how other formatters cache images, though.

--
Oleg Tkachenko
eXperanto team
Multiconn Technologies, Israel





Re: [RT] Proprietary extension to fo:external-graphic

2002-11-06 Thread Jeremias Maerki
Thanks for answering. Do you have a pointer to some documentation
describing that side-effect free policy?

So, do I get you right that the close() calls can safely be removed
because the semantic change I described is irrelevant? That would be
nice because it's easy to fix.

On Wed, 06 Nov 2002 12:33:11 +0200 Oleg Tkachenko wrote:
> If we are talking about the scope of one particular formatting invocation, I
> don't think anybody would rely on dynamic image generation to place different
> images (with the same URI) on different pages. Actually, the spec says
> nothing about it, but I believe it's up to the formatter when, in which order,
> and how many times to load images, as this follows the general XSL
> side-effect-free policy.
> It's probably worth checking how other formatters cache images, though.

Jeremias Maerki






Re: [RT] Proprietary extension to fo:external-graphic

2002-11-06 Thread Bertrand Delacretaz
On Wednesday 06 November 2002 09:55, Jeremias Maerki wrote:
> . . .
> <fo:external-graphic src="url(http://localhost/mydynamicimage)"
>     xmlns:fop="http://xml.apache.org/fop" fop:disable-caching="true"/>
> . . .

There are some fox: extensions already IIRC (never used them though, but 
http://xml.apache.org/fop/extensions.html says so), so I think new ones 
should be created in a consistent way.

I'm ok with such extensions (we use similar things in jfor), just would like 
to make sure that there is only one extension mechanism.

-Bertrand





Re: [RT] Proprietary extension to fo:external-graphic

2002-11-06 Thread Oleg Tkachenko
Jeremias Maerki wrote:


> Thanks for answering. Do you have a pointer to some documentation
> describing that side-effect free policy?


Unfortunately not. The XSL requirements and the XSL proposal state the
intention for XSL to be a side-effect-free language, like its dad DSSSL, but as
side-effect-free XSLT is now a separate recommendation, XSL-FO is staying in
its shadow.

--
Oleg Tkachenko
eXperanto team
Multiconn Technologies, Israel





Re: [RT] Proprietary extension to fo:external-graphic

2002-11-06 Thread Keiron Liddle
On Wed, 2002-11-06 at 12:01, Bertrand Delacretaz wrote:
> On Wednesday 06 November 2002 09:55, Jeremias Maerki wrote:
>> . . .
>> <fo:external-graphic src="url(http://localhost/mydynamicimage)"
>>     xmlns:fop="http://xml.apache.org/fop" fop:disable-caching="true"/>
>> . . .
>
> There are some fox: extensions already IIRC (never used them though, but
> http://xml.apache.org/fop/extensions.html says so), so I think new ones
> should be created in a consistent way.

That particular extension fox:... is for the PDF bookmarks (i.e. for
http://xml.apache.org/fop/extensions.pdf the PDF viewer should show the
bookmarks).

Currently the extension mechanism is only set up for handling XML
elements and not for attributes. Anyone can add an extension, and I don't
really consider it an extension to FO unless you are doing some sort
of FO tree/layout/area tree manipulation.

> I'm ok with such extensions (we use similar things in jfor), just would like
> to make sure that there is only one extension mechanism.

What sort of jfor extensions are there, what do they do?

> -Bertrand







Re: [RT] Proprietary extension to fo:external-graphic

2002-11-06 Thread Bertrand Delacretaz
On Wednesday 06 November 2002 12:31, Keiron Liddle wrote:
> . . .
> What sort of jfor extensions are there, what do they do?

We have jfor:style to define RTF styles (similar to CSS classes in concept) 
on the generated RTF elements. A concept that does not exist in XSL-FO as 
it doesn't make sense when generating printable documents.

And also jfor:cmd to set options for the jfor processor, currently used for 
special tricks/hacks for keep-with stuff.

-Bertrand





Re: [RT] Proprietary extension to fo:external-graphic

2002-11-06 Thread J.Pietschmann
Oleg Tkachenko wrote:

> Jeremias Maerki wrote:
>
>> Thanks for answering. Do you have a pointer to some documentation
>> describing that side-effect free policy?
>
> Unfortunately not. The XSL requirements and the XSL proposal state the
> intention for XSL to be a side-effect-free language, like its dad DSSSL, but as
> side-effect-free XSLT is now a separate recommendation, XSL-FO is
> staying in its shadow.

Well, XSLT explicitly specifies that document() must always return
the same tree during a transformation run. It does not explicitly
say whether the source is accessed only once or multiple times, and,
quite predictably, Xalan accesses the referenced URL every time it
encounters a document() call (even though it seems to discard the
read tree in favor of the cached tree), while Saxon and libxslt
access the URL exactly once.
I think it would be prudent to follow the same for
fo:external-graphic and fo:color-profile, on the ground that
FOs may be rendered out of order and, even more important, it is
not clear whether multiple renderings of an external graphic in a
static content, table header/footer or a marker should result in
multiple access to the source. Unfortunately, the spec doesn't even
mention this issue.
Mind you, there was already a complaint where someone used a
fo:external-graphic in a page footer for images representing page
numbers and of course didn't get what he expected.
In XSLT, for document(), it can be argued that it should be easy to
arrange for an additional dummy parameter in order to have distinct
URLs, for example
  <xsl:value-of select="document('http://ts.com/get-time.cgi?start')"/>
  <xsl:call-template name="template-to-profile"/>
  <xsl:value-of select="document('http://ts.com/get-time.cgi?end')"/>
Of course, nothing prevents the XSLT processor from fetching both
values first and then going on with evaluating the template in between,
therefore this technique is risky at least.
A similar approach obviously won't work for fetching graphics representing
page numbers.

Conclusions and ideas so far:
- FOP should cache external graphics during a rendering and by default
  clear the cache afterwards.
- Caching images across renderings definitely is an issue too (think of
  the company logo in each page header in every document), but FOP shouldn't
  solve this. I imagine a SourceResolver interface which gets an URL and
  optional content type and returns a XMLReader/InputSource pair.
  In case of binary image formats the default implementation returns a null
  parser.
  People who want to cache images across renderings can implement their
  own resolver which can do the caching. The Cocoon crowd will certainly
  rejoice (no more memory leaks due to FOP caching, access to Cocoon
  caching and Cocoon internal pipelines and other advantages).
- Fine tuning: A single large image will block a lot of memory during
  rendering. A possibility is a fox:cache=no control property. In order
  to preserve semantics, a null image is cached for this URL, and an error
  is generated in case it is attempted to render the image a second time.
- Dynamic URLs. In order to achieve this, we can extend the functions available
  in property expressions by concat() and page-number(). I believe both would
  be welcome by many users for other purposes too (whether that's a good idea
  is another matter). One of the possible concerns are usually name clashes
  with future XSLFO extensions. Using prefixed identifiers like fox:concat()
  would be a solution, I'm somewhat uneasy with using XML namespace mechanisms
  within values for XML attributes. In fact, I think it's abuse, but I can't
  offer much better ideas either.

Regards
J.Pietschmann

