AW: [RT] Proprietary extension to fo:external-graphic
Considering PDF only, I see prefabricated image XObjects as a very powerful feature. Extracting image XObjects from PDF files and storing them for use by the renderer brings two advantages:
a) it saves as much CPU and memory as possible
b) the user controls image representation/handling in PDFs.

Writing an extract program and programming the renderer is straightforward, and caching is solved for PDF. What remains is to tell FOP how to handle these external-graphics.

Hansuli Anderegg

Sample image XObject to be inserted into PDF file by PDF renderer:

21 0 obj                     === adjust PDF object ID
/Type /XObject
/Subtype /Image
/Name /Im6
/Length 89957                === original PNG file is 56KB
/Width 256                   === in pixels
/Height 256                  === in pixels
/BitsPerComponent 8
/ColorSpace /DeviceRGB
/Filter [ /ASCII85Decode /FlateDecode ]
stream
GarT@BuGTQ2\N

- To unsubscribe, e-mail: [EMAIL PROTECTED] For additional commands, email: [EMAIL PROTECTED]
Re: [RT] Proprietary extension to fo:external-graphic
J.Pietschmann wrote:
> I think it would be prudent to follow the same for fo:external-graphics and fo:color-profile, on the ground that FOs may be rendered out of order and, even more important, it is not clear whether multiple renderings of an external graphic in a static content, table header/footer or a marker should result in multiple access to the source. Unfortunately, the spec doesn't even mention this issue.

By the way, what about raising the issue on xsl-editors? A lot of things are certainly implied, but the XSL spec itself says nothing about this.

> - Caching images across renderings definitely is an issue too (think of the company logo in each page header in every document), but FOP shouldn't solve this. I imagine a SourceResolver interface which gets a URL and optional content type and returns an XMLReader/InputSource pair. In case of binary image formats the default implementation returns a null parser. People who want to cache images across renderings can implement their own resolver which can do the caching. The Cocoon crowd will certainly rejoice (no more memory leaks due to FOP caching, access to Cocoon caching and Cocoon internal pipelines and other advantages).

Good idea, worth adding to the feature request list so that it is not forgotten.

> - Fine tuning: A single large image will block a lot of memory during rendering. A possibility is a fox:cache=no control property. In order to preserve semantics, a null image is cached for this URL, and an error is generated in case it is attempted to render the image a second time.

I don't quite get it: why should an error be generated? What's wrong with reloading an image each time it's referenced?

> - Dynamic URLs. In order to achieve this, we can extend the functions available in property expressions by concat() and page-number().

This one looks dubious to me. Can we add any new functions to the core library? Extension functions in a different namespace, as we are used to in XPath, are certainly not allowed in XSL, because FunctionName here is an NCName, in contrast to a QName in XPath. One more fault in the spec, I think. :(

--
Oleg Tkachenko
eXperanto team
Multiconn Technologies, Israel
Re: [RT] Proprietary extension to fo:external-graphic
On Wed, 06 Nov 2002 23:06:53 +0100 J.Pietschmann wrote:
<snip/>
> Conclusions and ideas so far:
> - FOP should cache external graphics during a rendering and by default clear the cache afterwards.

ok.

> - Caching images across renderings definitely is an issue too (think of the company logo in each page header in every document), but FOP shouldn't solve this. I imagine a SourceResolver interface which gets a URL and optional content type and returns an XMLReader/InputSource pair. In case of binary image formats the default implementation returns a null parser. People who want to cache images across renderings can implement their own resolver which can do the caching. The Cocoon crowd will certainly rejoice (no more memory leaks due to FOP caching, access to Cocoon caching and Cocoon internal pipelines and other advantages).

But the SourceResolver approach will only let you cache the binary representation of an image. Quite often it still has to be decoded each time it is used, which costs CPU power. Right?

> - Fine tuning: A single large image will block a lot of memory during rendering. A possibility is a fox:cache=no control property. In order to preserve semantics, a null image is cached for this URL, and an error is generated in case it is attempted to render the image a second time.

So, I may not be so far off the mark after all.

> - Dynamic URLs. In order to achieve this, we can extend the functions available in property expressions by concat() and page-number(). I believe both would be welcome by many users for other purposes too (whether that's a good idea is another matter). One possible concern is, as usual, name clashes with future XSLFO extensions. Using prefixed identifiers like fox:concat() would be a solution, but I'm somewhat uneasy with using XML namespace mechanisms within values for XML attributes. In fact, I think it's abuse, but I can't offer much better ideas either.

I think you misunderstood what I meant by dynamic URLs. I call a URL dynamic if the same URL can deliver different content with each call. For example, something similar to your example: http://ts.com/get-time.cgi
Each time the URL gets called it returns a different image showing a clock giving the current time.

It's a somewhat problematic discussion. We're talking about image caching, but there are at least two separate kinds:
- Caching of images inside one processing run (which I consider the renderer's duty to a certain degree. Of course, the layout engine has to determine the extents of the image before the renderer goes into action)
- Caching of images over a longer time and between processing runs. I agree that a specialized SourceResolver is a good thing but I still wonder about the decoding work.

I was primarily wondering about the second kind of caching, but the discussion went strongly towards the first kind. Anyway, I'm still somewhat unsure about all this. Maybe we have to set up a new page in Bertrand's Wiki to create a little specification for the image caching. This would also help as a discussion base if we have to contact the XSL:FO WG as Oleg suggests.

Jeremias Maerki
Re: [RT] Proprietary extension to fo:external-graphic
Jeremias Maerki wrote:
> But the SourceResolver approach will only let you cache the binary representation of an image. Quite often it still has to be decoded each time it is used, which costs CPU power. Right?

I think so. But nevertheless that would be a cool feature. Consider a real use case: an image is stored in an application jar file. At the moment I think FOP cannot handle such a case, but with a SourceResolver we could delegate source URI resolution to the user, as URIResolver does, and the user could easily return some kind of stream, e.g.

new InputSource(getClass().getResourceAsStream("/path/to/foo.gif"))

--
Oleg Tkachenko
eXperanto team
Multiconn Technologies, Israel
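The use case above could be sketched roughly as follows. SourceResolver is only the interface proposed in this thread, not an existing FOP API; its single-method shape, the "classpath:" URI scheme and the class name ClasspathSourceResolver are assumptions made purely for illustration.

```java
import java.io.InputStream;
import org.xml.sax.InputSource;

// Hypothetical hook: the thread only proposes such an interface.
interface SourceResolver {
    // Returns a source for the URI, or null to fall back to default handling.
    InputSource resolve(String uri);
}

// Maps "classpath:/..." URIs (a made-up scheme for this sketch) to resources
// that are visible to the class loader of the given anchor class.
class ClasspathSourceResolver implements SourceResolver {
    static final String SCHEME = "classpath:";
    private final Class<?> anchor;

    ClasspathSourceResolver(Class<?> anchor) {
        this.anchor = anchor;
    }

    @Override
    public InputSource resolve(String uri) {
        if (uri == null || !uri.startsWith(SCHEME)) {
            return null; // not ours: let the default resolver handle it
        }
        InputStream in = anchor.getResourceAsStream(uri.substring(SCHEME.length()));
        if (in == null) {
            return null; // resource is not on the class path
        }
        InputSource source = new InputSource(in);
        source.setSystemId(uri); // keep the original URI for error messages
        return source;
    }
}
```

Returning null for anything the resolver does not recognise mirrors the javax.xml.transform.URIResolver convention, so several resolvers could be chained.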
Re: [RT] Proprietary extension to fo:external-graphic
Jeremias Maerki wrote:
> > - Caching images across renderings
> But the SourceResolver approach will only let you cache the binary representation of an image. Quite often it still has to be decoded each time it is used, which costs CPU power. Right?

Right. Next try: provide a layered set of interfaces:
- SourceResolver: resolves URI to XMLReader+InputSource. Used for, well, source resolving for graphics, color profiles, fonts, font metrics, perhaps config files, whatever. Can be hooked into for URI mapping, custom protocols, on the fly generation, simple caching. Default implementation similar to the common javax.xml.transform.URIResolver, with a few twists (must peek into the stream to check for XML unless forced by content type).
- ImageResolver: resolves a URI+some properties (which?) into a FOPImage. Default implementation uses SourceResolver to get a stream and whether it is an XML stream, detects image type (unless forced by content type). Can be hooked into for advanced image caching (still call the default implementation for doing the image creation).
- FontResolver: Same for fonts.
- FontMetricsResolver: for completeness, or fold this into the FontResolver.
- ColorProfileResolver: Just to be complete, or use SourceResolver directly.

> > - Fine tuning: A single large image will block a lot of memory during rendering. A possibility is a fox:cache=no control property. In order to preserve semantics, a null image is cached for this URL, and an error is generated in case it is attempted to render the image a second time.
> So, I may not be so far off the mark after all.

Revised thoughts: Two control attributes
- tentatively: fox:cache
  + yes (default): keep the FOPImageObject (for this rendering run)
  + no: discard it immediately after rendering. Use this to prevent large images which occur only once from taking up memory indefinitely.
  Problem: how should this be handled in static content, markers, table headers/footers with omit-header-at-break=false? Perhaps discard with FO rather than discard after rendering?
- tentatively: fox:access
  + once (default): do not access the source if it has already been accessed; if there is no cached FOPImage, raise an error
  + use-cached: do not access the source if there is a cached FOPImage, else reload
  + on-creation: access source while creating this FO unconditionally, replace cached image if there is one.
  + on-rendering: access source each time this FO is rendered.
Don't ask me how this should work together with the resolver stuff above. Perhaps the fox:access stuff is overengineering, don't take it too seriously.

> > - Dynamic URLs.
> I think you misunderstood what I meant by dynamic URLs.

I got it quite right. I should have mentioned I wanted to supply a mechanism which allows the construction of different URIs in case someone wants to use images for page numbers.

> Maybe we have to set up a new page in Bertrand's Wiki to create a little specification for the image caching. This would also help as a discussion base if we have to contact the XSL:FO WG as Oleg suggests.

Neat idea.

J.Pietschmann
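The layering proposed above might look something like the following sketch. None of these types exist in FOP; SourceResolver and ImageResolver take their names from the proposal, while DecodedImage, DefaultImageResolver, CachingImageResolver and all method signatures are assumptions. Counting bytes stands in for real image decoding. The point is that a cache hooked in at the ImageResolver level holds decoded images, which addresses the concern about repeated decoding.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.util.HashMap;
import java.util.Map;

// Lowest layer (hypothetical): URI -> raw stream.
interface SourceResolver {
    InputStream resolve(String uri);
}

// Next layer (hypothetical): URI -> decoded image.
interface ImageResolver {
    DecodedImage resolve(String uri);
}

// Stand-in for a FOPImage; only records what was "decoded".
final class DecodedImage {
    final String uri;
    final int byteCount;

    DecodedImage(String uri, int byteCount) {
        this.uri = uri;
        this.byteCount = byteCount;
    }
}

// Default implementation: fetch via the SourceResolver, then decode.
class DefaultImageResolver implements ImageResolver {
    private final SourceResolver sources;
    int decodeCount = 0; // exposed so the effect of caching can be observed

    DefaultImageResolver(SourceResolver sources) {
        this.sources = sources;
    }

    @Override
    public DecodedImage resolve(String uri) {
        try (InputStream in = sources.resolve(uri)) {
            int size = 0;
            while (in.read() != -1) {
                size++; // placeholder for the expensive decoding step
            }
            decodeCount++;
            return new DecodedImage(uri, size);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}

// User-supplied hook: caches decoded images and delegates creation
// to the default implementation.
class CachingImageResolver implements ImageResolver {
    private final ImageResolver delegate;
    private final Map<String, DecodedImage> cache = new HashMap<>();

    CachingImageResolver(ImageResolver delegate) {
        this.delegate = delegate;
    }

    @Override
    public DecodedImage resolve(String uri) {
        return cache.computeIfAbsent(uri, delegate::resolve);
    }
}
```

An application that wants cross-run caching would keep one CachingImageResolver instance alive between runs; without it, each run decodes its images afresh.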
Re: [RT] Proprietary extension to fo:external-graphic
Oleg Tkachenko wrote:
> > - Fine tuning: A single large image will block a lot of memory during rendering. A possibility is a fox:cache=no control property. In order to preserve semantics, a null image is cached for this URL, and an error is generated in case it is attempted to render the image a second time.
> I don't quite get it: why should an error be generated? What's wrong with reloading an image each time it's referenced?

Because, in case the source chooses to supply a different image on each access, reloading breaks the (FOP) specified semantics that the image is always the same. But see the other post.

> > - Dynamic URLs. In order to achieve this, we can extend the functions available in property expressions by concat() and page-number().
> This one looks dubious to me. Can we add any new functions to the core library?

Can we? Sure. Is it wise to do so? Oh well, get me an asbestos suit quickly!

> Extension functions in a different namespace, as we are used to in XPath, are certainly not allowed in XSL, because FunctionName here is an NCName, in contrast to a QName in XPath. One more fault in the spec, I think. :(

Oops! I didn't notice this. Didn't someone on the XML-DEV list recently mention they are preparing for a 2.0? Seems they have a lot of homework to do for this! BTW, changing NCName -> QName is probably considered an incompatible change, warranting a new release...
Re: [RT] Proprietary extension to fo:external-graphic
Oleg Tkachenko wrote:
> I think so. But nevertheless that would be a cool feature. Consider a real use case: an image is stored in an application jar file. At the moment I think FOP cannot handle such a case,

I didn't try myself, but a jar URI should work. Something like

jar:file:///foo/bar.jar!/com/experanto/images/logo.gif

Sorry, I'm too lazy to look up details.

J.Pietschmann
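For reference, the JDK's jar URL handler expects the form jar:<archive-url>!/<entry-path>. The following sketch writes a small throwaway jar and reads an entry back through such a URL. The class JarUrlDemo, its helper methods and the entry name are made up for this example; whether FOP passes such URLs through unchanged was not verified in this thread.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.net.URL;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.jar.JarEntry;
import java.util.jar.JarOutputStream;

class JarUrlDemo {

    // Builds "jar:<archive-url>!/<entry>", the syntax the JDK jar handler expects.
    static String jarUrl(Path jarFile, String entry) {
        return "jar:" + jarFile.toUri() + "!/" + entry;
    }

    // Writes a temporary jar containing a single entry (demo helper).
    static Path writeDemoJar(String entry, byte[] content) {
        try {
            Path jar = Files.createTempFile("demo", ".jar");
            jar.toFile().deleteOnExit();
            try (JarOutputStream out = new JarOutputStream(Files.newOutputStream(jar))) {
                out.putNextEntry(new JarEntry(entry));
                out.write(content);
                out.closeEntry();
            }
            return jar;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Reads all bytes behind a jar: URL using the standard URL machinery.
    @SuppressWarnings("deprecation") // new URL(String) is deprecated in recent JDKs
    static byte[] read(String jarUrl) {
        try (InputStream in = new URL(jarUrl).openStream()) {
            ByteArrayOutputStream buffer = new ByteArrayOutputStream();
            byte[] chunk = new byte[4096];
            for (int n; (n = in.read(chunk)) != -1; ) {
                buffer.write(chunk, 0, n);
            }
            return buffer.toByteArray();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        String entry = "com/example/images/logo.gif";
        Path jar = writeDemoJar(entry, new byte[] {'G', 'I', 'F'});
        String url = jarUrl(jar, entry);
        System.out.println(url + " -> " + read(url).length + " bytes");
    }
}
```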
[RT] Proprietary extension to fo:external-graphic
While investigating the multi-threading issues in the maint-branch I came across the following: Currently, in the context of the PDF renderer, every FopImage is closed as soon as it's written to the target file. The next time the same image/URL is used it has to be reloaded. This is not true for the other renderers, where the images really are cached. The calls to FopImage.close() in PDFXObject effectively disable the caching mechanism. On the other hand, this makes URLs that deliver dynamic content work correctly (only for PDFs), where the same URL can deliver different content over multiple invocations.

Which brings me to my idea. I don't know if we had that before. Wouldn't it solve this problem if we defined a proprietary extension for fo:external-graphic to specify that a given URL is not to be cached? The content-type attribute can obviously not be used for that purpose. How about this?

<fo:external-graphic src="url(http://localhost/mydynamicimage)" xmlns:fop="http://xml.apache.org/fop" fop:disable-caching="true"/>

Default for disable-caching would be false.

This could also be useful for the redesign, where we have the same problem: When can and should we cache an image?

Currently, I'm thinking about simply deleting the FopImage.close() method, so the behaviour of image handling is the same for all renderers, but that results in a semantic change for the PDF renderer. To still be able to serve dynamic images the above would be necessary.

Any thoughts?

Jeremias Maerki
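A minimal sketch of how a renderer-independent cache could honour such a flag. This is not FOP code: the class RunImageCache, its loader callback and the byte[] image representation are assumptions; only the disable-caching semantics (default false) come from the proposal above.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;

// Hypothetical per-run image cache shared by all renderers.
class RunImageCache {
    private final Map<String, byte[]> cache = new HashMap<>();
    private final Function<String, byte[]> loader; // fetches the image for a URL

    RunImageCache(Function<String, byte[]> loader) {
        this.loader = loader;
    }

    // disableCaching mirrors the proposed fop:disable-caching attribute.
    byte[] get(String url, boolean disableCaching) {
        if (disableCaching) {
            return loader.apply(url); // dynamic URL: always hit the source
        }
        return cache.computeIfAbsent(url, loader); // load once, then reuse
    }

    // Called at the end of a rendering run.
    void clear() {
        cache.clear();
    }
}
```

With this shape the PDF renderer would no longer need to close each image after writing it; only references marked as dynamic would be reloaded.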
Re: [RT] Proprietary extension to fo:external-graphic
Jeremias Maerki wrote:
> Currently, in the context of the PDF renderer, every FopImage is closed as soon as it's written to the target file. The next time the same image/URL is used it has to be reloaded. This is not true for the other renderers, where the images really are cached. The calls to FopImage.close() in PDFXObject effectively disable the caching mechanism. On the other hand, this makes URLs that deliver dynamic content work correctly (only for PDFs), where the same URL can deliver different content over multiple invocations.

If we are talking about the scope of one particular formatting invocation, I don't think anybody would rely on dynamic image generation to place different images (with the same URI) on different pages. Actually, the spec says nothing about it, but I believe it's up to the formatter when, in which order and how many times to load images, as this follows the general XSL side-effect-free policy. It's probably worth checking how other formatters cache images though.

--
Oleg Tkachenko
eXperanto team
Multiconn Technologies, Israel
Re: [RT] Proprietary extension to fo:external-graphic
Thanks for answering. Do you have a pointer to some documentation describing that side-effect-free policy?

So, do I get you right that the close() calls can safely be removed because the semantic change I described is irrelevant? That would be nice because it's easy to fix.

On Wed, 06 Nov 2002 12:33:11 +0200 Oleg Tkachenko wrote:
> If we are talking about the scope of one particular formatting invocation, I don't think anybody would rely on dynamic image generation to place different images (with the same URI) on different pages. Actually, the spec says nothing about it, but I believe it's up to the formatter when, in which order and how many times to load images, as this follows the general XSL side-effect-free policy. It's probably worth checking how other formatters cache images though.

Jeremias Maerki
Re: [RT] Proprietary extension to fo:external-graphic
On Wednesday 06 November 2002 09:55, Jeremias Maerki wrote:
> . . .
> <fo:external-graphic src="url(http://localhost/mydynamicimage)" xmlns:fop="http://xml.apache.org/fop" fop:disable-caching="true"/>
> . . .

There are some fox: extensions already IIRC (never used them though, but http://xml.apache.org/fop/extensions.html says so), so I think new ones should be created in a consistent way.

I'm ok with such extensions (we use similar things in jfor), just would like to make sure that there is only one extension mechanism.

-Bertrand
Re: [RT] Proprietary extension to fo:external-graphic
Jeremias Maerki wrote:
> Thanks for answering. Do you have a pointer to some documentation describing that side-effect-free policy?

Unfortunately not. The XSL requirements and the XSL proposal state the intention for XSL to be a side-effect-free language, like its dad DSSSL, but as the side-effect-free XSLT is now a separate recommendation, XSL-FO is staying in its shadow.

--
Oleg Tkachenko
eXperanto team
Multiconn Technologies, Israel
Re: [RT] Proprietary extension to fo:external-graphic
On Wed, 2002-11-06 at 12:01, Bertrand Delacretaz wrote:
> On Wednesday 06 November 2002 09:55, Jeremias Maerki wrote:
> > . . .
> > <fo:external-graphic src="url(http://localhost/mydynamicimage)" xmlns:fop="http://xml.apache.org/fop" fop:disable-caching="true"/>
> > . . .
>
> There are some fox: extensions already IIRC (never used them though, but http://xml.apache.org/fop/extensions.html says so), so I think new ones should be created in a consistent way.

That particular extension fox:... is for the PDF bookmarks (i.e. in http://xml.apache.org/fop/extensions.pdf the PDF viewer should show the bookmarks).

Currently the extension mechanism is only set up for handling XML elements and not for attributes. Anyone can add an extension, and I don't really consider it an extension to FO unless you are doing some sort of FO tree/layout/area tree manipulation.

> I'm ok with such extensions (we use similar things in jfor), just would like to make sure that there is only one extension mechanism.

What sort of jfor extensions are there, what do they do?

> -Bertrand
Re: [RT] Proprietary extension to fo:external-graphic
On Wednesday 06 November 2002 12:31, Keiron Liddle wrote:
> . . .
> What sort of jfor extensions are there, what do they do?

We have jfor:style to define RTF styles (similar to CSS classes in concept) on the generated RTF elements. A concept that does not exist in XSL-FO as it doesn't make sense when generating printable documents.

And also jfor:cmd to set options for the jfor processor, currently used for special tricks/hacks for keep-with stuff.

-Bertrand
Re: [RT] Proprietary extension to fo:external-graphic
Oleg Tkachenko wrote:
> Jeremias Maerki wrote:
> > Thanks for answering. Do you have a pointer to some documentation describing that side-effect-free policy?
> Unfortunately not. The XSL requirements and the XSL proposal state the intention for XSL to be a side-effect-free language, like its dad DSSSL, but as the side-effect-free XSLT is now a separate recommendation, XSL-FO is staying in its shadow.

Well, XSLT explicitly specifies that document() must always return the same tree during a transformation run. It does not explicitly say whether the source is only accessed once or multiple times, and, quite predictably, Xalan accesses the referenced URL every time it encounters a document() call (even though it seems to discard the read tree in favor of the cached tree), while Saxon and libxslt access the URL exactly once.

I think it would be prudent to follow the same for fo:external-graphics and fo:color-profile, on the ground that FOs may be rendered out of order and, even more important, it is not clear whether multiple renderings of an external graphic in a static content, table header/footer or a marker should result in multiple access to the source. Unfortunately, the spec doesn't even mention this issue.

Mind you, there was already a complaint where someone used a fo:external-graphic in a page footer for images representing page numbers and of course didn't get what he expected.

In XSLT, for document(), it can be argued that it should be easy to arrange for an additional dummy parameter in order to have distinct URLs, for example

<xsl:value-of select="document('http://ts.com/get-time.cgi?start')"/>
<xsl:call-template name="template-to-profile"/>
<xsl:value-of select="document('http://ts.com/get-time.cgi?end')"/>

Of course, nothing prevents the XSLT processor from fetching both values first and then going on with evaluating the template in between, therefore this technique is risky at least. A similar approach obviously won't work for fetching graphics representing page numbers.

Conclusions and ideas so far:
- FOP should cache external graphics during a rendering and by default clear the cache afterwards.
- Caching images across renderings definitely is an issue too (think of the company logo in each page header in every document), but FOP shouldn't solve this. I imagine a SourceResolver interface which gets a URL and optional content type and returns an XMLReader/InputSource pair. In case of binary image formats the default implementation returns a null parser. People who want to cache images across renderings can implement their own resolver which can do the caching. The Cocoon crowd will certainly rejoice (no more memory leaks due to FOP caching, access to Cocoon caching and Cocoon internal pipelines and other advantages).
- Fine tuning: A single large image will block a lot of memory during rendering. A possibility is a fox:cache=no control property. In order to preserve semantics, a null image is cached for this URL, and an error is generated in case it is attempted to render the image a second time.
- Dynamic URLs. In order to achieve this, we can extend the functions available in property expressions by concat() and page-number(). I believe both would be welcome by many users for other purposes too (whether that's a good idea is another matter). One possible concern is, as usual, name clashes with future XSLFO extensions. Using prefixed identifiers like fox:concat() would be a solution, but I'm somewhat uneasy with using XML namespace mechanisms within values for XML attributes. In fact, I think it's abuse, but I can't offer much better ideas either.

Regards
J.Pietschmann
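The "null image" idea from the fine-tuning item can be sketched like this. Everything here is hypothetical: the class OnceOnlyImageCache, the loader callback and the byte[] image type are assumptions. The only behaviour taken from the text is that an image rendered with fox:cache=no leaves a marker behind, and a second rendering attempt for the same URL raises an error instead of silently reloading possibly different content.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;

// Hypothetical per-run cache implementing the proposed fox:cache semantics.
class OnceOnlyImageCache {
    // Marker playing the role of the "null image" for discarded entries.
    private static final Object DISCARDED = new Object();

    private final Map<String, Object> entries = new HashMap<>();
    private final Function<String, byte[]> loader;

    OnceOnlyImageCache(Function<String, byte[]> loader) {
        this.loader = loader;
    }

    // cache == false corresponds to fox:cache="no".
    byte[] render(String url, boolean cache) {
        Object entry = entries.get(url);
        if (entry == DISCARDED) {
            // Reloading could yield a different image, so refuse instead.
            throw new IllegalStateException("image was not cached and is used again: " + url);
        }
        if (entry != null) {
            return (byte[]) entry; // same image as on first access
        }
        byte[] image = loader.apply(url);
        entries.put(url, cache ? image : DISCARDED); // keep it, or only remember the access
        return image;
    }
}
```

This keeps memory free for large one-off images while preserving the guarantee that a URL never yields two different images within one rendering run.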