Re: Thoughts about image handling

Jeremias Maerki Mon, 19 Jun 2006 08:43:40 -0700

On 19.06.2006 16:52:25 Max Berger wrote:
> Dear Fop developers,
> 
> after a while and some thinking, here is my concept for extensible  
> image handlers for fop, or even better for xmlgraphics. If desired, I  
> can implement this concept for xmlgraphics / fop with support for  
> imageio, jimi and batik.
> 
> Image handling is a three-step process. Step one is detecting the  
> file format, step two is loading the image, step three is outputting  
> the image into whatever the output renderer is.


Actually, you left out the pre-loading of the image size. That's
important if you want to delay image loading until the rendering stage 
(or avoid it altogether). Note that loading an image is not always
necessary. Sometimes only references to the images are put in the
output format. Currently, this only applies to RTF but could also be
used in PostScript and maybe AFP output. Special languages such as PPML
even go so far as to make it their prime purpose not to handle all the
image data but to have it provided on the target platform.
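
To illustrate the size pre-loading: a minimal sketch, assuming the
javax.imageio API is available. The class name ImageSizeProbe is
hypothetical and not part of FOP. For common formats the reader only
needs the header to answer getWidth/getHeight, so the pixel data stays
undecoded.

```java
import java.awt.Dimension;
import java.awt.image.BufferedImage;
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.util.Iterator;
import javax.imageio.ImageIO;
import javax.imageio.ImageReader;
import javax.imageio.stream.ImageInputStream;

// Hypothetical helper, not part of FOP: asks an ImageIO reader for the
// image dimensions without calling read(), i.e. without decoding pixels.
final class ImageSizeProbe {

    /** Returns the pixel size of the first image, or null if no reader matches. */
    static Dimension probe(InputStream in) throws IOException {
        try (ImageInputStream iis = ImageIO.createImageInputStream(in)) {
            if (iis == null) {
                return null; // no stream implementation for this input
            }
            Iterator<ImageReader> readers = ImageIO.getImageReaders(iis);
            if (!readers.hasNext()) {
                return null; // format not recognized by any installed reader
            }
            ImageReader reader = readers.next();
            try {
                // seekForwardOnly = true, ignoreMetadata = true
                reader.setInput(iis, true, true);
                return new Dimension(reader.getWidth(0), reader.getHeight(0));
            } finally {
                reader.dispose();
            }
        }
    }

    public static void main(String[] args) throws IOException {
        // Build a small PNG in memory so the example needs no external file.
        BufferedImage img = new BufferedImage(40, 30, BufferedImage.TYPE_INT_RGB);
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        ImageIO.write(img, "png", out);
        Dimension d = probe(new ByteArrayInputStream(out.toByteArray()));
        System.out.println(d.width + "x" + d.height);
    }
}
```

With the size known up front, layout can proceed and the actual load
can be deferred to the renderer, or skipped when only a reference is
written.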

> 
> Step 1: Detecting the file format.
> 
> In most handlers this currently happens by trying to load the files,  
> which is very inefficient. Instead, detectors like jmimemagic should  
> be used. These should return a mime-type or null if the type can not  
> be detected.
> 
> To speed up the process, a mime-type may be guessed from the file  
> extension, and added as "hint"
> 
> Example interface:
> 
> public interface MimeTypeDetector {
>    String detectMimeType(URL file, String probableType) throws  
> IOException;
> }

Ok, so far I don't see any advantage over the current code.

> Step 2: Loading the image.
> 
> The image can then be loaded by any image handler that supports this  
> type.
> 
> Example:
> 
> public interface ImageHandler {
>    void setImage(URL file) throws IOException;
>    // ...
> }

In your scenario, when would the image actually be loaded?

> Step 3:  Outputting the image.
> 
> Generally there are three types of images. Vector images (SVG, MML),  
> bitmap images (GIF, BMP, JPEG), and uninterpreted images (EPS)
> 
> - Vector images must supply a paint(Graphics g) function
> - Bitmap images must supply a method to get the image contents as a  
> Raster (similar to java.awt.RenderedImage)
> - Bitmap images may supply a method to get the image contents in LZW,  
> ZLIB, DCT, format, such as in JPEG or TIFF compression (used in PDF)
> - Uninterpreted images just provide a method to get the contents in  
> its original format.

Going towards Raster or RenderedImage for the in-memory representation
of the image is certainly a very welcome step.

What I'm missing a little is that certain images will be converted
before they are processed by the renderer. For example, Barcode4J
converts its barcodes to SVG, EPS or Java2D graphics depending on the
output format in use. Generally, each renderer will have different
preferences for how an image is processed. While PDF can embed TIFF
CCITT4 files directly, they have to be decoded for PCL. The ideal image
subsystem will also cache a preconverted image so the
conversion/decoding can be avoided next time the image is used.

> Reflection may be used in the renderer to find out the image type,  
> rather than checking for the type. Example:
> 
> if (image instanceof VectorImage) {
>    ((VectorImage) image).paint(graphics);
> } else if (image instanceof DCTEncodedImage) {
>    addResource(((DCTEncodedImage) image).getDCTData());
> // ...

I don't think that'll work considering the above. I rather think the
Renderer will have to tell the image subsystem the preferred flavor of
the image. It will then receive the image in the right form if that is
possible.
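
A rough sketch of what I mean, with the caching from above included.
All names here (Flavor, ImageData, ImageManager) are hypothetical and
only serve the illustration; the payload is a plain string standing in
for real image data.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;

// Hypothetical sketch: the renderer states its preferred flavors and the
// image subsystem converts (and caches) as needed.
final class FlavorDemo {

    /** Hypothetical representations a renderer might ask for. */
    enum Flavor { RAW_CCITT4, DECODED_BITMAP }

    /** Hypothetical image holder: current flavor plus a stand-in payload. */
    static final class ImageData {
        final Flavor flavor;
        final String payload;

        ImageData(Flavor flavor, String payload) {
            this.flavor = flavor;
            this.payload = payload;
        }
    }

    static final class ImageManager {
        private final Map<String, Function<ImageData, ImageData>> converters = new HashMap<>();
        private final Map<String, ImageData> cache = new HashMap<>();
        int conversions = 0; // counts real conversions, for demonstration only

        void register(Flavor from, Flavor to, Function<ImageData, ImageData> converter) {
            converters.put(from + "->" + to, converter);
        }

        /** Returns the image in the first obtainable preferred flavor, or null. */
        ImageData get(String uri, ImageData source, Flavor... preferred) {
            for (Flavor target : preferred) {
                if (source.flavor == target) {
                    return source; // already in the right form
                }
                String key = uri + "#" + target;
                ImageData cached = cache.get(key);
                if (cached != null) {
                    return cached; // converted on an earlier request
                }
                Function<ImageData, ImageData> c = converters.get(source.flavor + "->" + target);
                if (c != null) {
                    ImageData converted = c.apply(source);
                    conversions++;
                    cache.put(key, converted);
                    return converted;
                }
            }
            return null; // none of the preferred flavors can be produced
        }
    }

    public static void main(String[] args) {
        ImageManager manager = new ImageManager();
        manager.register(Flavor.RAW_CCITT4, Flavor.DECODED_BITMAP,
                img -> new ImageData(Flavor.DECODED_BITMAP, "decoded(" + img.payload + ")"));
        ImageData tiff = new ImageData(Flavor.RAW_CCITT4, "fax.tif");

        // A PDF-like renderer can embed the raw data directly.
        System.out.println(manager.get("fax.tif", tiff, Flavor.RAW_CCITT4).payload);
        // A PCL-like renderer needs decoded data; the second call hits the cache.
        manager.get("fax.tif", tiff, Flavor.DECODED_BITMAP);
        manager.get("fax.tif", tiff, Flavor.DECODED_BITMAP);
        System.out.println("conversions: " + manager.conversions);
    }
}
```

The point is that the renderer never inspects the image class itself.
It only states what it can consume, in order of preference.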

> 
> To support extensibility, a registration mechanism is provided. Here  
> is the basic idea:
> 
> Java provides standard mechanisms to find all resources with a given  
> name in all classpath items. This allows to find all META-INF/ 
> MANIFEST.MF files given in all JAR files in the classpath (1). These  
> files can be parsed using standard Manifest functionality.
> 
> The files contain some attributes that describe classes used. For  
> image handlers, this could be a classname and the supported image  
> type. It may contain additional attributes, such as supported  
> subtypes (e.g. LZW for TIFF). Ideally the exact specification of  
> these attributes would be coordinated between fop and foray to  
> support reuse.
> 
> This information can be parsed once and stored.
> 
> This mechanism requires the user to change only the classpath, and  
> nothing else.

Ok, something like that sounds pretty good. It remains to be seen
whether the config needs to be in a file or rather in a factory class
like we've done before (example: AbstractRendererMaker). The only thing
left might be the question of how to handle priorities if two
implementations support the same kind of image.
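
One possible answer to the priority question, purely as a sketch: each
implementation registers with a numeric priority and the highest one
wins. HandlerRegistry and the handler names are hypothetical, and the
tie-breaking rule (first registered wins) is an assumption.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch: pick the highest-priority handler for a MIME type.
final class HandlerRegistry {

    private static final class Entry {
        final String mimeType;
        final String handlerName;
        final int priority;

        Entry(String mimeType, String handlerName, int priority) {
            this.mimeType = mimeType;
            this.handlerName = handlerName;
            this.priority = priority;
        }
    }

    private final List<Entry> entries = new ArrayList<>();

    void register(String mimeType, String handlerName, int priority) {
        entries.add(new Entry(mimeType, handlerName, priority));
    }

    /** Returns the best handler name for the type, or null if none is registered. */
    String select(String mimeType) {
        Entry best = null;
        for (Entry e : entries) {
            // Strict '>' keeps the earlier registration on equal priority.
            if (e.mimeType.equals(mimeType) && (best == null || e.priority > best.priority)) {
                best = e;
            }
        }
        return best == null ? null : best.handlerName;
    }

    public static void main(String[] args) {
        HandlerRegistry registry = new HandlerRegistry();
        registry.register("image/png", "JimiHandler", 10);
        registry.register("image/png", "ImageIOHandler", 50);
        System.out.println(registry.select("image/png"));
    }
}
```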

> I have written a short proof-of concept code for the registration,  
> available at
>    http://max.berger.name/tmp/extTestMain.jar
> and
>    http://max.berger.name/tmp/extTestProvider.jar
> 
> (source is included in the jar files).
> 
> To run, try:
>    java -cp extTestMain.jar name.berger.max.test.ext.Main
> or
>   java -cp extTestMain.jar:extTestProvider.jar  
> name.berger.max.test.ext.Main
> 
> (1) Of course, it doesn't have to be MANIFEST.MF.  For resources such  
> as fonts this may well be META-INF/FontDescriptor.xml or something else.

What you describe here is already in use in FOP and Batik. We don't use
the MANIFEST.MF directly but the class name of the provider
class/interface. See [1] and [2]. Let's reuse what we already have.

[1] http://svn.apache.org/viewvc/xmlgraphics/fop/trunk/src/java/META-INF/services/
[2] http://svn.apache.org/viewvc/xmlgraphics/commons/trunk/src/java/org/apache/xmlgraphics/util/Service.java?view=log
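
For those unfamiliar with the mechanism: a file under
META-INF/services/ is named after the provider interface and lists one
implementation class name per line, with '#' starting a comment. The
following is only a sketch of such a lookup, not the actual code of
Service.java; ProviderLookup and the org.example names are
hypothetical.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.Reader;
import java.io.StringReader;
import java.net.URL;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.Enumeration;
import java.util.List;

// Hypothetical sketch of a META-INF/services lookup.
final class ProviderLookup {

    /** Parses one provider file: one class name per line, '#' starts a comment. */
    static List<String> parse(Reader in) throws IOException {
        List<String> names = new ArrayList<>();
        BufferedReader reader = new BufferedReader(in);
        String line;
        while ((line = reader.readLine()) != null) {
            int hash = line.indexOf('#');
            if (hash >= 0) {
                line = line.substring(0, hash);
            }
            line = line.trim();
            if (!line.isEmpty() && !names.contains(line)) {
                names.add(line);
            }
        }
        return names;
    }

    /** Collects provider class names from every matching file on the classpath. */
    static List<String> providerNames(String serviceInterface, ClassLoader loader)
            throws IOException {
        List<String> names = new ArrayList<>();
        Enumeration<URL> urls = loader.getResources("META-INF/services/" + serviceInterface);
        while (urls.hasMoreElements()) {
            URL url = urls.nextElement();
            try (Reader in = new InputStreamReader(url.openStream(), StandardCharsets.UTF_8)) {
                for (String name : parse(in)) {
                    if (!names.contains(name)) {
                        names.add(name);
                    }
                }
            }
        }
        return names;
    }

    public static void main(String[] args) throws IOException {
        String file = "# image handlers\n"
                + "org.example.PngHandler\n"
                + "\n"
                + "org.example.JpegHandler # fallback\n";
        System.out.println(parse(new StringReader(file)));
    }
}
```

Adding a JAR with its own services file to the classpath is then all a
user has to do, which matches what you are after.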

> questions? comments?

Have you seen this Wiki page? 
http://wiki.apache.org/xmlgraphics-fop/ImageSupport

It's there for gathering all the requirements on the image library. I'll
bring it up-to-date in a minute. I saw there are a few things I need to
change.

I'm happy to see that you volunteer to work in this area. It's something
I wanted to fix for a long time now but it always had a lower priority. I
envisioned a slightly different direction as you can guess from my
comments but this is still open for discussion.

Jeremias Maerki
