On Monday, 6 July 2015 at 13:48:53 UTC, Rikki Cattermole wrote:

> Please destroy!


You asked for it! :)

As a reference to a library that is used to handle images on a professional level (VFX industry), I'd encourage you to look at the feature set and interfaces of OpenImageIO. Sure, it's a big library and some of it is definitely out of scope for what you're trying to accomplish (image tile caching and texture sampling, obviously).

Yet, there are some features I specifically want to mention here to challenge the scope of your design:

- arbitrary channel layouts in images: this is a big one. You mention 3D engines as a targeted use case in the specification. 3D rendering is one of the worst offenders when it comes to crazy channel layouts in textures (which are obviously stored as image files). If you have a data texture that requires 2 channels (e.g. UV offsets for texture lookups in shaders, or some crazy data tables), its memory layout should also only ever have two channels. Don't expand it to RGB transparently or anything else braindead. Don't change the data type of the pixel values wildly without being asked to do so. The developer most likely has chosen a 16 bit signed integer per channel (or whatever else) for a good reason. Some high end file formats like OpenEXR even allow users to store completely arbitrary channels, often with a different per-channel data format (leading to layouts like RGBAZ with an additional mask channel on top). But support for that really bloats image library interfaces. I'd stick with a sane subset of the uncompressed texture formats listed in the OpenGL specification as the target set of supported in-memory image formats. That mostly matches current GPU hardware support and probably will for some time to come.
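To make this concrete, here is a minimal sketch in C of what a pixel format descriptor along those lines might look like. All of these names are mine, not from your library; the enum roughly mirrors the kind of per-channel types the OpenGL spec enumerates:

```c
#include <stdint.h>
#include <stddef.h>

/* Hypothetical per-channel data types, roughly mirroring the
 * uncompressed formats in the OpenGL specification. */
typedef enum {
    CH_UINT8, CH_SINT16, CH_UINT16, CH_FLOAT16, CH_FLOAT32
} ChannelType;

/* A format is described by channel count plus per-channel type,
 * not forced into RGB(A). A two-channel sint16 UV-offset texture
 * keeps exactly that layout in memory. */
typedef struct {
    uint8_t     channels;   /* 1..4 */
    ChannelType type;       /* shared by all channels here */
} PixelFormat;

static size_t channel_size(ChannelType t) {
    switch (t) {
    case CH_UINT8:   return 1;
    case CH_SINT16:
    case CH_UINT16:
    case CH_FLOAT16: return 2;
    case CH_FLOAT32: return 4;
    }
    return 0;
}

static size_t pixel_size(PixelFormat f) {
    return f.channels * channel_size(f.type);
}
```

The point being: a two-channel sint16 format stays 4 bytes per pixel, end to end, unless the user explicitly asks for a conversion.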

- padding and memory alignment: depending on the platform, image format and task at hand you may want the in-memory layout of your image to be padded in various ways. For example, you would want your scanlines and pixel values aligned to certain offsets to make use of SIMD instructions, which often carry alignment restrictions with them. This is one of the reasons why RGB images are sometimes expanded to have a dummy channel between the triplets. Also, aligning the start of each scanline may be important, which introduces a "pitch" between them that is greater than just the storage size of each scanline by itself. Again, this may help speed up image processing.
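The pitch computation itself is trivial, which is part of why it's frustrating when a library doesn't support it. A sketch (the function name is mine):

```c
#include <stddef.h>

/* Round a scanline's byte size up to the next multiple of `align`
 * (align must be a power of two, e.g. 16 for SSE, 32 for AVX). */
static size_t aligned_pitch(size_t width, size_t bytes_per_pixel,
                            size_t align)
{
    size_t row = width * bytes_per_pixel;
    return (row + align - 1) & ~(align - 1);
}

/* e.g. a 101-pixel-wide RGB8 image, 16-byte aligned:
 * aligned_pitch(101, 3, 16) == 304, not 303 */
```

Allocation and per-row addressing then use the pitch instead of `width * bytes_per_pixel` everywhere.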

- subimages: this one may seem obscure, but it happens in a number of common file formats (gif, mng, DDS, probably TIFF and others). Subimages can be - for instance - individual animation frames or precomputed mipmaps. This means that they may have metadata attached to them (e.g. framerate or delay to next frame) or they may come in totally different dimensions (mipmap levels).
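Structurally this just means the "image" becomes a container. A rough sketch of what I mean (again, hypothetical names, and a real design would carry more metadata):

```c
#include <stdint.h>
#include <stddef.h>

/* One subimage: an animation frame or a mip level. Dimensions can
 * differ per subimage, and frame delay only applies to animations. */
typedef struct {
    uint32_t width, height;
    uint32_t delay_ms;   /* delay to next frame; 0 if not animated */
    void    *pixels;
} SubImage;

/* The file-level object is a sequence of subimages, not one buffer. */
typedef struct {
    size_t    count;
    SubImage *subimages;
} Image;
```

A loader that only ever hands back `subimages[0]` silently discards the rest of a gif or a DDS mip chain.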

- window regions: now this is not quite your average image format feature, but it is relevant for some use cases. The gist of it is that the image file may define a coordinate system for a whole image frame but only contain actual data within certain regions that do not cover the whole frame. These regions may even extend beyond the defined image frame (used e.g. in VFX image postprocessing to have properly defined pixel values to filter into the visible part of the final frame). The OpenEXR documentation explains this feature nicely. That said, I think this is likely out of scope for this library.
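For reference, the OpenEXR model boils down to two rectangles, a "display window" (the nominal frame) and a "data window" (where pixels actually exist), which may overlap in any way. A toy sketch of the idea in C (OpenEXR itself uses an `Imath::Box2i` for this; the helpers here are mine):

```c
/* Inclusive pixel-coordinate rectangle, as in OpenEXR's Box2i. */
typedef struct { int x0, y0, x1, y1; } Box2i;

static int box_width(Box2i b)  { return b.x1 - b.x0 + 1; }
static int box_height(Box2i b) { return b.y1 - b.y0 + 1; }

/* A file might declare a 1920x1080 display window but carry data
 * extending 8 pixels past every edge for filtering headroom:
 *   display = {0, 0, 1919, 1079}
 *   data    = {-8, -8, 1927, 1087}   -> 1936 x 1096 pixels stored */
```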

My first point also leads me to this criticism:

- I do not see a way to discover the actual data format of a PNG file through your loader. Is it 8 bit palette-based, 8 bits per channel, or 16 bits per channel? Especially the latter should not be transparently converted to 8 bits per channel if encountered, because that is a lossy transformation. As I see it right now, you have to know the pixel format up front to instantiate the loader. I consider that bad design. You can only have true knowledge of the file contents after the image header has been parsed. The same is generally true of most actually useful image formats out there.
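What I'd expect is a two-phase API: parse the header, let the caller inspect the stored format, and only then decode. For PNG the information is right there in the IHDR chunk's bit-depth and colour-type fields, so the classification step is a handful of lines (a sketch, with names of my own invention):

```c
#include <stdint.h>

typedef enum {
    FMT_PALETTE8, FMT_GRAY8, FMT_RGB8, FMT_RGB16, FMT_OTHER
} StoredFormat;

/* Classify a PNG's stored format from the IHDR bit-depth and
 * colour-type fields (colour type 0 = greyscale, 2 = truecolour,
 * 3 = palette, per the PNG specification). A loader should expose
 * this *before* committing to a destination pixel format. */
static StoredFormat classify_png(uint8_t bit_depth, uint8_t colour_type)
{
    if (colour_type == 3 && bit_depth == 8)  return FMT_PALETTE8;
    if (colour_type == 0 && bit_depth == 8)  return FMT_GRAY8;
    if (colour_type == 2 && bit_depth == 8)  return FMT_RGB8;
    if (colour_type == 2 && bit_depth == 16) return FMT_RGB16;
    return FMT_OTHER;
}
```

With that in hand, a 16-bit file can be loaded losslessly by default, and conversion to 8 bits becomes an explicit, opt-in step.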

- Could support for image data alignment be added by defining a new ImageStorage subclass? The actual in-memory data is not exposed to direct access, is it? Access to the raw image data would be preferable for those cases where you know exactly what you are doing. Going through per-pixel access functions for large image regions is going to be dreadfully slow in comparison to what can be achieved with proper processing/filtering code.
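The kind of escape hatch I have in mind is a raw view of the storage: base pointer, pitch, bytes per pixel. Per-pixel accessors can still exist on top of it, but bulk processing and uploads go straight to memory. A sketch (hypothetical names again):

```c
#include <stdint.h>
#include <stddef.h>

/* A raw view of image storage, alongside any per-pixel accessors.
 * Bulk processing and GPU uploads use data/pitch directly. */
typedef struct {
    uint8_t *data;     /* start of pixel memory */
    size_t   pitch;    /* bytes from one scanline to the next */
    size_t   bpp;      /* bytes per pixel */
    uint32_t width, height;
} RawView;

static uint8_t *pixel_ptr(RawView v, uint32_t x, uint32_t y)
{
    return v.data + (size_t)y * v.pitch + (size_t)x * v.bpp;
}
```

Note that the pitch in the view is exactly what lets an aligned/padded ImageStorage subclass and tight-packed storage share one processing path.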

- Also, uploading textures to the GPU requires passing raw memory blocks and a format description of sorts to the 3D API. Being required to slowly copy the image data in question into a temporary buffer for this process is not an adequate solution.

Let me know what you think!
