Re: Initial feedback for std.experimental.image

Rikki Cattermole via Digitalmars-d Thu, 09 Jul 2015 08:10:40 -0700

On 10/07/2015 2:07 a.m., "Gregor =?UTF-8?B?TcO8Y2tsIg==?=<[email protected]>" wrote:

On Thursday, 9 July 2015 at 04:09:11 UTC, Rikki Cattermole wrote:

On 9/07/2015 6:07 a.m., "Gregor =?UTF-8?B?TcO8Y2tsIg==?=
<[email protected]>" wrote:

On Monday, 6 July 2015 at 13:48:53 UTC, Rikki Cattermole wrote:


Please destroy!


You asked for it! :)

As a reference to a library that is used to handle images on a
professional level (VFX industry), I'd encourage you to look at the
feature set and interfaces of OpenImageIO. Sure, it's a big library and
some of it is definitely out of scope for what you try to accomplish
(image tile caching and texture sampling, obviously).

Yet, there are some features I specifically want to mention here to
challenge the scope of your design:

- arbitrary channel layouts in images: this is a big one. You mention 3D
engines as a targeted use case in the specification. 3D rendering is one
of the worst offenders when it comes to crazy channel layouts in
textures (which are obviously stored as image files). If you have a data
texture that requires 2 channels (e.g. uv offsets for texture lookups in
shaders or some crazy data tables), its memory layout should also only
ever have two channels. Don't expand it to RGB transparently or anything
else braindead. Don't change the data type of the pixel values wildly
without being asked to do so. The developer most likely has chosen a 16
bit signed integer per channel (or whatever else) for a good reason.
Some high end file formats like OpenEXR even allow users to store
completely arbitrary channels as well, often with a different
per-channel data format (leading to layouts like RGBAZ with an
additional mask channel on top). But support for that really bloats
image library interfaces. I'd stick with a sane variant of the
uncompressed texture formats that the OpenGL specification lists as the
target set of supported in-memory image formats. That mostly matches
current GPU hardware support and probably will for some time to come.


As long as the color implementation matches isColor from
std.experimental.color. Then it's a color. I do not handle that :)
The rest of how it maps in memory is defined by the image storage
types. Any image loader/exporter can use any as long as you specify it
via a template argument *currently*.


Hm... in that case you introduce transparent mappings between
user-facing types and the internal mapping which may be lossy in various
ways. This works, but the internal type should be discoverable somehow.
This leads down a similar road to OpenGL texture formats: they have
internal storage formats and there's the host formats to/from which the
data is converted when passing back and forth. This adds a lot of
complexity and potential for surprises, unfortunately. I'm not entirely
sure what to think here.


Internal color to an image storage type is well known at compile time.

Now SwappableImage that wraps another image type. That definitelymuddies the water a lot. Since it auto converts from the originalformat. Which could be, you know messy.It's actually the main reason I asked Manu for a gain/loss precisionfunctions. For detecting if precision is being changed. Mostly forlogging purposes.

- window regions: now this not quite your average image format feature,
but relevant for some use cases. The gist of it is that the image file
may define a coordinate system for a whole image frame but only contain
actual data within certain regions that do not cover the whole frame.
These regions may even extend beyond the defined image frame (used e.g.
for VFX image postprocessing to have properly defined pixel values to
filter into the visible part of the final frame). Again, the OpenEXR
documentation explains this feature nicely. Again, I think this likely
is out of scope for this library.


Ugh based upon what you said, that is out of scope of the image
loader/exporters that I'm writing. Also right now it is only using
unsigned integers for coordinates. I'm guessing if it is outside the
bounds it can go negative then.
Slightly too specialized for what we need in the general case.


Yes, this is a slightly special use case. I can think of quite a lot of
cases where you would want border regions of some kind for what you are
doing, but they are all related to rendering and image processing.

You have convinced me that I need to add a subimage struct which isbasically SwappableImage. Just with offset/size different to original.

My first point also leads me to this criticism:

- I do not see a way to discover the actual data format of a PNG file
through your loader. Is it 8 bit palette-based, 8 bit per pixel or 16
bits per pixel? Especially the latter should not be transparently
converted to 8 bits per pixel if encountered because it is a lossy
transformation. As I see it right now you have to know the pixel format
up front to instantiate the loader. I consider that bad design. You can
only have true knowledge of the file contents after the image header
were parsed. The same is generally true of most actually useful image
formats out there.


The reasoning is because this is what I know I can work with. You
specify what you want to use, it'll auto convert after that. It makes
user code a bit simpler.


I can understand your reasoning and this is why libraries like FreeImage
make it very simple to get the image data converted to the format you
want from an arbitrary input. What I'd like to see is more of an
extension of the current mechanism: make it possible to query the data
format of the image file. That way, the application can make a wiser
decision on the format in which it wants to receive the data, but it
always is able to get the data in a format it understands. The format
description for the file format would have to be quite complex to cover
all possibilities, though. The best that I can come up with is a list of
tuples of channel names (as strings) and data type (as enums).
Processing those isn't fun, though.

The problem here is simple. You must know what color type you are goingto be working with. There is no guessing. If you want to change to matchthe file loader better, you'll have to load it twice and then you haveto understand the file format internals a bit more.

This is kinda where it gets messy.

But, would it be better if you could just parse the headers? So itdoesn't initialize the image data. I doubt it would be all that hard.It's just disabling a series of features.

- Could support for image data alignment be added by defining a new
ImageStorage subclass? The actual in-memory data is not exposed to
direct access, is it? Access to the raw image data would be preferable
for those cases where you know exactly what you are doing. Going through
per-pixel access functions for large image regions is going to be
dreadfully slow in comparison to what can be achieved with proper
processing/filtering code.


I ugh... had this feature once. I removed it as if you already know
the implementation why not just directly access it?
But, if there is genuine need to get access to it as e.g. void* then I
can do it again.

- Also, uploading textures to the GPU requires passing raw memory blocks
and a format description of sorts to the 3D API. Being required to
slowly copy the image data in question into a temporary buffer for this
process is not an adequate solution.


Again for previous answer, was possible. No matter what the image
storage type was. But it was hairy and could and would cause bugs in
the future. Your probably better off knowing the type and getting
access directly to it that way.


This is where the abstraction of ImageStorage with several possible
implementations becomes iffy. The user is at the loader's mercy to
hopefully hand over the right implementation type. I'm not sure I like
that idea. This seems inconsistent with making the pixel format the
user's choice.  Why should the user have choice over one thing and not
the other?

If the image loader uses another image storage type then it is missbehaving. There is no excuse for it.

Anyway the main thing about this to understand is that if the imageloader does not initialize, then it would have to resize and since notall image storage types have to support resizing...


Some very good points that I believe definitely needs to be touched
upon where had.

I've had a read of OpenImageIO documentation and all I can say is irkkk.
Most of what is in there with e.g. tiles and reflection styles methods
are out of scope out right as they are a bit specialized for my
liking. If somebody wants to add it, take a look at the offset
support. It was written as an 'extension' like ForwardRange is for
ranges.


I mentioned OpenImageIO as this library is full-featured and very
complete in a lot of areas. It shows what it takes to be as flexible as
possible regarding the image data that is processed. Take it as a
catalog of things to consider, but not as template.

The purpose of this library is to work more for GUI's and games then
anything else. It is meant more for the average programmer then
specialized in imagery ones. It's kinda why I wrote the specification
document. Just so in the future if somebody comes in saying its awful
who does know and use these kinds of libraries. Will have to
understand that it was out of scope and was done on purpose.


Having a specification is a good thing and this is why I entered the
discussion. Although your specification is still a bit vague in my
opinion, the general direction is good. The limitation of the scope
looks fine to me and I'm not arguing against that. My point is rather
that your design can still be improved to match that scope better.


Yeah indeed. Any tips for specification document improvement?
I would love to make it standard for Phobos additions like this.

Re: Initial feedback for std.experimental.image

Reply via email to