On Saturday, 22 March 2014 at 11:25:05 UTC, Phil wrote:
> This is very cool. What are the performance implications of
> treating colour images as arrays of tuples rather than a flat
> array? For example, if I wanted to iterate through every
> channel of every pixel in an RGB image or modify the R channel
> of every pixel, could I generally expect the compiler to
> optimise the extra overhead away? Also, do you have any ideas
> on how you could vectorise code like this while still providing
> a nice API?
One might say that this approach has the innate benefit that the
loop (to iterate over each channel) will be unrolled explicitly :)
However, if you need to perform operations on individual
channels, it would probably be worthwhile to unpack a
multi-channel image into several images with just one channel.
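To make the unpacking idea concrete, here is a rough sketch in C++. It is not taken from the library being discussed: the RGB and Planes types and the unpack, pack and brightenRed functions are all names I made up for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical interleaved pixel: the three channels of one pixel sit together.
struct RGB { std::uint8_t r, g, b; };

// Hypothetical planar form: one contiguous single-channel image per channel.
struct Planes { std::vector<std::uint8_t> r, g, b; };

// Split an interleaved image into three single-channel images.
Planes unpack(const std::vector<RGB>& pixels) {
    Planes p;
    p.r.reserve(pixels.size());
    p.g.reserve(pixels.size());
    p.b.reserve(pixels.size());
    for (const RGB& px : pixels) {
        p.r.push_back(px.r);
        p.g.push_back(px.g);
        p.b.push_back(px.b);
    }
    return p;
}

// Recombine the single-channel images into interleaved pixels.
std::vector<RGB> pack(const Planes& p) {
    assert(p.r.size() == p.g.size() && p.g.size() == p.b.size());
    std::vector<RGB> out(p.r.size());
    for (std::size_t i = 0; i < out.size(); ++i)
        out[i] = RGB{p.r[i], p.g[i], p.b[i]};
    return out;
}

// A per-channel operation now walks one contiguous byte array only.
void brightenRed(Planes& p, std::uint8_t amount) {
    for (std::uint8_t& v : p.r) {
        unsigned s = static_cast<unsigned>(v) + amount;       // widen to avoid wraparound
        v = static_cast<std::uint8_t>(s > 255u ? 255u : s);   // saturate at 255
    }
}
```

Whether the two extra passes (unpack and pack) pay off depends on how much per-channel work happens in between, so treat this as something to measure rather than a rule.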
I'm not familiar enough with the vector instruction sets of current
CPUs to answer this confidently. For example, if there is an integer
vector multiply-and-add operation, it could be used for fast
software alpha blending. That operation's restrictions would then
dictate the optimal memory layout of the image: if it requires the
bytes being multiplied and added to be contiguous in memory, then
the image should be represented with each channel as a separate
sub-image.
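As a rough illustration of why the planar layout suits this, below is a scalar C++ sketch of blending one channel plane with a constant alpha. The name blendPlane and the constant-alpha simplification are my own assumptions, and nothing here uses actual vector instructions. The point is only that each iteration is an independent multiply-and-add over contiguous bytes, which is the shape a vector unit or an auto-vectorising compiler would want. Whether a given compiler really vectorises it is not guaranteed.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Blend one channel plane of src over dst using a constant alpha in 0..255.
// Hypothetical helper; computes round((src*alpha + dst*(255-alpha)) / 255).
void blendPlane(std::vector<std::uint8_t>& dst,
                const std::vector<std::uint8_t>& src,
                std::uint8_t alpha) {
    assert(dst.size() == src.size());
    const unsigned a  = alpha;
    const unsigned ia = 255u - alpha;
    for (std::size_t i = 0; i < dst.size(); ++i) {
        // Integer multiply-and-add on adjacent bytes; +127 rounds to nearest.
        unsigned v = src[i] * a + dst[i] * ia + 127u;
        dst[i] = static_cast<std::uint8_t>(v / 255u);
    }
}
```

With interleaved pixels the same loop would have to step over the other channels (or handle them in the same iteration), which is what makes the per-channel sub-image layout attractive here.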