[Openexr-devel] RAW images in OpenEXR?

Florian Kainz Tue, 06 May 2008 13:20:06 -0700


Recently several people have asked whether OpenEXR would be suitable
for storing RAW images from cameras with color filter array sensors.
The proposal below describes a method to do that.  I would be interested
in feedback from OpenEXR users.


Florian


OpenEXR RAW Images
------------------

CFA Image Sensors And RAW Images

    Digital image file formats such as OpenEXR or JPEG usually represent
    images as red-green-blue (RGB) data.  Conceptually, each pixel in an
    image file has a red, a green and a blue value.  Image files may be
    compressed, and compression often involves transforming the RGB
    pixels to an alternate format before the data are stored in a file,
    but the original RGB data can be recovered from the file - at least
    approximately - by reversing this transformation.

    The image sensors in most modern electronic cameras do not record
    full RGB data for every pixel.  Cameras typically use sensors that
    are equipped with color filter arrays.  Each pixel in such a sensor
    is covered with a red, green or blue color filter.  The filters are
    arranged in a regular pattern, for example, like this:

        G R G R G R
        B G B G B G
        G R G R G R
        B G B G B G
        G R G R G R
        B G B G B G

    To reconstruct a full-color picture from an image that has been
    recorded by such a color filter array sensor (CFA sensor), the
    image is first split into a red, a green and a blue channel:

        . R . R . R    G . G . G .    . . . . . .
        . . . . . .    . G . G . G    B . B . B .
        . R . R . R    G . G . G .    . . . . . .
        . . . . . .    . G . G . G    B . B . B .
        . R . R . R    G . G . G .    . . . . . .
        . . . . . .    . G . G . G    B . B . B .

    Some of the pixels in each channel contain no data (indicated
    by a period).  Before combining the red, green and blue channels
    into an RGB image, values for the empty pixels in each channel
    must be interpolated from neighboring pixels that do contain data.
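
    The split-then-interpolate step described above can be sketched
    roughly as follows.  This is only an illustration: the Plane type,
    the function names and the 3x3 averaging are assumptions made for
    this example, and the averaging is deliberately naive.  It is not
    the interpolator used by any particular camera or library.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical helper: one full-resolution plane; NaN marks "no data".
struct Plane
{
    int width, height;
    std::vector<float> v;

    Plane (int w, int h)
        : width (w), height (h),
          v (static_cast<std::size_t> (w) * h, std::nanf ("")) {}

    float &at (int x, int y)       { return v[static_cast<std::size_t> (y) * width + x]; }
    float  at (int x, int y) const { return v[static_cast<std::size_t> (y) * width + x]; }
};

// Filter color at (x, y) for the G R / B G pattern shown above:
// 0 = red, 1 = green, 2 = blue.
int filterColor (int x, int y)
{
    if ((x + y) % 2 == 0) return 1;   // green on the diagonal
    return (y % 2 == 0) ? 0 : 2;      // red on even rows, blue on odd rows
}

// Copy filter color c of the mosaic into its own plane; all other
// pixels stay empty (NaN).
Plane splitChannel (const Plane &mosaic, int c)
{
    assert (c >= 0 && c <= 2);
    Plane out (mosaic.width, mosaic.height);
    for (int y = 0; y < mosaic.height; ++y)
        for (int x = 0; x < mosaic.width; ++x)
            if (filterColor (x, y) == c)
                out.at (x, y) = mosaic.at (x, y);
    return out;
}

// Naive interpolation: fill each empty pixel with the mean of the
// populated pixels in its 3x3 neighborhood.
Plane fillGaps (const Plane &in)
{
    Plane out = in;
    for (int y = 0; y < in.height; ++y)
        for (int x = 0; x < in.width; ++x)
        {
            if (!std::isnan (in.at (x, y))) continue;
            float sum = 0;
            int   n   = 0;
            for (int dy = -1; dy <= 1; ++dy)
                for (int dx = -1; dx <= 1; ++dx)
                {
                    int nx = x + dx, ny = y + dy;
                    if (nx < 0 || ny < 0 || nx >= in.width || ny >= in.height) continue;
                    if (std::isnan (in.at (nx, ny))) continue;
                    sum += in.at (nx, ny);
                    ++n;
                }
            if (n > 0) out.at (x, y) = sum / n;
        }
    return out;
}
```

    Calling fillGaps (splitChannel (mosaic, c)) for c = 0, 1 and 2
    yields three fully populated planes that can then be combined
    into RGB pixels.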

    Not all CFA sensors use red, green and blue filters.  For example,
    some cameras use green, magenta, yellow and cyan filters:

        G Y G Y G Y
        C M C M C M
        G Y G Y G Y
        C M C M C M
        G Y G Y G Y
        C M C M C M

    In another variation, the pixel grid in some image sensors is
    rotated 45 degrees with respect to the edges of the image:

         G G G G G
        B R B R B R
         G G G G G
        R B R B R B
         G G G G G
        B R B R B R

    Most electronic cameras automatically convert raw CFA sensor data to
    RGB images.  The camera outputs RGB images and discards the raw data.
    However, some users prefer to use their cameras in "raw mode," where
    the camera directly outputs the more or less unaltered CFA sensor
    data.  Reconstruction of RGB images is deferred to an offline process.
    Saving raw data can be desirable for two reasons:

    - An offline process that does not have to work in real time and
      within the often limited computing resources available in the
      camera may be able to reconstruct better looking RGB images.

    - Since raw sensor data contain only one value per pixel instead of
      three, a raw image occupies only a third as much space as an RGB
      image with the same bit depth and compression.

    Image files that contain raw CFA sensor data are often called
    "RAW files" or "camera RAW files."

Storing RAW Images in OpenEXR Files

    It would be possible to store the output of a CFA image sensor
    directly in a single-channel OpenEXR image file.  Additional
    information such as the colors and locations of the filters
    could be stored in an attribute in the file header.  The need
    for image compression makes this approach undesirable.  Every
    pixel in such a single-channel image is surrounded by pixels
    with different color filters.  Existing compression methods in
    OpenEXR are not aware of this interleaving of image channels.
    Lossy compression methods (B44, B44A) would introduce crosstalk
    between the channels.  Lossless compression methods (PIZ, ZIP)
    would preserve the image exactly, but the compression rate
    would suffer.

    Another way to store raw CFA sensor data is to split the image
    into multiple channels with one channel per filter color.
    OpenEXR's sub-sampled image channels provide an efficient way to
    represent the resulting sparsely populated channels.  Since each
    filter color is stored in its own channel, existing compression
    methods work well.  Lossy compression does not introduce crosstalk
    between filter colors, and lossless compression achieves nearly
    the same compression rates as for regular RGB images.

    Every channel in an OpenEXR image has an x and a y sampling rate.
    A channel contains data only for pixel locations whose x and y
    coordinates are evenly divisible by the x and y sampling rates:

        (x % xSampling == 0)  && (y % ySampling == 0)

    For a CFA image sensor with RGB filters, we use the following
    sampling rates:

        channel     xSampling   ySampling

        R           2           2
        G           2           1
        B           2           2

    Now our OpenEXR file contains one R, two G and one B sample for
    every four pixels, just as in the sensor.  However, the spatial
    arrangement of the samples differs:

        sensor                      file

        G   R   G   R   G   R       RGB .   RGB .   RGB .
        B   G   B   G   B   G       G   .   G   .   G   .
        G   R   G   R   G   R       RGB .   RGB .   RGB .
        B   G   B   G   B   G       G   .   G   .   G   .
        G   R   G   R   G   R       RGB .   RGB .   RGB .
        B   G   B   G   B   G       G   .   G   .   G   .
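
    The sampling rule and the resulting sample counts can be checked
    with a few lines of code.  This is a minimal sketch; the function
    names are made up for this example and are not part of the OpenEXR
    library, and coordinates are assumed to be non-negative.

```cpp
#include <cassert>

// True if a channel with the given sampling rates stores a value
// at pixel location (x, y).  Assumes x >= 0 and y >= 0.
bool hasSample (int x, int y, int xSampling, int ySampling)
{
    assert (xSampling > 0 && ySampling > 0);
    assert (x >= 0 && y >= 0);
    return (x % xSampling == 0) && (y % ySampling == 0);
}

// Number of stored samples in the window [0, width) x [0, height).
int countSamples (int width, int height, int xSampling, int ySampling)
{
    int n = 0;
    for (int y = 0; y < height; ++y)
        for (int x = 0; x < width; ++x)
            if (hasSample (x, y, xSampling, ySampling))
                ++n;
    return n;
}
```

    For a two-by-two block of pixels, countSamples (2, 2, 2, 2)
    returns 1 (the R and B channels) and countSamples (2, 2, 2, 1)
    returns 2 (the G channel).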

    We must augment the file by describing the arrangement of the
    pixels in the sensor.

    The color filters in front of the pixels in the sensor are arranged
    in a regular pattern; the sensor is covered with repetitions of a
    two-by-two pixel tile:

        G R
        B G

    We can describe this pattern by adding a new CfaTile attribute to
    the OpenEXR file header:

        struct CfaPixel
        {
            string  channelName;
            int     xOffset;
            int     yOffset;
            V3f     XYZ;
        };

        class CfaTile
        {
          public:

            int                 xSize () const;
            int                 ySize () const;
            const CfaPixel &    pixel (int x, int y) const;
            CfaPixel &          pixel (int x, int y);

            ...
        };

    A CfaPixel, p, at location (x, y) in CfaTile t defines the
    following:

      * Channel p.channelName in the OpenEXR file has values for
        all pixels whose coordinates (px, py) are of the form

            px = x + n * t.xSize()
            py = y + m * t.ySize()

        where n and m are integers.  In the file, the value for
        pixel (px, py) is stored at location

            (px + p.xOffset, py + p.yOffset)

      * p.XYZ is a set of weights for reconstructing CIE XYZ colors
        from the CFA sensor data.  After all channels have been fully
        populated by interpolation, the XYZ color of each pixel is
        computed as a weighted sum of all the channels:

            XYZpixels[py][px] = V3f (0, 0, 0);

            for (...)
                XYZpixels[py][px] += channel(p.channelName)[py][px] * p.XYZ;

        Once the XYZ color of a pixel is known, the color can be
        converted to any desired RGB space.

      * As a special case, if p.channelName is an empty string, then
        the file contains no data for this pixel.
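
    The lookup from a sensor pixel to its location in the file might
    be sketched as follows.  The types here are simplified stand-ins
    for the proposed CfaPixel and CfaTile (the XYZ weights are left
    out), and FileLocation and locate() are hypothetical names
    introduced only for this example.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Simplified, hypothetical stand-ins for the proposed types.
struct CfaPixel
{
    std::string channelName;   // empty: no data stored for this pixel
    int         xOffset;
    int         yOffset;
};

struct CfaTile
{
    int                   xSize, ySize;
    std::vector<CfaPixel> pixels;   // row-major, xSize * ySize entries

    const CfaPixel &pixel (int x, int y) const { return pixels[y * xSize + x]; }
};

struct FileLocation
{
    std::string channelName;
    int         x, y;
};

// Find the channel and the file location that hold the value of
// sensor pixel (px, py).  Assumes px >= 0 and py >= 0.
FileLocation locate (const CfaTile &t, int px, int py)
{
    assert (px >= 0 && py >= 0);
    assert (t.xSize > 0 && t.ySize > 0);

    // (px, py) = (x + n * xSize, y + m * ySize), so the tile entry
    // is found by reducing the coordinates modulo the tile size.
    const CfaPixel &p = t.pixel (px % t.xSize, py % t.ySize);

    return FileLocation {p.channelName, px + p.xOffset, py + p.yOffset};
}
```

    For a two-by-two tile whose entry at (1, 0) names channel R with
    offsets (-1, 0), locate (t, 3, 2) yields channel R at file
    location (2, 2).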

    For example, the two-by-two-pixel CfaTile for our RGB CFA sensor
    would look like this:

        x   y   channelName xOffset yOffset XYZ

        0   0    G           0       0      (0.3576, 0.7152, 0.1192)
        1   0    R          -1       0      (0.4124, 0.2126, 0.0193)
        0   1    B           0      -1      (0.1805, 0.0722, 0.9505)
        1   1    G          -1       0      (0.3576, 0.7152, 0.1192)
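
    With these weights, the XYZ reconstruction for a single pixel
    could look roughly like the sketch below.  V3 is a stand-in for
    Imath's V3f, and toXYZ() and exampleWeights() are hypothetical
    names.  The sketch assumes that each distinct channel contributes
    once to the sum, so that G, which occurs twice in the tile, is
    not counted twice; the proposal does not spell this out.

```cpp
#include <cassert>
#include <map>
#include <string>

struct V3 { double x, y, z; };   // stand-in for Imath's V3f

// Weighted sum over the distinct channels at one pixel.
//   values:  interpolated channel values at the pixel, by channel name
//   weights: XYZ weights, one entry per channel name (assumption: a
//            channel that occurs more than once in the tile has the
//            same weights everywhere and is summed only once)
V3 toXYZ (const std::map<std::string, double> &values,
          const std::map<std::string, V3>     &weights)
{
    V3 xyz = {0, 0, 0};
    for (const auto &kv : values)
    {
        auto w = weights.find (kv.first);
        assert (w != weights.end ());
        xyz.x += kv.second * w->second.x;
        xyz.y += kv.second * w->second.y;
        xyz.z += kv.second * w->second.z;
    }
    return xyz;
}

// The XYZ weights from the table above.
std::map<std::string, V3> exampleWeights ()
{
    return {{"R", {0.4124, 0.2126, 0.0193}},
            {"G", {0.3576, 0.7152, 0.1192}},
            {"B", {0.1805, 0.0722, 0.9505}}};
}
```

    A pixel with R = G = B = 1 comes out as approximately
    (0.9505, 1.0000, 1.0890).  The weights in the table coincide with
    the columns of the Rec. 709 RGB-to-XYZ matrix, so this is the
    D65 white point.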

    Using sub-sampled channels and a CfaTile attribute, we can also
    handle sensors with green, magenta, yellow and cyan filters:

        sensor                      file

        G   Y   G   Y   G   Y       GYCM .    GYCM .    GYCM .
        C   M   C   M   C   M       .    .    .    .    .    .
        G   Y   G   Y   G   Y       GYCM .    GYCM .    GYCM .
        C   M   C   M   C   M       .    .    .    .    .    .
        G   Y   G   Y   G   Y       GYCM .    GYCM .    GYCM .
        C   M   C   M   C   M       .    .    .    .    .    .

        channels

            name    xSampling   ySampling
            G       2           2
            Y       2           2
            C       2           2
            M       2           2

        CfaTile (2x2)

            x   y   channelName xOffset yOffset XYZ

            0   0    G           0       0      (...)
            1   0    Y          -1       0      (...)
            0   1    C           0      -1      (...)
            1   1    M          -1      -1      (...)

    The same representation can also handle sensor pixel grids that
    are rotated by 45 degrees:

        sensor          file

         G G G G G      RGB .   G   .   RGB .   G
        B R B R B R     .   .   .   .   .   .   .
         G G G G G      RGB .   G   .   RGB .   G
        R B R B R B     .   .   .   .   .   .   .
         G G G G G      RGB .   G   .   RGB .   G
        B R B R B R     .   .   .   .   .   .   .

        channels

            name     xSampling   ySampling

            R        4           2
            G        2           2
            B        4           2

        CfaTile (4x4)

            x   y   channelName xOffset yOffset XYZ

            0   0   (empty)
            1   0    G          -1       0      (...)
            2   0   (empty)
            3   0    G          -1       0      (...)

            0   1    B           0      -1      (...)
            1   1   (empty)
            2   1    R          -2      -1      (...)
            3   1   (empty)

            0   2   (empty)
            1   2    G          -1       0      (...)
            2   2   (empty)
            3   2    G          -1       0      (...)

            0   3    R           0      -1      (...)
            1   3   (empty)
            2   3    B          -2      -1      (...)
            3   3   (empty)
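
    A tile like this is easy to get wrong by hand, so a consistency
    check may be useful.  The sketch below is not part of the
    proof-of-concept code, and Entry, Sampling and tileIsConsistent()
    are hypothetical names.  It tests that every populated tile entry
    lands on a location for which its channel actually stores a
    sample, and that no two entries land on the same location.  The
    populated entries are those of the 4x4 tile, with G at (1, 0)
    using offsets (-1, 0).

```cpp
#include <cassert>
#include <map>
#include <set>
#include <string>
#include <tuple>
#include <vector>

struct Sampling { int x, y; };

struct Entry                   // one populated row of a CfaTile table
{
    int         x, y;
    std::string channelName;
    int         xOffset, yOffset;
};

bool tileIsConsistent (const std::vector<Entry>              &tile,
                       const std::map<std::string, Sampling> &channels)
{
    std::set<std::tuple<std::string, int, int>> used;

    for (const Entry &e : tile)
    {
        if (e.channelName.empty ()) continue;      // no data for this pixel

        auto c = channels.find (e.channelName);
        if (c == channels.end ()) return false;    // unknown channel
        assert (c->second.x > 0 && c->second.y > 0);

        int fx = e.x + e.xOffset;
        int fy = e.y + e.yOffset;

        // The file location must satisfy the sampling rule.
        if (fx % c->second.x != 0 || fy % c->second.y != 0) return false;

        // Two sensor pixels must not share one file location.
        if (!used.insert (std::make_tuple (e.channelName, fx, fy)).second)
            return false;
    }
    return true;
}

// Populated entries of the 4x4 tile for the rotated pixel grid.
std::vector<Entry> rotatedTile ()
{
    return {{1, 0, "G", -1,  0}, {3, 0, "G", -1,  0},
            {0, 1, "B",  0, -1}, {2, 1, "R", -2, -1},
            {1, 2, "G", -1,  0}, {3, 2, "G", -1,  0},
            {0, 3, "R",  0, -1}, {2, 3, "B", -2, -1}};
}
```

    With sampling rates R (4, 2), G (2, 2) and B (4, 2), the check
    succeeds for rotatedTile ().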

    In this last case both the OpenEXR image channels and the CfaTile
    pixel grid are rather sparsely populated.  The corresponding
    interpolated RGB image will have a rather high resolution, but
    it will not contain fine detail.  The interpolated image should
    probably be scaled down, either by a factor of sqrt(2) (resulting
    in the same number of R, G and B sensor samples per RGB pixel as
    for a non-rotated grid) or by a factor of 2 (resulting in one
    green sample per RGB pixel).  This scale factor should perhaps
    be included in the CfaTile attribute.
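
    The sample counts behind these two scale factors can be verified
    with a short calculation.  The function name below is made up for
    this sketch.  The 4x4 tile of the rotated grid holds four G, two
    R and two B samples.

```cpp
#include <cassert>

// Average number of sensor samples per output pixel after the
// interpolated image has been scaled down by `scale` in x and y.
double samplesPerOutputPixel (int    samplesPerTile,
                              int    tileWidth,
                              int    tileHeight,
                              double scale)
{
    assert (samplesPerTile >= 0 && tileWidth > 0 && tileHeight > 0);
    assert (scale > 0);

    // Scaling by `scale` in both directions divides the pixel
    // count by scale * scale.
    double outputPixelsPerTile = (tileWidth * tileHeight) / (scale * scale);
    return samplesPerTile / outputPixelsPerTile;
}
```

    Scaling by sqrt(2) gives 0.5 G, 0.25 R and 0.25 B samples per
    output pixel, the same as the unscaled two-by-two tile (two G,
    one R and one B sample per four pixels).  Scaling by 2 gives
    one G sample per output pixel.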

Integer or Floating-Point?

    Representing raw CFA sensor data with sub-sampled channels and
    a CfaTile attribute would work with either floating-point or
    integer channels.  With floating-point channels, the pixel data
    would probably be scaled such that middle gray falls somewhere
    close to 0.18.  With integer channels, middle gray might be
    represented as a value close to 9% of the maximum, for example,
    1475 for a sensor that outputs 14-bit data with a maximum of
    16383 (effectively mapping the maximum value to 2.0).
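
    The arithmetic behind this example is sketched below.  The
    function names are hypothetical, and the sketch assumes that an
    integer value of 0 corresponds to black; real sensor data may
    carry a non-zero black level that would have to be subtracted
    first.

```cpp
#include <cassert>

// Map an integer sensor value to a linear floating-point value,
// with 0 mapped to 0.0 and maxCode mapped to maxValue.
double codeToLinear (int code, int maxCode, double maxValue)
{
    assert (maxCode > 0 && code >= 0 && code <= maxCode);
    return maxValue * code / maxCode;
}

// Inverse mapping; the result is not rounded.
double linearToCode (double value, int maxCode, double maxValue)
{
    assert (maxCode > 0 && maxValue > 0);
    return value * maxCode / maxValue;
}
```

    With maxCode = 16383 and maxValue = 2.0, linearToCode (0.18, ...)
    is roughly 1474.5, and codeToLinear (1475, ...) is roughly 0.18.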

    The XYZ scale factors of the CfaPixels would compensate for the
    different scale factors of floating-point versus integer pixel
    data.

    Integers would be "more raw" than floating-point numbers; the
    pixels could represent the exact bit patterns produced by the
    analog-to-digital converter in the camera's sensor system.

    16-bit floating-point numbers would introduce a mild form of
    lossy data compression.  With 14-bit sensor output, numbers
    close to the maximum (16383) have a relative quantization step
    of about 0.006%, while the relative quantization step of 16-bit
    floating-point numbers is between about 0.05% and 0.1%, so the
    conversion to floating-point is not lossless.  Since raw integer
    sensor data are nearly linear relative to the number of photons
    captured by the sensor, small
    differences between integer values near the high end of the
    range are not significant for real-world image processing.
    The difference between 15000 and 15001 is completely invisible,
    as is the difference between 15000 and 15020.  Conversion to
    floating-point does not visibly affect image quality, but it does
    result in smaller file sizes because most of the compression
    algorithms in OpenEXR work best with 16-bit floating-point data.
    (PIZ and PXR24 do work reasonably well even with integer pixels.)
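
    The quantization steps quoted above can be reproduced with a
    short calculation.  This sketch does not use the OpenEXR half
    class; it assumes the IEEE 754 binary16 layout with 10 stored
    mantissa bits, handles only values in the normal range, and
    ignores overflow at the top of the range.  The function names
    are made up for this example.

```cpp
#include <cassert>
#include <cmath>

// Spacing between adjacent 16-bit floating-point values near v.
// Valid for 2^-14 <= v <= 65504 (the normal range of binary16).
double halfStep (double v)
{
    assert (v >= 6.103515625e-05 && v <= 65504.0);
    int e = std::ilogb (v);             // floor (log2 (v))
    return std::ldexp (1.0, e - 10);    // 2^(e - 10)
}

// Round v to the nearest representable value, using the current
// rounding mode (round-to-nearest-even by default).
double roundToHalf (double v)
{
    double s = halfStep (v);
    return std::nearbyint (v / s) * s;
}
```

    Near the top of the 14-bit range the spacing is 8, so the
    relative step at 16383 is about 0.05%, compared to about 0.006%
    for the integer data.  15000 and 15001 both round to 15000.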

Proof-of-Concept Implementation

    The attached tar bundle contains C++ source code for an
    implementation of the CfaTile attribute, and for a command-line
    program that converts an RGB image into a simulated OpenEXR raw
    RGB CFA sensor image.  The program can also convert raw CFA sensor
    images back to RGB.

What's Missing?

    The interpolation algorithm in the attached C++ code is a quick
    hack.  It produces rather soft images and it suffers from edge
    artifacts.  A production-ready implementation of the proposed
    raw image representation would need a much better interpolator.

    The proof-of-concept implementation lacks white balancing, flare
    suppression and other basic color correction.  White balancing
    could be achieved by tweaking the XYZ weights in the CfaPixels,
    but additional header attributes are needed to transmit other
    color correction data.  A CTL program would be a compact and
    very general way to represent this information.

    The OpenEXR library should probably contain some form of support
    for raw-to-RGB conversion.  Ideally the RGBA interface would
    transparently perform this conversion during file reading.

    It is unlikely that a purely software-based raw-to-RGB conversion
    would be fast enough to allow reading of OpenEXR raw images at
    high frame rates.  Real-time playback software would probably have
    to upload the raw data into a graphics card and perform conversion
    to RGB in a GPU-based pixel shader, similar to how playexr handles
    luminance/chroma images.

    And of course, camera manufacturers will have to agree to output
    OpenEXR raw files.

exrraw.tar.gz
Description: GNU Zip compressed data

_______________________________________________
Openexr-devel mailing list
Openexr-devel@nongnu.org
http://lists.nongnu.org/mailman/listinfo/openexr-devel
