[
https://issues.apache.org/jira/browse/IMAGING-266?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17200369#comment-17200369
]
Gary Lucas commented on IMAGING-266:
------------------------------------
I researched this feature request and found a mix of good news and bad news.
The good news is that the ImageBuilder API can be used to store the integer
data. I looked into the idea of just storing
the integer data in the BufferedImage output from the file, but it had too much
of an improvisational flavor for my liking. It would also mean performing an
extra transcription of the data. During processing, the ImageBuilder is used as
a temporary container for pixel information while it is being extracted from
the TIFF file and collected for output. So my plan is to just provide an API
element that exposes the ImageBuilder instance to the application. This
addition can be accomplished with minimal changes to the existing code base.
The bad news is that the data reader classes will need an enhancement. The
original authors of Commons Imaging made a design choice which, in retrospect,
was unlucky. The TIFF specification allows data, particularly grayscale data,
to use a variable number of bits to encode an image. The elevation data that
was the inspiration for this feature request is based on 16 bits per sample.
But the Commons Imaging data readers always convert samples to single bytes.
Most of the time, this doesn't matter. In some cases, such as RGB images, the
data is already in the form of one byte per sample (3 samples per pixel).
Even though it is common to think of RGB values as consisting of three bytes
(three "samples", one for each color), there is nothing fundamental about the
specification of 8 bits per sample. For example, the current GOES-R generation
of weather satellites uses 12-bit imaging channels to give better discrimination
of cloud and ground radiance values. Older satellite images frequently used 10
bits. However, to simplify the code for its various photometric interpreters
(the classes that map binary data to pixel colors for rendering images), the
original implementation built in an adjustment that always converts sample
values to bytes before passing them on to the photometric interpreters. If you
look at the DataReaderStrips and DataReaderTiles classes, you'll see methods
called getSamplesAsBytes() that do this operation.
For elevation products, this conversion has the consequence of throwing away
most of the meaningful information in the source data.
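To make the loss concrete, here is a minimal Java sketch. It is not the code
in getSamplesAsBytes(); it assumes, purely for illustration, that a 16-bit
sample is reduced to a byte by keeping its high-order 8 bits. The class and
method names are hypothetical.

```java
/**
 * Illustration only: NOT the Commons Imaging implementation.
 * Assumes a 16-bit sample is narrowed to 8 bits by keeping its high byte.
 */
final class SampleNarrowingSketch {

    /** Keeps the top 8 bits of an unsigned 16-bit sample (assumed rule). */
    static int narrowToByte(int sample16) {
        return (sample16 & 0xFFFF) >>> 8;
    }

    public static void main(String[] args) {
        // Two elevations 200 meters apart, stored as 16-bit integers.
        int valley = 1050;
        int ridge = 1250;

        // Both values fall in the same 256-wide bucket after narrowing,
        // so the 200 meter difference can no longer be recovered.
        System.out.println(narrowToByte(valley)); // prints 4
        System.out.println(narrowToByte(ridge));  // prints 4
    }
}
```

Under this assumed rule, every elevation from 1024 to 1279 meters maps to the
same byte value, which is why a byte-oriented pipeline is a poor fit for
elevation products.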
Anyway, addressing this issue requires one of two approaches. One idea is to
get rid of the sample conversion and upgrade the photometric interpreters to
handle the data correctly. But this change would mean changing multiple
photometric interpreters, some of which (CieLab, LogLuv, YCbCr) are quite
complicated. The alternative is to implement a special block of code and a
special processing rule. This is the approach I propose to implement.
I propose the following: The Commons Imaging API permits an application to pass
in a custom photometric interpreter. The new feature will implement a special
processing rule so that, when the application passes in a custom photometric
interpreter, the data readers will retain the full precision for the samples.
Eight bit samples will stay 8 bits. Four bit samples will stay 4. And the 16
bit samples in the elevation products will stay 16. Presumably, an application
that takes the trouble to supply a custom photometric interpreter will want to
handle the data exactly as it appears in its source TIFF files. And, since the
ability to specify custom photometric interpreters is a relatively new feature
(only a few months old), it is unlikely that this change will interfere with
any existing code.
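The proposed rule could be sketched roughly as follows. This is only an
outline of the idea, not the planned implementation: the class name, the
prepareSamples() method, its parameters, and the shift-based rescaling used
for the default path are all assumptions made for illustration.

```java
/**
 * Illustration only: hypothetical names, not the Commons Imaging API.
 * Shows the proposed rule: a custom photometric interpreter receives
 * samples at their original bit depth; otherwise samples become bytes.
 */
final class PrecisionRuleSketch {

    /**
     * Returns the samples to hand to the photometric interpreter.
     * The default (non-custom) path assumes simple shift-based rescaling.
     */
    static int[] prepareSamples(int[] samples, int bitsPerSample,
                                boolean customInterpreter) {
        if (customInterpreter || bitsPerSample == 8) {
            // Full precision retained: 4 stays 4, 8 stays 8, 16 stays 16.
            return samples.clone();
        }
        int[] result = new int[samples.length];
        for (int i = 0; i < samples.length; i++) {
            result[i] = bitsPerSample > 8
                    ? samples[i] >>> (bitsPerSample - 8) // drop low-order bits
                    : samples[i] << (8 - bitsPerSample); // widen to 8 bits
        }
        return result;
    }

    public static void main(String[] args) {
        int[] elevation = {1050};

        // With a custom interpreter the 16-bit value passes through intact.
        System.out.println(prepareSamples(elevation, 16, true)[0]);  // prints 1050

        // Without one, the value is reduced to a byte.
        System.out.println(prepareSamples(elevation, 16, false)[0]); // prints 4
    }
}
```

The point of keying the rule on the presence of a custom interpreter is that
the built-in interpreters keep receiving bytes exactly as before, so their
behavior does not change.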
Please let me know if you have any suggestions or insights that I may have
missed.
Thanks
> Read integer data from GeoTIFFS
> --------------------------------
>
> Key: IMAGING-266
> URL: https://issues.apache.org/jira/browse/IMAGING-266
> Project: Commons Imaging
> Issue Type: New Feature
> Components: Format: TIFF
> Affects Versions: 1.0-alpha3
> Reporter: Gary Lucas
> Priority: Major
>
> I recently discovered that there is a large amount of digital elevation data
> available in the form of 16-bit integer coded data in GeoTIFF files (TIFF
> files with geographic tags). I propose to enhance the Commons Imaging API to
> read these files. This work will be similar to the work I did for reading
> floating-point raster data under ISSUE-251.
> Available data include the nearly global coverage of one-arc-second
> elevation data produced from the Shuttle Radar Topography Mission (SRTM) and
> other sources. These products give grids of elevation data with a 30 meter
> cell spacing for most of the world's land masses. They are available at NASA
> Earthdata and Japan Space Systems websites, see
> [https://asterweb.jpl.nasa.gov/gdem.asp]
> There is also an ocean bathymetry data set available in this format at
> [http://www.shadedrelief.com/blue-earth/]
> This new feature will continue to expand the usefulness of the Commons
> Imaging API in accessing GeoTIFF products.
> Request for Feedback
> So far, the data products I've found (ASTER and Blue Earth Bathymetry) give
> elevation and ocean depth data in meters recorded as a short integer. I
> haven't found an example of where the 32-bit integer format is used. For
> now, I am planning on only coding the 16-bit integer variation. Does anyone
> know if the 32-bit version is worth supporting? My criterion would be
> whether a significant number of projects use that format (life is too
> short to chase rarely used data formats).
> Currently, one of the code-analysis operations conducted by the Commons
> Imaging build process is coverage by JUnit tests. Lacking any test data for
> the 32-bit case, I am reluctant to include it in the code base because it
> would mean putting uncovered code into the distribution.
> Also, I am wondering about the best design for the access API. The current
> TiffImageParser class has a method called getFloatingPointRasterData() that
> returns an instance of TiffRasterData. TiffRasterData is pretty much
> hard-wired to floating point data. I am thinking of creating a new method
> called getIntegerRasterData() that would return an instance of a new class
> called TiffIntegerRasterData. Does this seem reasonable? I considered trying
> to combine both kinds of results into a unified class and method, but it
> seems more unwieldy than useful.
>
>