[jira] [Commented] (IMAGING-266) Read integer data from GeoTIFFS

Gary Lucas (Jira) Tue, 22 Sep 2020 13:47:12 -0700


    [ 
https://issues.apache.org/jira/browse/IMAGING-266?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17200369#comment-17200369
 ]


Gary Lucas commented on IMAGING-266:
------------------------------------

I researched this feature request and what I've found is basically a mix of 
good news and bad news. The good news is that the ImageBuilder API can be used 
as a way of storing the integer data.  I looked into the idea of just storing 
the integer data in the BufferedImage output from the file, but it had too much 
of an improvisational flavor for my liking.  It would also mean performing an 
extra transcription of the data. During processing, the ImageBuilder is used as 
a temporary container for pixel information while it is being extracted from 
the TIFF file and collected for output. So my plan is to just provide an API 
element that exposes the ImageBuilder instance to the application.  This 
addition can be accomplished with minimal changes to the existing code base.

The bad news is that the data reader classes will need an enhancement. The 
original authors of Commons Imaging made a design choice which, in retrospect, 
was unlucky.  The TIFF specification allows data, particularly grayscale data, 
to use a variable number of bits to encode an image.  The elevation data that 
was the inspiration for this feature request is based on 16 bits per sample. 
But the Commons Imaging data readers always convert samples to single bytes.  
Most of the time, this doesn't matter. In some cases, such as RGB images, the 
data is already in the form of one byte per sample (3 samples per pixel).

Even though it is common to think of RGB values as consisting of three bytes 
(three "samples", one for each color), there is nothing fundamental about the 
specification of 8 bits per sample.  For example, the current GOES-R generation 
of weather satellites uses 12-bit imaging channels to give better discrimination 
of cloud and ground radiance values.  Older satellite images frequently used 10 
bits. However, to simplify the code for its various photometric interpreters 
(the classes that map binary data to pixel colors for rendering images), the 
original implementation built in an adjustment that always converts sample 
values to bytes before passing them on to the photometric interpreters.  If you 
look at the DataReaderStrips and DataReaderTiles classes, you'll see methods 
called getSamplesAsBytes() that do this operation.

For elevation products, this conversion has the consequence of throwing away 
most of the meaningful information in the source data.
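
To make the loss concrete, here is a minimal sketch of what reducing a 16-bit sample to a single byte does to elevation values. The class SampleReductionDemo and the method reduceToByte() are hypothetical names used only for illustration; they are not part of Commons Imaging. The sketch assumes unsigned samples and a reduction that keeps only the most significant 8 bits, which may not match the exact arithmetic in getSamplesAsBytes().

```java
/** Hypothetical demo class; not part of Commons Imaging. */
final class SampleReductionDemo {

    /**
     * Reduces an unsigned sample to 8 bits by keeping its most significant
     * bits. Assumption: this stands in for the library's conversion, whose
     * actual arithmetic may differ. Samples of 8 bits or fewer are returned
     * unchanged in this sketch.
     */
    static int reduceToByte(int sample, int bitsPerSample) {
        if (bitsPerSample <= 8) {
            return sample;
        }
        return sample >>> (bitsPerSample - 8);
    }

    public static void main(String[] args) {
        // Two 16-bit elevations, in meters, that differ by 255 m.
        int low = 1024;
        int high = 1279;
        System.out.println(reduceToByte(low, 16));  // prints 4
        System.out.println(reduceToByte(high, 16)); // prints 4
    }
}
```

Under these assumptions, two elevations that are 255 meters apart collapse to the same byte value, so the original terrain cannot be recovered from the converted samples.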

Anyway, addressing this issue requires one of two approaches.  One idea is to 
get rid of the sample conversion and upgrade the photometric interpreters to 
handle the data correctly.  But this change would mean changing multiple 
photometric interpreters, some of which (CieLab, LogLuv, YCbCr) are quite 
complicated. The alternative is to implement a special block of code and a 
special processing rule. This is the approach I propose to implement.

I propose the following: The Commons Imaging API permits an application to pass 
in a custom photometric interpreter. The new feature will implement a special 
processing rule so that, when the application passes in a custom photometric 
interpreter, the data readers will retain the full precision for the samples. 
Eight-bit samples will stay 8 bits. Four-bit samples will stay 4. And the 16-bit 
samples in the elevation products will stay 16. Presumably, an application 
that takes the trouble to supply a custom photometric interpreter will want to 
handle the data exactly as it appears in its source TIFF files.  And, since the 
ability to specify custom photometric interpreters is a relatively new feature 
(only a few months old), it is unlikely that this change will interfere with 
any existing code.
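
The proposed rule could be sketched roughly as follows. Everything here is a stand-in: PrecisionRuleSketch, prepareSamples(), and the customInterpreter flag are hypothetical names, and the reduction to bytes reuses the assumed "keep the top 8 bits" arithmetic rather than the library's real conversion. The point is only the branch: samples pass through untouched when a custom interpreter is supplied.

```java
/** Hypothetical sketch of the proposed processing rule; not library code. */
final class PrecisionRuleSketch {

    /**
     * Returns the samples to hand to the photometric interpreter.
     *
     * @param raw               unsigned samples as unpacked from the TIFF data
     * @param bitsPerSample     bit depth declared by the TIFF file
     * @param customInterpreter true if the application supplied its own
     *                          photometric interpreter (assumed flag)
     */
    static int[] prepareSamples(int[] raw, int bitsPerSample,
                                boolean customInterpreter) {
        int[] out = raw.clone();
        if (customInterpreter || bitsPerSample <= 8) {
            // Special rule: keep full precision for custom interpreters.
            return out;
        }
        // Default path: reduce to one byte per sample (assumed arithmetic).
        int shift = bitsPerSample - 8;
        for (int i = 0; i < out.length; i++) {
            out[i] = raw[i] >>> shift;
        }
        return out;
    }

    public static void main(String[] args) {
        int[] elevation = {1279};
        System.out.println(prepareSamples(elevation, 16, true)[0]);  // prints 1279
        System.out.println(prepareSamples(elevation, 16, false)[0]); // prints 4
    }
}
```

Because the built-in interpreters never take the first branch, their behavior would be unchanged under this scheme.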

Please let me know if you have any suggestions or insights that I may have 
missed.

Thanks



> Read integer data  from GeoTIFFS
> --------------------------------
>
>                 Key: IMAGING-266
>                 URL: https://issues.apache.org/jira/browse/IMAGING-266
>             Project: Commons Imaging
>          Issue Type: New Feature
>          Components: Format: TIFF
>    Affects Versions: 1.0-alpha3
>            Reporter: Gary Lucas
>            Priority: Major
>
> I recently discovered that there is a large amount of digital elevation data 
> available in the form of 16-bit integer coded data in GeoTIFF files (TIFF 
> files with geographic tags).  I propose to enhance the Commons Imaging API to 
> read these files.  This work will be similar to the work I did for reading 
> floating-point raster data under IMAGING-251.
> Available data include the nearly-global coverage of one-second of arc 
> elevation data produced from the Shuttle Radar Topography Mission (SRTM) and 
> other sources. These products give grids of elevation data with a 30 meter 
> cell spacing for most of the world's land masses. They are available at NASA 
> Earthdata and Japan Space Systems websites, see 
> [https://asterweb.jpl.nasa.gov/gdem.asp|https://asterweb.jpl.nasa.gov/gdem.asp]
>  There is also an ocean bathymetry data set available in this format at 
> [http://www.shadedrelief.com/blue-earth/]
> This new feature will continue to expand the usefulness of the Commons 
> Imaging API in accessing GeoTIFF products.
> Request for Feedback
> So far, the data products I've found (ASTER and Blue Earth Bathymetry) give 
> elevation and ocean depth data in meters recorded as a short integer.  I 
> haven't found an example of where the 32-bit integer format is used.  For 
> now, I am planning on only coding the 16-bit integer variation.  Does anyone 
> know if the 32-bit version is worth supporting?  My criteria for determining 
> that would be based on whether there is a significant number of projects 
> using that format (life is too short to chase rarely used data formats).
> Currently, one of the code-analysis operations conducted by the Commons 
> Imaging build process is coverage by JUnit tests.  Lacking any test data for 
> the 32-bit case, I am reluctant to include it in the code base because it 
> would mean putting uncovered code into the distribution.
> Also, I am wondering about the best design for the access API.  The current 
> TiffImageParser class has a method called getFloatingPointRasterData() that 
> returns an instance of TiffRasterData.  TiffRasterData is pretty much 
> hard-wired to floating point data.  I am thinking of creating a new method 
> called getIntegerRasterData() that would return an instance of a new class 
> called TiffIntegerRasterData. Does this seem reasonable?  I considered trying 
> to combine both kinds of results into a unified class and method, but it 
> seems more unwieldy than useful. 
>  
>  



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
