[
https://issues.apache.org/jira/browse/SANSELAN-78?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13267182#comment-13267182
]
Damjan Jovanovic commented on SANSELAN-78:
------------------------------------------
Well, what we really want here is an interface that allows seeking as well
as I/O on any backend representation (byte[], InputStream or File). No such
interface exists in Java: RandomAccessFile and FileChannel both require
local files, while InputStream doesn't allow seeking.
Ideally we'd have a SeekableInputStream, plus some way to obtain one from a
ByteSource and then keep reusing it.
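To make the idea concrete, here is a rough sketch of what such an abstraction might look like. All of the names below (SeekableInput, ByteArraySeekableInput, FileSeekableInput) are hypothetical and are not part of Sanselan or the JDK; the sketch only covers the byte[] and File backends, since an InputStream backend would additionally need buffering or re-opening to support backward seeks.

```java
import java.io.Closeable;
import java.io.File;
import java.io.IOException;
import java.io.RandomAccessFile;

/** Hypothetical interface: positioned reads over any backend. */
interface SeekableInput extends Closeable {
    void seek(long position) throws IOException;
    int read(byte[] buffer, int offset, int length) throws IOException;
    long length() throws IOException;
}

/** In-memory backend: seeking only moves an index into the array. */
final class ByteArraySeekableInput implements SeekableInput {
    private final byte[] data;
    private int position;

    ByteArraySeekableInput(byte[] data) {
        this.data = data;
    }

    @Override
    public void seek(long newPosition) throws IOException {
        if (newPosition < 0 || newPosition > data.length) {
            throw new IOException("Seek out of range: " + newPosition);
        }
        position = (int) newPosition;
    }

    @Override
    public int read(byte[] buffer, int offset, int length) {
        if (length == 0) {
            return 0;
        }
        if (position >= data.length) {
            return -1; // end of data, same convention as InputStream
        }
        int count = Math.min(length, data.length - position);
        System.arraycopy(data, position, buffer, offset, count);
        position += count;
        return count;
    }

    @Override
    public long length() {
        return data.length;
    }

    @Override
    public void close() {
        // nothing to release for an in-memory source
    }
}

/** File backend: one RandomAccessFile stays open until close(). */
final class FileSeekableInput implements SeekableInput {
    private final RandomAccessFile file;

    FileSeekableInput(File path) throws IOException {
        this.file = new RandomAccessFile(path, "r");
    }

    @Override
    public void seek(long position) throws IOException {
        file.seek(position);
    }

    @Override
    public int read(byte[] buffer, int offset, int length) throws IOException {
        return file.read(buffer, offset, length);
    }

    @Override
    public long length() throws IOException {
        return file.length();
    }

    @Override
    public void close() throws IOException {
        file.close();
    }
}
```

A ByteSource could then hand out one of these objects and the reader would reuse it for every strip or tile, closing it once at the end.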
> Improve speed of random-access-file handling for TIFF format, potentially
> others
> --------------------------------------------------------------------------------
>
> Key: SANSELAN-78
> URL: https://issues.apache.org/jira/browse/SANSELAN-78
> Project: Commons Sanselan
> Issue Type: Improvement
> Components: Format: TIFF
> Reporter: Gary Lucas
>
> Large TIFF files can be organized into chunks (either strips or tiles) so
> that the image can be read a piece-at-a-time. In the Apache Imaging
> implementation, each time one of these pieces is read, the TiffReader uses
> the getBlock() method of the ByteSourceFile class. This class opens the file
> using the Java RandomAccessFile class, seeks to the position of the data in
> the file, reads its content, and closes the file. Although this operation
> may be repeated many times, and thus entails many redundant file opens and
> reads, file caching on modern computers is effective enough that for files
> of less than 5 megabytes it often makes no difference. On larger files,
> however, the cost can be significant.
> This Tracker Item proposes to modify the ByteSourceFile class so that an
> access routine can optionally hold the file open between getBlock() method
> calls. It will accomplish this by adding a new method called
> .setPersistent(boolean). By default, persistence will be set to false and
> the ByteSourceFile class will continue to work just as it always has
> (existing code will not be affected). If persistence is set to true, the
> RandomAccessFile will be held open.
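A minimal sketch of the proposed behaviour follows. PersistentBlockSource is a hypothetical stand-in written for illustration, not the actual ByteSourceFile class; only the setPersistent(boolean) and getBlock() names come from the proposal, and the isFileOpen() helper and the exact signatures are assumptions.

```java
import java.io.File;
import java.io.IOException;
import java.io.RandomAccessFile;

/** Hypothetical stand-in for ByteSourceFile; not the Apache Imaging class. */
final class PersistentBlockSource {
    private final File file;
    private boolean persistent;        // false by default, as in the proposal
    private RandomAccessFile openFile; // non-null only while the file is held open

    PersistentBlockSource(File file) {
        this.file = file;
    }

    /** Turning persistence off closes any file that is being held open. */
    synchronized void setPersistent(boolean persistent) throws IOException {
        this.persistent = persistent;
        if (!persistent && openFile != null) {
            openFile.close();
            openFile = null;
        }
    }

    /** Reads length bytes starting at start, reusing the open file if there is one. */
    synchronized byte[] getBlock(long start, int length) throws IOException {
        RandomAccessFile raf =
                (openFile != null) ? openFile : new RandomAccessFile(file, "r");
        try {
            if (start < 0 || length < 0 || start + length > raf.length()) {
                throw new IOException("Block out of range: " + start + "+" + length);
            }
            byte[] block = new byte[length];
            raf.seek(start);
            raf.readFully(block);
            if (persistent) {
                openFile = raf; // keep the handle for the next call
            }
            return block;
        } finally {
            if (raf != openFile) {
                raf.close(); // non-persistent mode, or a failed first read
            }
        }
    }

    /** Assumed helper for inspection; true while a file handle is held. */
    synchronized boolean isFileOpen() {
        return openFile != null;
    }
}
```

A caller that reads many strips or tiles would call setPersistent(true) before the loop and setPersistent(false) in a finally block afterwards, so the file handle is released promptly even if a read fails.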
> To get some sense of the performance difference, I ran several tests. For
> the sample "ron and andy.tif" file provided with the Apache Imaging package,
> which is under 5 megabytes, the change made little difference. However,
> when I tested with larger files, such as the Apache Imaging sample
> 2560-by-1920 pixel PICT2833.TIF file (a blurry picture of a pretty girl),
> and a 2500-by-2500 pixel file I downloaded from the US Geological Survey
> (USGS), I saw notable differences.
> I also tested on a fast local disk (my PC) and on a network disk. Not
> surprisingly, the network disk showed the biggest change (in order to keep
> the test environment clean, I ran the network test early in the morning when
> the network was lightly used).
> As the tests below show, on the local disk the savings are modest even
> for the largest file. However, when dealing with a network file system,
> the change becomes significant.
> {code}
> File              Dimensions      Size     Disk     Original (ms)  Modified (ms)
> ron and andy.tif  1500-by-1125    4.8 MB   local             25.9           24.8
>                                            network          122.7          117.6
> PICT2833.TIF      2560-by-1920    14.1 MB  local             77.7           61.7
>                                            network          774.1          463.8
> USGS1             2500-by-2500    18.8 MB  local            192.3           94.5
>                                            network         3992.8         1807.1
> USGS2             10000-by-10000  286 MB   local           1930.5         1344.5
>                                            network        26627.6        13402.1
> {code}
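To compare these timings on a common scale, the saving can be expressed as a percentage of the original time or as a speedup factor. The helper below is only an illustrative sketch; TimingComparison and its method names are hypothetical and are not part of Apache Imaging.

```java
/** Hypothetical helper for reading the timing table; not part of Apache Imaging. */
final class TimingComparison {
    private TimingComparison() {
    }

    /** Percentage of the original time that the modified code saves. */
    static double percentReduction(double originalMs, double modifiedMs) {
        if (originalMs <= 0) {
            throw new IllegalArgumentException("originalMs must be positive");
        }
        return 100.0 * (originalMs - modifiedMs) / originalMs;
    }

    /** How many times faster the modified code is than the original. */
    static double speedup(double originalMs, double modifiedMs) {
        if (modifiedMs <= 0) {
            throw new IllegalArgumentException("modifiedMs must be positive");
        }
        return originalMs / modifiedMs;
    }
}
```

For example, TimingComparison.percentReduction(26627.6, 13402.1) for the USGS2 network case comes to roughly 50 percent, while the same calculation for the local "ron and andy.tif" case (25.9 ms against 24.8 ms) stays below 5 percent.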
> One consequence of this change is that if persistence is set to true, the
> file will be held open until the ByteSourceFile goes out of scope and is
> garbage collected. This change will therefore also make sure that the
> TiffReader sets persistence back to false when it is done reading the file,
> in order to expedite the release of file resources.