[
https://issues.apache.org/jira/browse/IMAGING-257?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17092828#comment-17092828
]
Michael Osipov edited comment on IMAGING-257 at 4/26/20, 7:10 PM:
------------------------------------------------------------------
Working on it. Got a few other issues in the queue to take care of first.
In the meantime, I welcome any insights or helpful suggestions from the user
base.
was (Author: gwlucas):
Working on it. Got a few other issues in the queue to take care of first.
In the meantime, I welcome any insights or helpful suggestions from the user
base.
> Investigate speed improvements to LZW decompression
> ---------------------------------------------------
>
> Key: IMAGING-257
> URL: https://issues.apache.org/jira/browse/IMAGING-257
> Project: Commons Imaging
> Issue Type: Improvement
> Components: Format: TIFF
> Reporter: Gary Lucas
> Priority: Minor
>
> In accessing large TIFF files (10812-by-10812 pixels), read times were about
> 11 seconds (with a solid-state disk drive), and I was looking for ways to
> reduce that. I ran the NetBeans profiler and discovered that 87% of the read
> time was spent in the MyLzwDecompressor decompress() method.
> Inspecting MyLzwDecompressor, I saw that it used the Java
> ByteArrayOutputStream, which is kind of famous for being slow. You can find
> lots of examples of classes named FastByteArrayOutputStream on the web,
> including one right here in the Commons Imaging project.
> I tried a number of different experiments using the
> ApacheImagingSpeedAndMemoryTest class (from the examples directory).
> Replacing ByteArrayOutputStream with FastByteArrayOutputStream produced a 4
> percent reduction in run time.
> I then tried using a local array instead of a "byte array" class. That
> improved things to about an 8 percent reduction in time. Finally, I tried a
> few more aggressive changes, reducing the number of conditional tests and
> replacing calls such as stringFromCode(), which wraps the class member
> "table", with direct access. The final result was an 11 percent total
> reduction in time.
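
For readers following along, here is a minimal sketch of the "local array"
idea described above. It is not the code from the experiment: the class name
GrowableByteBuffer and its methods are hypothetical and are not part of
Commons Imaging. The assumption is that part of the cost of
java.io.ByteArrayOutputStream comes from its synchronized write methods, which
a plain array owned by a single decoder does not need.

```java
import java.util.Arrays;

/**
 * Hypothetical helper, not part of Commons Imaging: an unsynchronized,
 * growable byte buffer that could stand in for ByteArrayOutputStream
 * inside a single-threaded decompression loop.
 */
final class GrowableByteBuffer {
    private byte[] data;
    private int size;

    GrowableByteBuffer(int expectedSize) {
        // Pre-size to the expected output length to avoid most regrowth.
        data = new byte[Math.max(16, expectedSize)];
    }

    /** Appends length bytes from src, starting at offset. */
    void write(byte[] src, int offset, int length) {
        if (size + length > data.length) {
            // Grow geometrically so repeated writes stay amortized O(1).
            data = Arrays.copyOf(data, Math.max(data.length * 2, size + length));
        }
        System.arraycopy(src, offset, data, size, length);
        size += length;
    }

    int size() {
        return size;
    }

    /** Returns a copy trimmed to the number of bytes written. */
    byte[] toByteArray() {
        return Arrays.copyOf(data, size);
    }
}
```

When the uncompressed size of a strip or tile is known up front, passing it as
expectedSize should mean the array never has to grow at all.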
> 11 percent isn't all that impressive, but I haven't been able to find anything
> else. Modern compilers are so smart and do such a good job optimizing code
> that it's hard to find "easy wins."
> Anyway, this is a potential area for improvement in the Commons Imaging API.
> Care will be required because there are some features that my test bypassed.
> For example, there's a diagnostic "listener" in the current implementation
> that would have to be supported. Also, I took out a lot of bounds checking,
> and just assumed that the input compressed data would produce correct output.
> In real life, that's not a safe assumption. I would probably try wrapping
> the logic of the decompress method in a try{}catch{} block looking for
> ArrayIndexOutOfBoundsException and have the method re-throw it as an IOException
> (which is what it does now). It will also be challenging to find a way of
> properly testing modifications to this class.
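
A rough sketch of the try/catch idea from the paragraph above, under stated
assumptions: GuardedDecoder, expandUnchecked, and expand are hypothetical
names, and the toy table lookup below only stands in for the real LZW loop. It
shows the shape of the wrapper (no per-byte bounds checks inside the loop, one
catch around it), not the actual MyLzwDecompressor logic.

```java
import java.io.IOException;

/** Hypothetical illustration; not the Commons Imaging implementation. */
final class GuardedDecoder {

    /**
     * Stand-in for a decode loop with explicit bounds checks removed.
     * A corrupt code, or more output than expected, surfaces as an
     * ArrayIndexOutOfBoundsException from the array accesses themselves.
     */
    static byte[] expandUnchecked(int[] codes, byte[][] table, int expectedLength) {
        byte[] out = new byte[expectedLength];
        int pos = 0;
        for (int code : codes) {
            byte[] entry = table[code];   // throws if code is out of range
            for (byte b : entry) {
                out[pos++] = b;           // throws if output overruns
            }
        }
        return out;
    }

    /** Converts the runtime failure into the checked IOException callers expect. */
    static byte[] expand(int[] codes, byte[][] table, int expectedLength)
            throws IOException {
        try {
            return expandUnchecked(codes, table, expectedLength);
        } catch (ArrayIndexOutOfBoundsException e) {
            throw new IOException("Corrupt or truncated LZW data", e);
        }
    }
}
```

One caveat with this pattern: it only catches corruption that happens to
index outside an array. Bad input that stays in range would still decode
silently to wrong output, so it is a safety net rather than validation.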
> I will be looking at this, but probably will not move on it until I get
> feedback from the community. I don't view this change as unduly risky
> provided that proper care is taken in making the modifications. But the gain
> in performance is small enough that I'm not sure it's worth it.
> I will also take a look at Commons Compression to see what they do.
> If you have any thoughts on this matter, please let me know.