[ https://issues.apache.org/jira/browse/COMPRESS-285?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14092467#comment-14092467 ]
Stefan Bodewig commented on COMPRESS-285:
-----------------------------------------
Thanks Sebb, I think both of your suggestions are good ideas, and I will see to
implementing them in the coming week. In particular, you will only pay for the
failed XZ check if you are really trying to uncompress XZ streams.
The additional constructor won't help Wojciech, since he is using Compress behind
Tika. Tika would need to be adapted to the new constructor and would in the end
have to implement its own logic, which would also need to take OSGi contexts into
account. I think it might be a good idea to add an explicit flag for whether the
result is cacheable and to make that flag default to true unless BundleEvent can
be loaded. Wojciech would then need to set the flag explicitly.
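
The flag idea above could look roughly like the following. This is only a
minimal sketch under stated assumptions, not the code that will go into
Compress: the class name XZAvailabilitySketch, its method names, the enum and
the probeCount counter are all hypothetical, and probing for
org.tukaani.xz.XZInputStream merely stands in for whatever check XZUtils
actually performs.

```java
import java.util.concurrent.atomic.AtomicInteger;

/** Hypothetical sketch of a cacheable availability check; not the real XZUtils. */
final class XZAvailabilitySketch {

    /** Cached state of the probe. */
    enum CachedAvailability { DONT_CACHE, CACHED_AVAILABLE, CACHED_UNAVAILABLE }

    // Assumption: this class is present whenever XZ support is on the classpath.
    private static final String XZ_PROBE_CLASS = "org.tukaani.xz.XZInputStream";
    // Assumption: this class being loadable hints at an OSGi environment.
    private static final String OSGI_PROBE_CLASS = "org.osgi.framework.BundleEvent";

    // Counts the expensive probes; only here so the effect of caching is observable.
    static final AtomicInteger probeCount = new AtomicInteger();

    private static volatile CachedAvailability cached;

    static {
        // Cache by default unless we appear to be running inside OSGi.
        setCacheAvailability(!canLoad(OSGI_PROBE_CLASS));
    }

    private XZAvailabilitySketch() { }

    /** Returns the cached answer if caching is enabled, otherwise probes again. */
    static boolean isXZAvailable() {
        CachedAvailability state = cached;
        if (state != CachedAvailability.DONT_CACHE) {
            return state == CachedAvailability.CACHED_AVAILABLE;
        }
        return probe();
    }

    /** The explicit flag: switches caching on or off. */
    static void setCacheAvailability(boolean doCache) {
        if (!doCache) {
            cached = CachedAvailability.DONT_CACHE;
        } else if (cached == null || cached == CachedAvailability.DONT_CACHE) {
            // Probe once and remember the result.
            cached = probe() ? CachedAvailability.CACHED_AVAILABLE
                             : CachedAvailability.CACHED_UNAVAILABLE;
        }
    }

    // The expensive part: a class lookup that fails when the library is missing.
    private static boolean probe() {
        probeCount.incrementAndGet();
        return canLoad(XZ_PROBE_CLASS);
    }

    private static boolean canLoad(String className) {
        try {
            Class.forName(className, false, XZAvailabilitySketch.class.getClassLoader());
            return true;
        } catch (ClassNotFoundException | LinkageError e) {
            return false;
        }
    }
}
```

With caching enabled, repeated calls to isXZAvailable() return the remembered
answer without another class lookup. Code running in an OSGi container, where
the XZ bundle may come and go, would keep the uncached behaviour unless it
calls setCacheAvailability(true) itself.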
> checking of availability of XZ compression is expensive - result should be
> reused
> ---------------------------------------------------------------------------------
>
> Key: COMPRESS-285
> URL: https://issues.apache.org/jira/browse/COMPRESS-285
> Project: Commons Compress
> Issue Type: Improvement
> Components: Compressors
> Affects Versions: 1.5, 1.6, 1.7, 1.8
> Environment: linux 64-bit, java 7, glassfish, solr, tika
> Reporter: Wojciech Łozowicki
> Priority: Minor
> Labels: performance
>
> I use Solr with Apache Tika for indexing documents. Tika uses
> commons-compress to handle compressed files. Using a sampler (jvisualvm), I have
> seen that quite a lot of time (5-7%) during my tests is spent in
> XZUtils.isXZCompressionAvailable because XZ compression is unavailable (I
> guess that on each call the classloaders spend some time looking for the
> missing classes and then throw NoClassDefFoundError).
> I think the result of the first check should be stored and reused.
> Here is the stack trace (just to show how Tika uses commons-compress):
> org.apache.commons.compress.compressors.xz.XZUtils.isXZCompressionAvailable(XZUtils.java:52)
> at org.apache.commons.compress.compressors.CompressorStreamFactory.createCompressorInputStream(CompressorStreamFactory.java:140)
> at org.apache.tika.parser.pkg.ZipContainerDetector.detectCompressorFormat(ZipContainerDetector.java:95)
> at org.apache.tika.parser.pkg.ZipContainerDetector.detect(ZipContainerDetector.java:81)
> at org.apache.tika.detect.CompositeDetector.detect(CompositeDetector.java:61)