[
https://issues.apache.org/jira/browse/COMPRESS-285?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14054810#comment-14054810
]
Wojciech Łozowicki commented on COMPRESS-285:
---------------------------------------------
I can see three solutions:
- The first is to detect whether we are in an OSGi context by checking whether
some base OSGi-specific class (e.g. org.osgi.framework.BundleEvent) is
available (just as it is done for the XZ library, except that the availability
of such a class should not change). If we are unable to load the class, it
means that we can cache the result. This approach requires OSGi classes to be
available at compile time. I am also not sure it would work for me, as I
recently noticed that GlassFish runs on top of OSGi (Apache Felix).
- The second is to explicitly enable caching of the result at class (XZUtils)
load time. The setting could be read from somewhere (I do not know exactly
where, as there does not seem to be any configuration file for
commons-compress).
- The third is like the second, but the caching is switched on manually (e.g.
from some Application- or ServletContextListener). This is a poor man's
solution, but as it is simple and stupid there should be no more problems in
this area. In this case caching could be used in an OSGi context as well. Below
you will find the suggested change for this case (the method
isXZCompressionAvailableInternal is the current version of
isXZCompressionAvailable).
// requires: import java.util.concurrent.atomic.AtomicBoolean;

// Should the result of the XZ availability check be cached?
private static final AtomicBoolean cacheXZAvailability =
        new AtomicBoolean(false);
// Cached result of the XZ availability check; null until first evaluated.
private static volatile AtomicBoolean evaluatedXZAvailability;

public static void setCacheXZAvailability(boolean newValue) {
    // Toggle the caching flag itself (not the cached value).
    cacheXZAvailability.set(newValue);
    // When enabling, it is better to re-evaluate the value to be cached;
    // when disabling, we do not need the cached value anymore.
    evaluatedXZAvailability = null;
}

public static boolean isXZCompressionAvailable() {
    boolean result;
    if (cacheXZAvailability.get()) {
        // Read the field once, as it may be reset concurrently.
        AtomicBoolean cached = evaluatedXZAvailability;
        if (cached == null) {
            result = isXZCompressionAvailableInternal();
            evaluatedXZAvailability = new AtomicBoolean(result);
        } else {
            result = cached.get();
        }
    } else {
        result = isXZCompressionAvailableInternal();
    }
    return result;
}
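
For comparison, the first two options could be combined roughly as follows.
This is only a sketch under stated assumptions, not the commons-compress
implementation: the class name AvailabilityCacheSketch, the system property
example.xz.cacheAvailability and the probeCount counter are hypothetical, and
probe() merely stands in for isXZCompressionAvailableInternal by looking for
org.tukaani.xz.XZInputStream (assumed to be the XZ for Java class). Looking up
the OSGi class by name with Class.forName should avoid a compile-time
dependency on OSGi.

```java
import java.util.concurrent.atomic.AtomicInteger;

public class AvailabilityCacheSketch {

    /** Hypothetical system property name; not part of commons-compress. */
    static final String CACHE_PROPERTY = "example.xz.cacheAvailability";

    /** Counts how often the expensive probe runs (illustration only). */
    static final AtomicInteger probeCount = new AtomicInteger();

    /** Decided once at class-load time, as in the first two options. */
    private static final boolean CACHE_ENABLED = decideCaching();

    /** Cached probe result; null until the first evaluation. */
    private static volatile Boolean cachedResult;

    /** Option 2 first (explicit setting), then option 1 (OSGi detection). */
    static boolean decideCaching() {
        String configured = System.getProperty(CACHE_PROPERTY);
        if (configured != null) {
            return Boolean.parseBoolean(configured);
        }
        // Outside OSGi the set of visible classes is assumed to be stable.
        return !isClassPresent("org.osgi.framework.BundleEvent");
    }

    /** Looks a class up by name without initializing it. */
    static boolean isClassPresent(String className) {
        try {
            Class.forName(className, false,
                    AvailabilityCacheSketch.class.getClassLoader());
            return true;
        } catch (ClassNotFoundException | LinkageError e) {
            return false;
        }
    }

    /** Stand-in for isXZCompressionAvailableInternal. */
    static boolean probe() {
        probeCount.incrementAndGet();
        return isClassPresent("org.tukaani.xz.XZInputStream");
    }

    public static boolean isAvailable() {
        if (!CACHE_ENABLED) {
            return probe();
        }
        Boolean cached = cachedResult;
        if (cached == null) {
            cached = probe();
            cachedResult = cached;
        }
        return cached;
    }

    public static void main(String[] args) {
        boolean first = isAvailable();
        isAvailable();
        System.out.println("caching enabled: " + CACHE_ENABLED);
        System.out.println("available: " + first
                + ", probes run: " + probeCount.get());
    }
}
```

With caching enabled, the second call should reuse the stored result instead
of probing again. The drawback noted for the first option still applies: a
container that itself runs on OSGi would be detected as OSGi, so caching would
stay off unless the property is set explicitly.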
> checking of availability of XZ compression is expensive - result should be
> reused
> ---------------------------------------------------------------------------------
>
> Key: COMPRESS-285
> URL: https://issues.apache.org/jira/browse/COMPRESS-285
> Project: Commons Compress
> Issue Type: Improvement
> Components: Compressors
> Affects Versions: 1.5, 1.6, 1.7, 1.8
> Environment: linux 64-bit, java 7, glassfish, solr, tika
> Reporter: Wojciech Łozowicki
> Priority: Minor
> Labels: performance
>
> I use Solr with Apache Tika for indexing documents. Tika uses
> commons-compress to handle compressed files. Using a sampler (jvisualvm) I
> have seen that quite a lot of time (5-7%) during my tests is spent in
> XZUtils.isXZCompressionAvailable because XZ compression is unavailable (I
> guess that each time the classloaders spend some time looking for the
> unavailable classes and then throw NoClassDefFoundError).
> I think the result of the first check should be stored and reused.
> Here is the stacktrace (just to show the way tika is using commons-compress):
> org.apache.commons.compress.compressors.xz.XZUtils.isXZCompressionAvailable(XZUtils.java:52)
> at org.apache.commons.compress.compressors.CompressorStreamFactory.createCompressorInputStream(CompressorStreamFactory.java:140)
> at org.apache.tika.parser.pkg.ZipContainerDetector.detectCompressorFormat(ZipContainerDetector.java:95)
> at org.apache.tika.parser.pkg.ZipContainerDetector.detect(ZipContainerDetector.java:81)
> at org.apache.tika.detect.CompositeDetector.detect(CompositeDetector.java:61)
--
This message was sent by Atlassian JIRA
(v6.2#6252)