apurtell commented on a change in pull request #3748:
URL: https://github.com/apache/hbase/pull/3748#discussion_r728433970
##########
File path:
hbase-compression/hbase-compression-zstd/src/main/java/org/apache/hadoop/hbase/io/compress/zstd/ZstdCodec.java
##########
@@ -123,4 +137,42 @@ static int getBufferSize(Configuration conf) {
return size > 0 ? size : 256 * 1024; // Don't change this default
}
+ static LoadingCache<Configuration,byte[]> CACHE = CacheBuilder.newBuilder()
Review comment:
This is definitely a concern.
In the latest version of the patch I override hashCode in
CompoundConfiguration so we are doing something better than object identity
when caching the dictionaries for the store writer case. Computing the
hashCode is fairly expensive given how CompoundConfiguration works, but at
least we do not do it often, and not in performance-critical code. Once a
compressor or decompressor is created it is reused for the lifetime of the
reader or writer.
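To make the identity-versus-content distinction concrete, here is a minimal JDK-only sketch. `LayeredConf` is a hypothetical stand-in for a layered configuration and is not the actual CompoundConfiguration code from the patch; it only illustrates why a content-based hashCode lets two equivalent configurations share one cache entry, and why computing it means walking every entry.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for a layered configuration (NOT HBase's CompoundConfiguration).
final class LayeredConf {
  private final List<Map<String, String>> layers;

  LayeredConf(List<Map<String, String>> layers) {
    this.layers = layers;
  }

  // Merge all layers into one view; later layers override earlier ones.
  private Map<String, String> flatten() {
    Map<String, String> merged = new HashMap<>();
    for (Map<String, String> layer : layers) {
      merged.putAll(layer);
    }
    return merged;
  }

  @Override
  public boolean equals(Object o) {
    return o instanceof LayeredConf && flatten().equals(((LayeredConf) o).flatten());
  }

  @Override
  public int hashCode() {
    // Touches every entry of every layer, which is why this is costly.
    return flatten().hashCode();
  }
}

final class IdentityVsContentDemo {
  // Returns how many distinct cache entries the two keys produce.
  static int distinctKeys(Object a, Object b) {
    Map<Object, byte[]> cache = new HashMap<>();
    cache.put(a, new byte[0]);
    cache.put(b, new byte[0]);
    return cache.size();
  }
}
```

With content-based equality, two separately constructed but equivalent `LayeredConf` objects collapse to a single key; plain objects that rely on `Object.hashCode` do not.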
In every other case we are still keying on object identity. That is not the
worst outcome: the cache is capped at 100 entries and also expires entries
that have not been used for one hour. (Those parameters can be adjusted to
your taste.)
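For readers who want to see what a size cap plus idle expiry does to identity-keyed entries, here is a rough JDK-only sketch. It is not the Guava `CacheBuilder` cache used in the patch; `BoundedExpiringCache`, its constructor parameters, and the injectable clock are all assumptions made for illustration.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.function.LongSupplier;

// Hypothetical LRU cache with a size cap and expire-after-access semantics.
final class BoundedExpiringCache<K, V> {
  private static final class Slot<V> {
    final V value;
    long lastAccessNanos;

    Slot(V value, long now) {
      this.value = value;
      this.lastAccessNanos = now;
    }
  }

  private final long idleNanos;
  private final LongSupplier clock;
  private final LinkedHashMap<K, Slot<V>> map;

  BoundedExpiringCache(int maxEntries, long idleNanos, LongSupplier clock) {
    this.idleNanos = idleNanos;
    this.clock = clock;
    // accessOrder=true keeps the least recently accessed entry first.
    this.map = new LinkedHashMap<K, Slot<V>>(16, 0.75f, true) {
      @Override
      protected boolean removeEldestEntry(Map.Entry<K, Slot<V>> eldest) {
        return size() > maxEntries;
      }
    };
  }

  synchronized V get(K key) {
    Slot<V> slot = map.get(key);
    if (slot == null) {
      return null;
    }
    long now = clock.getAsLong();
    if (now - slot.lastAccessNanos >= idleNanos) {
      map.remove(key); // idle for too long: drop it
      return null;
    }
    slot.lastAccessNanos = now;
    return slot.value;
  }

  synchronized void put(K key, V value) {
    map.put(key, new Slot<>(value, clock.getAsLong()));
  }

  synchronized int size() {
    return map.size();
  }
}
```

Even if identity keys produce duplicate entries, the cap bounds how many dictionaries stay resident and the idle timeout eventually reclaims the stale ones.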
Let me try your suggestion. I was thinking we could avoid doing two lookups
into the Configuration (one to get the boolean, and then one to get the path
for the key), but that hashCode calculation is pretty expensive. Getting the
path from the configuration object and using that as the key would be cheaper.
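A minimal sketch of the path-keyed approach, under stated assumptions: a plain `Map<String, String>` stands in for the Configuration, `example.dictionary.path` is a made-up property name rather than a real HBase key, and the loader function is a placeholder for whatever actually reads the dictionary bytes.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Function;

// Hypothetical dictionary cache keyed by the dictionary path string.
final class PathKeyedDictionaryCache {
  // Made-up property name, for illustration only.
  static final String DICTIONARY_PATH_KEY = "example.dictionary.path";

  private final Map<String, byte[]> cache = new ConcurrentHashMap<>();
  private final Function<String, byte[]> loader;
  // Counts loader invocations so the sharing behavior is observable.
  final AtomicInteger loads = new AtomicInteger();

  PathKeyedDictionaryCache(Function<String, byte[]> loader) {
    this.loader = loader;
  }

  // Returns the cached dictionary for this configuration, or null if none is configured.
  byte[] get(Map<String, String> conf) {
    String path = conf.get(DICTIONARY_PATH_KEY);
    if (path == null || path.isEmpty()) {
      return null;
    }
    // String keys hash cheaply and compare by value, so distinct
    // configuration objects naming the same path share one entry.
    return cache.computeIfAbsent(path, p -> {
      loads.incrementAndGet();
      return loader.apply(p);
    });
  }
}
```

Here the key costs one configuration lookup plus a String hash, and two different configuration instances that point at the same dictionary trigger a single load.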