Apache9 commented on a change in pull request #3748:
URL: https://github.com/apache/hbase/pull/3748#discussion_r727647509
##########
File path:
hbase-compression/hbase-compression-zstd/src/main/java/org/apache/hadoop/hbase/io/compress/zstd/ZstdCodec.java
##########
@@ -123,4 +137,42 @@ static int getBufferSize(Configuration conf) {
return size > 0 ? size : 256 * 1024; // Don't change this default
}
+ static LoadingCache<Configuration,byte[]> CACHE = CacheBuilder.newBuilder()
Review comment:
Using Configuration as the key makes me a bit nervous. After checking the
code, Configuration does not define hashCode or equals methods, so the cache
will behave like an IdentityHashMap...
Is it possible to use the file name as the map key here instead? I suppose
different tables could use the same dict.
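To illustrate the suggestion, here is a minimal sketch of a cache keyed by the dictionary file name, so that two tables pointing at the same file share one entry. It is not the PR's code: `DictionaryCache` is a hypothetical name, and it uses a plain `ConcurrentHashMap` and `java.nio.file` in place of the Guava `LoadingCache` and Hadoop `FileSystem` from the patch, only to keep the example self-contained. It also has no size bound or expiry.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

// Hypothetical helper (not part of the PR): caches dictionary bytes by file name.
final class DictionaryCache {
  // String keys compare by value, so equal file names map to the same entry.
  private static final ConcurrentMap<String, byte[]> CACHE = new ConcurrentHashMap<>();

  private DictionaryCache() {
  }

  static byte[] get(String dictionaryFile) {
    if (dictionaryFile == null) {
      throw new IllegalArgumentException("dictionary file is not set");
    }
    // Loads the file only on the first request for this name.
    return CACHE.computeIfAbsent(dictionaryFile, DictionaryCache::load);
  }

  private static byte[] load(String dictionaryFile) {
    try {
      // Assumption: a local file read stands in for the Hadoop FileSystem read.
      return Files.readAllBytes(Paths.get(dictionaryFile));
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
  }
}
```

With this shape, the caller would pass `conf.get(ZSTD_DICTIONARY_FILE_KEY)` as the key rather than the Configuration object itself.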
##########
File path:
hbase-compression/hbase-compression-zstd/src/main/java/org/apache/hadoop/hbase/io/compress/zstd/ZstdCodec.java
##########
@@ -123,4 +137,42 @@ static int getBufferSize(Configuration conf) {
return size > 0 ? size : 256 * 1024; // Don't change this default
}
+ static LoadingCache<Configuration,byte[]> CACHE = CacheBuilder.newBuilder()
+ .maximumSize(100)
+ .expireAfterAccess(1, TimeUnit.HOURS)
+ .build(
+ new CacheLoader<Configuration,byte[]>() {
+ public byte[] load(Configuration conf) throws Exception {
+ final String s = conf.get(ZSTD_DICTIONARY_FILE_KEY);
+ if (s == null) {
+ throw new IllegalArgumentException(ZSTD_DICTIONARY_FILE_KEY + " is not set");
+ }
+ final Path p = new Path(s);
+ final ByteArrayOutputStream baos = new ByteArrayOutputStream();
+ final byte[] buffer = new byte[8192];
+ try (final FSDataInputStream in = FileSystem.get(p.toUri(), conf).open(p)) {
Review comment:
Do we need to limit the max dict size here? If a user creates a table
with a very large dict file, it could bring down the whole cluster if we do
not truncate here.
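A minimal sketch of one way to bound the read, assuming a hypothetical `readAtMost` helper and a caller-supplied limit (neither is in the PR). It keeps the same 8192-byte buffer loop but stops as soon as the total would exceed the limit. Note that it rejects an oversized file instead of truncating it, which is my assumption about the safer behavior, since a truncated dictionary would no longer match the file's contents.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;

// Hypothetical helper (not part of the PR): reads a stream fully, up to a size limit.
final class BoundedReader {
  private BoundedReader() {
  }

  static byte[] readAtMost(InputStream in, int maxBytes) throws IOException {
    final ByteArrayOutputStream baos = new ByteArrayOutputStream();
    final byte[] buffer = new byte[8192];
    int n;
    while ((n = in.read(buffer)) != -1) {
      // Check before buffering so memory use never grows past the limit.
      if ((long) baos.size() + n > maxBytes) {
        throw new IOException("Dictionary exceeds limit of " + maxBytes + " bytes");
      }
      baos.write(buffer, 0, n);
    }
    return baos.toByteArray();
  }
}
```

In the loader, the `FSDataInputStream` opened in the try-with-resources block could be passed to such a helper, with the limit taken from a constant or a configuration key.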