Github user keith-turner commented on a diff in the pull request:
https://github.com/apache/accumulo/pull/106#discussion_r65714941
--- Diff: core/src/main/java/org/apache/accumulo/core/file/rfile/bcfile/Compression.java ---
@@ -86,12 +86,22 @@ public void flush() throws IOException {
public static final String COMPRESSION_NONE = "none";
/**
- * Compression algorithms.
+ * Compression algorithms. There is a static initializer, below the values defined in the enumeration, that calls the initializer of all defined codecs within
+ * the Algorithm enum. This promotes a model of the following call graph of initialization by the static initializer, followed by calls to getCodec() and
+ * createCompressionStream/DecompressionStream. In some cases, the compression and decompression call methods will include a different buffer size for the
+ * stream. Note that if the compressed buffer size requested in these calls is zero, we will not set the buffer size for that algorithm. Instead, we will use
+ * the default within the codec.
+ *
+ * There is a Guava cache defined within Algorithm that allows us to cache Codecs for re-use. Since they are immutable, there is no concern for using them
+ * concurrently; however, the Guava cache exists to ensure a maximal size of the cache and efficient and concurrent read/write access to the cache itself.
--- End diff --
@phrocker, a few days ago you told me offline that the Codecs maintain an
internal reference to a Configuration object. This is why the old code would
synchronize, change the config, and then create the stream. When you explained
this, everything clicked for me. You did not find that without some spelunking,
so it might be useful to work this insight into the comments.
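
To make the point concrete, here is a minimal, self-contained sketch of the two patterns discussed above. It is an illustration only: `Config` and `Codec` are hypothetical stand-ins, not the Hadoop or Accumulo classes, the key name and default size are made up, and a plain `ConcurrentHashMap` stands in for the Guava cache mentioned in the diff (so it has no maximum size). It assumes only what the comment states, namely that a codec keeps a reference to its configuration and reads the buffer size from it when a stream is created.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

public class SharedConfigCodecSketch {

  /** Hypothetical mutable configuration (NOT Hadoop's Configuration). */
  static final class Config {
    private final Map<String,Integer> values = new HashMap<>();

    void setInt(String key, int value) {
      values.put(key, value);
    }

    int getInt(String key, int defaultValue) {
      return values.getOrDefault(key, defaultValue);
    }
  }

  /** Hypothetical codec that holds on to the Config it was created with. */
  static final class Codec {
    static final String BUFFER_SIZE_KEY = "example.buffer.size"; // made-up key
    static final int DEFAULT_BUFFER_SIZE = 4096; // made-up default

    private final Config conf;

    Codec(Config conf) {
      this.conf = conf;
    }

    /** Stands in for stream creation: the buffer size is read from the held Config. */
    int createStreamBufferSize() {
      return conf.getInt(BUFFER_SIZE_KEY, DEFAULT_BUFFER_SIZE);
    }
  }

  // Pattern 1: one codec shared by all callers, so its Config is shared too.
  private static final Config SHARED_CONF = new Config();
  private static final Codec SHARED_CODEC = new Codec(SHARED_CONF);

  /** Lock, change the shared config, then create the stream while still holding the lock. */
  static int createStreamWithSharedCodec(int bufferSize) {
    synchronized (SHARED_CODEC) {
      SHARED_CONF.setInt(Codec.BUFFER_SIZE_KEY, bufferSize);
      return SHARED_CODEC.createStreamBufferSize();
    }
  }

  // Pattern 2: one codec per buffer size, each with its own Config that is
  // never modified after the codec is built, so no lock is needed on use.
  private static final ConcurrentHashMap<Integer,Codec> CODECS_BY_SIZE = new ConcurrentHashMap<>();

  static Codec codecFor(int bufferSize) {
    return CODECS_BY_SIZE.computeIfAbsent(bufferSize, size -> {
      Config conf = new Config();
      if (size > 0) {
        conf.setInt(Codec.BUFFER_SIZE_KEY, size); // zero means "keep the codec default"
      }
      return new Codec(conf);
    });
  }

  public static void main(String[] args) {
    System.out.println("shared codec, 8192 requested: " + createStreamWithSharedCodec(8192));
    System.out.println("cached codec, 65536 requested: " + codecFor(65536).createStreamBufferSize());
    System.out.println("cached codec, 0 requested: " + codecFor(0).createStreamBufferSize());
  }
}
```

In the first pattern the lock is what keeps one caller's buffer size from being overwritten by another caller between the config change and the stream creation. In the second, the cached codecs can be used concurrently because nothing mutates their configuration after construction.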
---