ctubbsii commented on a change in pull request #2221:
URL: https://github.com/apache/accumulo/pull/2221#discussion_r682810271



##########
File path: 
core/src/main/java/org/apache/accumulo/core/file/rfile/bcfile/Compression.java
##########
@@ -250,33 +277,71 @@ public void initializeDefaultCodec() {
        */
       @Override
       protected CompressionCodec createNewCodec(final int bufferSize) {
-        DefaultCodec myCodec = new DefaultCodec();
-        Configuration myConf = new Configuration(conf);
-        // only use the buffersize if > 0, otherwise we'll use
-        // the default defined within the codec
-        if (bufferSize > 0)
-          myConf.setInt(BUFFER_SIZE_OPT, bufferSize);
-        myCodec.setConf(myConf);
-        return myCodec;
+        if (USE_QAT) {
+          String extClazz =
+              (conf.get(CONF_QAT_CLASS) == null ? 
System.getProperty(CONF_QAT_CLASS) : null);
+          String clazz = (extClazz != null) ? extClazz : DEFAULT_QAT_CLASS;
+          try {
+            LOG.info("Trying to load qat codec class: " + clazz);
+
+            Configuration myConf = new Configuration(conf);
+            // only use the buffersize if > 0, otherwise we'll use
+            // the default defined within the codec
+            if (bufferSize > 0)
+              myConf.setInt(QAT_BUFFER_SIZE_OPT, bufferSize);
+
+            CompressionCodec c =
+                (CompressionCodec) 
ReflectionUtils.newInstance(Class.forName(clazz), myConf);

Review comment:
       Is `Class.forName` the right way to load this class for the VFS 
Classloader in Accumulo in 1.x? What about 2.1 with the new classloader changes 
there?

##########
File path: 
core/src/main/java/org/apache/accumulo/core/file/rfile/bcfile/Compression.java
##########
@@ -219,7 +220,21 @@ public OutputStream createCompressionStream(OutputStream 
downStream, Compressor
 
     GZ(COMPRESSION_GZ) {
 
-      private transient DefaultCodec codec = null;
+      private static final String USE_QAT_PROPERTY = "use.qat";

Review comment:
       `use.qat` is succinct, but it might be better to have a more explicit 
name in the `accumulo.` namespace because this is for affecting Accumulo 
behavior, something like:
   
   ```suggestion
         private static final String USE_QAT_PROPERTY = 
"accumulo.rfile.compression.quickassist.enabled";
   ```

##########
File path: 
core/src/main/java/org/apache/accumulo/core/file/rfile/bcfile/Compression.java
##########
@@ -231,14 +246,26 @@ public OutputStream createCompressionStream(OutputStream 
downStream, Compressor
        */
       private static final int DEFAULT_BUFFER_SIZE = 32 * 1024;
 
+      private final Boolean USE_QAT =
+          Boolean.valueOf(System.getProperty(USE_QAT_PROPERTY, "false"));

Review comment:
       Since this will return something non-null, you can assign it to a 
primitive `boolean`. Also, to support `-Duse.qat` in addition to 
`-Duse.qat=true`, this logic will need to be adjusted a bit. Could see how 
Maven does it, because it supports similar flags like `-DskipTests` where you 
don't have to specify `-DskipTests=true`.

##########
File path: 
core/src/main/java/org/apache/accumulo/core/file/rfile/bcfile/Compression.java
##########
@@ -250,33 +277,71 @@ public void initializeDefaultCodec() {
        */
       @Override
       protected CompressionCodec createNewCodec(final int bufferSize) {
-        DefaultCodec myCodec = new DefaultCodec();
-        Configuration myConf = new Configuration(conf);
-        // only use the buffersize if > 0, otherwise we'll use
-        // the default defined within the codec
-        if (bufferSize > 0)
-          myConf.setInt(BUFFER_SIZE_OPT, bufferSize);
-        myCodec.setConf(myConf);
-        return myCodec;
+        if (USE_QAT) {
+          String extClazz =
+              (conf.get(CONF_QAT_CLASS) == null ? 
System.getProperty(CONF_QAT_CLASS) : null);
+          String clazz = (extClazz != null) ? extClazz : DEFAULT_QAT_CLASS;
+          try {
+            LOG.info("Trying to load qat codec class: " + clazz);
+
+            Configuration myConf = new Configuration(conf);
+            // only use the buffersize if > 0, otherwise we'll use
+            // the default defined within the codec
+            if (bufferSize > 0)
+              myConf.setInt(QAT_BUFFER_SIZE_OPT, bufferSize);
+
+            CompressionCodec c =
+                (CompressionCodec) 
ReflectionUtils.newInstance(Class.forName(clazz), myConf);
+            if (c instanceof Configurable) {
+              ((Configurable) c).setConf(myConf);
+            }
+            return c;
+          } catch (ClassNotFoundException e) {
+            LOG.error("Unable to create QAT codec", e);
+          }
+          return null;

Review comment:
       If this is null, should we fall back to the default impl? This seems 
like it would result in NPEs and break things in weird ways. The QuickAssist 
impl should probably be an optional optimization that works when available, but 
is ignored when not available.

##########
File path: 
core/src/main/java/org/apache/accumulo/core/file/rfile/bcfile/Compression.java
##########
@@ -250,33 +277,71 @@ public void initializeDefaultCodec() {
        */
       @Override
       protected CompressionCodec createNewCodec(final int bufferSize) {
-        DefaultCodec myCodec = new DefaultCodec();
-        Configuration myConf = new Configuration(conf);
-        // only use the buffersize if > 0, otherwise we'll use
-        // the default defined within the codec
-        if (bufferSize > 0)
-          myConf.setInt(BUFFER_SIZE_OPT, bufferSize);
-        myCodec.setConf(myConf);
-        return myCodec;
+        if (USE_QAT) {
+          String extClazz =
+              (conf.get(CONF_QAT_CLASS) == null ? 
System.getProperty(CONF_QAT_CLASS) : null);
+          String clazz = (extClazz != null) ? extClazz : DEFAULT_QAT_CLASS;

Review comment:
       I'm a little confused by this logic. This reads:
   
   If `io.compression.codec.qat.class` is not set in the Hadoop config file, 
then use the class name defined by 
`-Dorg.apache.hadoop.io.compress.QatCodec.class=????`. This part makes sense, 
although I don't know why we'd want to be able to override the QAT codec class 
name, since it seems like it would be a well-known value.
   
   If it **is** set in the Hadoop config file, then just set the class name to 
`null`, which will result in us using the default class name we've hard-coded. 
This part doesn't make sense to me. Why wouldn't we use the class name from the 
Hadoop config file in this case?
   
   Also, shouldn't we prioritize our explicit `-D` override instead of only 
using it when the Hadoop config doesn't set it? The implementation here will 
prioritize what's in the the default value if the Hadoop config is set, and 
ignore the override.
   
   I think the logic should be:
   
   1. Check if system property is set to non-null. Use it if set, otherwise
   2. check if Hadoop config is set to non-null. Use it if set, otherwise
   3. use the default class name that we've hard-coded.
   
   You could also use the first one in that order where the class isn't just 
non-null, but also the class loads.
   
   Also, minor nit: avoiding unnecessary negations in ternary statements helps 
readability:
   ```java
   String clazz = extClazz == null ? DEFAULT_QAT_CLASS : extClazz;
   ```

##########
File path: 
core/src/main/java/org/apache/accumulo/core/file/rfile/bcfile/Compression.java
##########
@@ -250,33 +277,71 @@ public void initializeDefaultCodec() {
        */
       @Override
       protected CompressionCodec createNewCodec(final int bufferSize) {
-        DefaultCodec myCodec = new DefaultCodec();
-        Configuration myConf = new Configuration(conf);
-        // only use the buffersize if > 0, otherwise we'll use
-        // the default defined within the codec
-        if (bufferSize > 0)
-          myConf.setInt(BUFFER_SIZE_OPT, bufferSize);
-        myCodec.setConf(myConf);
-        return myCodec;
+        if (USE_QAT) {
+          String extClazz =
+              (conf.get(CONF_QAT_CLASS) == null ? 
System.getProperty(CONF_QAT_CLASS) : null);
+          String clazz = (extClazz != null) ? extClazz : DEFAULT_QAT_CLASS;
+          try {
+            LOG.info("Trying to load qat codec class: " + clazz);
+
+            Configuration myConf = new Configuration(conf);
+            // only use the buffersize if > 0, otherwise we'll use
+            // the default defined within the codec
+            if (bufferSize > 0)
+              myConf.setInt(QAT_BUFFER_SIZE_OPT, bufferSize);
+
+            CompressionCodec c =
+                (CompressionCodec) 
ReflectionUtils.newInstance(Class.forName(clazz), myConf);
+            if (c instanceof Configurable) {
+              ((Configurable) c).setConf(myConf);
+            }
+            return c;
+          } catch (ClassNotFoundException e) {

Review comment:
       Is CNFE is the only ReflectiveOperationException that can be thrown 
here? I'm surprised if that's the case, but perhaps this is because the rest 
are caught in Hadoop's ReflectionUtils? (Is that really public API for Hadoop? 
That seems like an internal utility class that we should probably avoid using 
ourselves... but it's outside the scope of this PR.)

##########
File path: 
core/src/main/java/org/apache/accumulo/core/file/rfile/bcfile/Compression.java
##########
@@ -250,33 +277,71 @@ public void initializeDefaultCodec() {
        */
       @Override
       protected CompressionCodec createNewCodec(final int bufferSize) {
-        DefaultCodec myCodec = new DefaultCodec();
-        Configuration myConf = new Configuration(conf);
-        // only use the buffersize if > 0, otherwise we'll use
-        // the default defined within the codec
-        if (bufferSize > 0)
-          myConf.setInt(BUFFER_SIZE_OPT, bufferSize);
-        myCodec.setConf(myConf);
-        return myCodec;
+        if (USE_QAT) {
+          String extClazz =
+              (conf.get(CONF_QAT_CLASS) == null ? 
System.getProperty(CONF_QAT_CLASS) : null);
+          String clazz = (extClazz != null) ? extClazz : DEFAULT_QAT_CLASS;
+          try {
+            LOG.info("Trying to load qat codec class: " + clazz);
+
+            Configuration myConf = new Configuration(conf);
+            // only use the buffersize if > 0, otherwise we'll use
+            // the default defined within the codec
+            if (bufferSize > 0)
+              myConf.setInt(QAT_BUFFER_SIZE_OPT, bufferSize);
+
+            CompressionCodec c =
+                (CompressionCodec) 
ReflectionUtils.newInstance(Class.forName(clazz), myConf);
+            if (c instanceof Configurable) {
+              ((Configurable) c).setConf(myConf);
+            }
+            return c;
+          } catch (ClassNotFoundException e) {
+            LOG.error("Unable to create QAT codec", e);
+          }
+          return null;
+        } else {
+          DefaultCodec myCodec = new DefaultCodec();
+          Configuration myConf = new Configuration(conf);
+          // only use the buffersize if > 0, otherwise we'll use
+          // the default defined within the codec
+          if (bufferSize > 0)
+            myConf.setInt(BUFFER_SIZE_OPT, bufferSize);
+          myCodec.setConf(myConf);
+          return myCodec;
+        }
       }
 
       @Override
       public InputStream createDecompressionStream(InputStream downStream,
           Decompressor decompressor, int downStreamBufferSize) throws 
IOException {
-        // Set the internal buffer size to read from down stream.
-        CompressionCodec decomCodec = codec;
-        // if we're not using the default, let's pull from the loading cache
-        if (DEFAULT_BUFFER_SIZE != downStreamBufferSize) {
-          Entry<Algorithm,Integer> sizeOpt = Maps.immutableEntry(GZ, 
downStreamBufferSize);
-          try {
-            decomCodec = codecCache.get(sizeOpt);
-          } catch (ExecutionException e) {
-            throw new IOException(e);
+
+        if (USE_QAT) {

Review comment:
       This conditional is checking that 'use.qat' system property is set to 
true, but not whether the QAT codec successfully loaded. The conditional should 
probably depend on the success of loading the QAT library, especially if we 
fall back to the regular codec if it fails to load, as I suggested elsewhere.

##########
File path: 
core/src/main/java/org/apache/accumulo/core/file/rfile/bcfile/Compression.java
##########
@@ -297,7 +362,7 @@ public OutputStream createCompressionStream(OutputStream 
downStream, Compressor
 
       @Override
       public boolean isSupported() {
-        return true;
+        return codec != null;

Review comment:
       We should always support GZ, even if the QAT library fails to load.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


Reply via email to