apurtell commented on a change in pull request #3244:
URL: https://github.com/apache/hbase/pull/3244#discussion_r629801863
##########
File path:
hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/wal/WALCellCodec.java
##########
@@ -241,10 +246,27 @@ public void write(Cell cell) throws IOException {
compression.getDictionary(CompressionContext.DictionaryIndex.FAMILY));
PrivateCellUtil.compressQualifier(out, cell,
compression.getDictionary(CompressionContext.DictionaryIndex.QUALIFIER));
- // Write timestamp, type and value as uncompressed.
+ // Write timestamp, type and value.
StreamUtils.writeLong(out, cell.getTimestamp());
- out.write(cell.getTypeByte());
- PrivateCellUtil.writeValue(out, cell, cell.getValueLength());
+ byte type = cell.getTypeByte();
+ if (compression.getValueCompressor() != null &&
+ cell.getValueLength() > VALUE_COMPRESS_THRESHOLD) {
+ // Try compressing the cell's value
+ byte[] compressedBytes = compressValue(cell);
+      // Only write the compressed value if we have achieved some space savings.
+ if (compressedBytes.length < cell.getValueLength()) {
+ // Set the high bit of type to indicate the value is compressed
+ out.write((byte)(type|0x80));
Review comment:
It is simpler if we don't try to conditionally compress values: no high-bit
twiddling, no leakage into KeyValue in this case, and no magic values.
There also doesn't need to be a size threshold. Since we are using a single
deflater instance over all values in the WAL file, the overhead of compressing
very small values should be amortized.
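To illustrate the amortization argument, here is a minimal, hypothetical sketch (not the actual HBase `WALCellCodec` code) of unconditional value compression with one shared `java.util.zip.Deflater`. Using `SYNC_FLUSH` per value gives each value a well-defined boundary in the stream, while the deflater keeps its dictionary across values, so repeated content in earlier values helps compress later small values. The class name `SharedValueCompressor` and its methods are invented for this example.

```java
import java.io.ByteArrayOutputStream;
import java.util.zip.DataFormatException;
import java.util.zip.Deflater;
import java.util.zip.Inflater;

// Hypothetical sketch: one Deflater/Inflater pair shared across all
// values, compressing each value unconditionally (no size threshold,
// no high bit set on the cell type byte).
public class SharedValueCompressor {
  private final Deflater deflater = new Deflater();
  private final Inflater inflater = new Inflater();

  // Compress one value. SYNC_FLUSH forces all pending output so the
  // value's bytes are complete, but the dictionary built from earlier
  // values is retained, amortizing per-value overhead.
  public byte[] compress(byte[] value) {
    ByteArrayOutputStream out = new ByteArrayOutputStream();
    deflater.setInput(value);
    byte[] buf = new byte[4096];
    int n;
    while ((n = deflater.deflate(buf, 0, buf.length, Deflater.SYNC_FLUSH)) > 0) {
      out.write(buf, 0, n);
    }
    return out.toByteArray();
  }

  // Decompress one value; the reader must know the uncompressed length
  // (in the WAL it would be recorded alongside the value).
  public byte[] decompress(byte[] compressed, int rawLength)
      throws DataFormatException {
    inflater.setInput(compressed);
    byte[] out = new byte[rawLength];
    int off = 0;
    while (off < rawLength) {
      int n = inflater.inflate(out, off, rawLength - off);
      if (n == 0) {
        break; // needs more input; should not happen after SYNC_FLUSH
      }
      off += n;
    }
    return out;
  }

  public static void main(String[] args) throws Exception {
    SharedValueCompressor c = new SharedValueCompressor();
    byte[] v1 = "value-one-value-one".getBytes();
    byte[] v2 = "value-two-value-two".getBytes();
    byte[] c1 = c.compress(v1);
    byte[] c2 = c.compress(v2);
    // Values must be decompressed in write order, since both sides
    // share one streaming dictionary.
    if (!java.util.Arrays.equals(c.decompress(c1, v1.length), v1)
        || !java.util.Arrays.equals(c.decompress(c2, v2.length), v2)) {
      throw new AssertionError("roundtrip failed");
    }
  }
}
```

Note the trade-off the sketch makes explicit: because compressor and decompressor share streaming state, values can only be decoded in the order they were written, which fits a WAL's sequential read pattern.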
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
[email protected]