apurtell commented on a change in pull request #3244:
URL: https://github.com/apache/hbase/pull/3244#discussion_r630669258



##########
File path: hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/wal/WALCellCodec.java
##########
@@ -256,6 +267,61 @@ public void write(Cell cell) throws IOException {
         }
       }
     }
+
+    private byte[] compressValue(Cell cell) throws IOException {
+      Deflater deflater = compression.getValueCompressor().getDeflater();
+      if (cell instanceof ByteBufferExtendedCell) {
+        deflater.setInput(((ByteBufferExtendedCell)cell).getValueByteBuffer().array(),
+          ((ByteBufferExtendedCell)cell).getValueByteBuffer().arrayOffset() +
+          ((ByteBufferExtendedCell)cell).getValuePosition(),
+          cell.getValueLength());
+      } else {
+        deflater.setInput(cell.getValueArray(), cell.getValueOffset(), cell.getValueLength());
+      }
+      ByteArrayOutputStream baos = new ByteArrayOutputStream();
+      int bufferSize = 1024;
+      byte[] buffer = new byte[bufferSize];
+      // Deflater#deflate will return 0 only if more input is required. We iterate until
+      // that condition is met, sending the content of 'buffer' to the output stream at
+      // each step, until deflate returns 0. Then the compressor must be flushed in order
+      // for all of the value's output to be written into the corresponding edit. (Otherwise
+      // the compressor would carry over some of the output for this value into the output
+      // of the next.) To flush the compressor we call deflate again using the method option
+      // that allows us to specify the SYNC_FLUSH flag. The sync output will be placed into
+      // the buffer. When flushing we iterate until there is no more output. Then the flush
+      // is complete and the compressor is ready for more input.
+      int bytesOut;

Review comment:
       I had to back out this change:
   
   Exception in thread "AsyncFSWAL-0-hdfs://localhost:8020/hbase/MasterData" java.lang.AssertionError: should not happen
        at org.apache.hadoop.hbase.regionserver.wal.AsyncProtobufLogWriter.append(AsyncProtobufLogWriter.java:148)
        at org.apache.hadoop.hbase.regionserver.wal.AsyncFSWAL.doAppend(AsyncFSWAL.java:773)
        at org.apache.hadoop.hbase.regionserver.wal.AsyncFSWAL.doAppend(AsyncFSWAL.java:130)
        at org.apache.hadoop.hbase.regionserver.wal.AbstractFSWAL.appendEntry(AbstractFSWAL.java:1016)
        at org.apache.hadoop.hbase.regionserver.wal.AsyncFSWAL.appendAndSync(AsyncFSWAL.java:468)
        at org.apache.hadoop.hbase.regionserver.wal.AsyncFSWAL.consume(AsyncFSWAL.java:556)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
        at java.lang.Thread.run(Thread.java:748)
   Caused by: java.io.IOException: write beyond end of stream
        at java.util.zip.DeflaterOutputStream.write(DeflaterOutputStream.java:201)
        at org.apache.hadoop.hbase.regionserver.wal.WALCellCodec$CompressedKvEncoder.compressValue(WALCellCodec.java:282)
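   For reference, the flush discipline described in the patch's inline comment (drain until `Deflater#deflate` returns 0, then `SYNC_FLUSH` until there is no more output) can be sketched in isolation. This is an illustrative standalone sketch, not the patch's code: the class name and sample values are invented, and the `Cell` plumbing is replaced with plain byte arrays.

```java
import java.io.ByteArrayOutputStream;
import java.util.zip.Deflater;
import java.util.zip.Inflater;

public class DeflateFlushSketch {

  // Compress one value on a long-lived Deflater. Draining until deflate()
  // returns 0 and then sync-flushing ensures none of this value's output
  // is carried over into the output for the next value.
  static byte[] compress(Deflater deflater, byte[] value) {
    deflater.setInput(value, 0, value.length);
    ByteArrayOutputStream baos = new ByteArrayOutputStream();
    byte[] buffer = new byte[1024];
    int bytesOut;
    // Drain: deflate() returns 0 once more input is required.
    while ((bytesOut = deflater.deflate(buffer)) > 0) {
      baos.write(buffer, 0, bytesOut);
    }
    // Flush: iterate with SYNC_FLUSH until there is no more output; the
    // compressor is then byte-aligned and ready for the next value.
    while ((bytesOut = deflater.deflate(buffer, 0, buffer.length, Deflater.SYNC_FLUSH)) > 0) {
      baos.write(buffer, 0, bytesOut);
    }
    return baos.toByteArray();
  }

  public static void main(String[] args) throws Exception {
    Deflater deflater = new Deflater();
    byte[] v1 = "first-value-first-value".getBytes("UTF-8");
    byte[] v2 = "second-value".getBytes("UTF-8");
    byte[] c1 = compress(deflater, v1);
    byte[] c2 = compress(deflater, v2);

    // Because each value was sync-flushed, the chunks concatenate into one
    // valid deflate stream that a single Inflater can decode incrementally.
    byte[] all = new byte[c1.length + c2.length];
    System.arraycopy(c1, 0, all, 0, c1.length);
    System.arraycopy(c2, 0, all, c1.length, c2.length);

    Inflater inflater = new Inflater();
    inflater.setInput(all);
    byte[] out = new byte[v1.length + v2.length];
    int off = 0;
    while (off < out.length) {
      int n = inflater.inflate(out, off, out.length - off);
      if (n == 0) {
        break; // would indicate a truncated stream
      }
      off += n;
    }
    System.out.println(new String(out, "UTF-8"));
  }
}
```

   The "write beyond end of stream" failure in the trace above comes from `DeflaterOutputStream`, which rejects writes after it has been finished; a raw `Deflater` with `SYNC_FLUSH`, as in the patch, avoids finishing the stream between values.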
   




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
[email protected]