[ https://issues.apache.org/jira/browse/HBASE-17623?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15934954#comment-15934954 ]

Anoop Sam John commented on HBASE-17623:
----------------------------------------

Patch looks good.
{code}
   * Creates a new {@link HFile} block from the given fields. This constructor
   * is used when the block data has already been read and uncompressed,
   * and is sitting in a byte buffer and we want to stuff the block into cache.
   * See {@link Writer#getBlockForCaching(CacheConfig)}.
   *
{code}
In HFileBlock I can see the above javadoc. But we don't use the API
getBlockForCaching() for caching a read block, right? I did some searching in
the code and it seems we don't. Please double check and correct the javadoc.
Around this method, add a note that it should be used only while writing
blocks and caching them, and mention the copy we do. The name already says it,
but let's be explicit for future reference.
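Something along these lines would do; this is only a sketch of the suggested
wording, assuming the method keeps its current signature:
{code}
  /**
   * Creates a new HFileBlock suitable for inserting into the block cache.
   * NOTE: intended only for the write path, i.e. while writing blocks and
   * caching them (cache-on-write); it copies the block's byte arrays, so the
   * cached block does not share state with the writer's reusable buffers.
   */
  HFileBlock getBlockForCaching(CacheConfig cacheConf) { ... }
{code}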

> Reuse the bytes array when building the hfile block
> ---------------------------------------------------
>
>                 Key: HBASE-17623
>                 URL: https://issues.apache.org/jira/browse/HBASE-17623
>             Project: HBase
>          Issue Type: Improvement
>            Reporter: Chia-Ping Tsai
>            Assignee: Chia-Ping Tsai
>             Fix For: 2.0.0, 1.4.0
>
>         Attachments: after(snappy_hfilesize=5.04GB).png, 
> after(snappy_hfilesize=755MB).png, before(snappy_hfilesize=5.04GB).png, 
> before(snappy_hfilesize=755MB).png, GC measurement.xlsx, 
> HBASE-17623.branch-1.v0.patch, HBASE-17623.branch-1.v1.patch, 
> HBASE-17623.branch-1.v2.patch, HBASE-17623.branch-1.v2.patch, 
> HBASE-17623.v0.patch, HBASE-17623.v1.patch, HBASE-17623.v1.patch, 
> HBASE-17623.v2.patch, memory allocation measurement.xlsx
>
>
> There are three improvements (a sketch of the reuse pattern follows the quoted code below).
> # The onDiskBlockBytesWithHeader should maintain a byte array that can be
> reused when building the hfile.
> # The onDiskBlockBytesWithHeader is copied to a new byte array only when we
> need to cache the block.
> # If no block needs to be cached, the uncompressedBlockBytesWithHeader will
> never be created.
> {code:title=HFileBlock.java|borderStyle=solid}
>     private void finishBlock() throws IOException {
>       if (blockType == BlockType.DATA) {
>         this.dataBlockEncoder.endBlockEncoding(dataBlockEncodingCtx, userDataStream,
>             baosInMemory.getBuffer(), blockType);
>         blockType = dataBlockEncodingCtx.getBlockType();
>       }
>       userDataStream.flush();
>       // This does an array copy, so it is safe to cache this byte array when cache-on-write.
>       // Header is still the empty, 'dummy' header that is yet to be filled out.
>       uncompressedBlockBytesWithHeader = baosInMemory.toByteArray();
>       prevOffset = prevOffsetByType[blockType.getId()];
>       // We need to set state before we can package the block up for cache-on-write. In a way, the
>       // block is ready, but not yet encoded or compressed.
>       state = State.BLOCK_READY;
>       if (blockType == BlockType.DATA || blockType == BlockType.ENCODED_DATA) {
>         onDiskBlockBytesWithHeader = dataBlockEncodingCtx.
>             compressAndEncrypt(uncompressedBlockBytesWithHeader);
>       } else {
>         onDiskBlockBytesWithHeader = defaultBlockEncodingCtx.
>             compressAndEncrypt(uncompressedBlockBytesWithHeader);
>       }
>       // Calculate how many bytes we need for checksum on the tail of the block.
>       int numBytes = (int) ChecksumUtil.numBytes(
>           onDiskBlockBytesWithHeader.length,
>           fileContext.getBytesPerChecksum());
>       // Put the header for the on disk bytes; header currently is unfilled-out
>       putHeader(onDiskBlockBytesWithHeader, 0,
>           onDiskBlockBytesWithHeader.length + numBytes,
>           uncompressedBlockBytesWithHeader.length, onDiskBlockBytesWithHeader.length);
>       // Set the header for the uncompressed bytes (for cache-on-write) -- IFF different from
>       // onDiskBlockBytesWithHeader array.
>       if (onDiskBlockBytesWithHeader != uncompressedBlockBytesWithHeader) {
>         putHeader(uncompressedBlockBytesWithHeader, 0,
>           onDiskBlockBytesWithHeader.length + numBytes,
>           uncompressedBlockBytesWithHeader.length, onDiskBlockBytesWithHeader.length);
>       }
>       if (onDiskChecksum.length != numBytes) {
>         onDiskChecksum = new byte[numBytes];
>       }
>       ChecksumUtil.generateChecksums(
>           onDiskBlockBytesWithHeader, 0, onDiskBlockBytesWithHeader.length,
>           onDiskChecksum, 0, fileContext.getChecksumType(),
>           fileContext.getBytesPerChecksum());
>     }{code}
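
For reference, a minimal sketch of the buffer-reuse idea from the issue
description above. The ReusableByteArrayOutputStream class and the
cacheThisBlock flag are illustrative assumptions, not the actual patch; the
quoted code's own baosInMemory already exposes getBuffer() in the same way:
{code}
import java.io.ByteArrayOutputStream;
import java.util.Arrays;

public class BufferReuseSketch {

  /** Exposes the internal buffer so callers can read it without an extra copy. */
  static class ReusableByteArrayOutputStream extends ByteArrayOutputStream {
    ReusableByteArrayOutputStream(int capacity) { super(capacity); }
    byte[] getBuffer() { return buf; } // shared; valid bytes are [0, size())
  }

  public static void main(String[] args) {
    // One stream reused across all blocks; reset() keeps the backing array.
    ReusableByteArrayOutputStream baosInMemory = new ReusableByteArrayOutputStream(64 * 1024);
    for (int i = 0; i < 3; i++) {
      baosInMemory.reset(); // no per-block allocation
      byte[] payload = ("block-" + i).getBytes();
      baosInMemory.write(payload, 0, payload.length);

      boolean cacheThisBlock = (i == 1); // assumption: cache-on-write for this block
      if (cacheThisBlock) {
        // Copy only when the bytes must outlive the reusable buffer (improvement #2).
        byte[] forCache = Arrays.copyOf(baosInMemory.getBuffer(), baosInMemory.size());
        System.out.println("cached copy:  " + new String(forCache));
      } else {
        // Plain write path reads the shared buffer directly; nothing is allocated.
        System.out.println("wrote direct: "
            + new String(baosInMemory.getBuffer(), 0, baosInMemory.size()));
      }
    }
  }
}
{code}
The point is that the array copy happens only on the cache path; when no block
is cached, the shared buffer is the only allocation, matching improvement #3.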



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)
