[ 
https://issues.apache.org/jira/browse/HBASE-22532?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16855447#comment-16855447
 ] 

Zheng Hu commented on HBASE-22532:
----------------------------------

OK, it seems that all of the blocks larger than 65KB are of type BLOOM_CHUNK:
{code}
$ cat out.log  | grep 'onDiskSizeWithoutHeader=1'  | head -n 50
[blockType=BLOOM_CHUNK, fileOffset=16553869, headerSize=33, 
onDiskSizeWithoutHeader=131108, uncompressedSizeWithoutHeader=131072, 
prevBlockOffset=-1, isUseHBaseChecksum=true, checksumType=CRC32C, 
bytesPerChecksum=16384, onDiskDataSizeWithHeader=131105, 
getOnDiskSizeWithHeader=131141, totalChecksumBytes=36, isUnpacked=true, 
buf=[SingleByteBuff[pos=0, lim=131141, cap= 131174]], 
dataBeginsWith=\xEF\x0Eg\xFC.>[\x0FLEF\xD9\xBF\x155\xC3Cdq=\x8E\xDB\xBDn\x0A\x17\x1DS\x08\xB7\xAC\xE2,
 fileContext=[usesHBaseChecksum=true, checksumType=CRC32C, 
bytesPerChecksum=16384, blocksize=65536, encoding=NONE, includesMvcc=true, 
includesTags=false, compressAlgo=NONE, compressTags=false, 
cryptoContext=[cipher=NONE keyHash=NONE], 
name=123a8968c0d641038c678117c3948bd6], nextBlockOnDiskSize=65690]
[blockType=BLOOM_CHUNK, fileOffset=33172900, headerSize=33, 
onDiskSizeWithoutHeader=131108, uncompressedSizeWithoutHeader=131072, 
prevBlockOffset=16553869, isUseHBaseChecksum=true, checksumType=CRC32C, 
bytesPerChecksum=16384, onDiskDataSizeWithHeader=131105, 
getOnDiskSizeWithHeader=131141, totalChecksumBytes=36, isUnpacked=true, 
buf=[SingleByteBuff[pos=0, lim=131141, cap= 131174]], 
dataBeginsWith=\xD7\xD9\x06\xF3\x16\xF4\x04\xAF5\xFD$\x0B\x8B\xFCC\xE8\xB1\xCE\x9Cb\x14?cDR|\xE9w\x9F\xCE4T,
 fileContext=[usesHBaseChecksum=true, checksumType=CRC32C, 
bytesPerChecksum=16384, blocksize=65536, encoding=NONE, includesMvcc=true, 
includesTags=false, compressAlgo=NONE, compressTags=false, 
cryptoContext=[cipher=NONE keyHash=NONE], 
name=123a8968c0d641038c678117c3948bd6], nextBlockOnDiskSize=65692]
[blockType=BLOOM_CHUNK, fileOffset=49792391, headerSize=33, 
onDiskSizeWithoutHeader=131108, uncompressedSizeWithoutHeader=131072, 
prevBlockOffset=33172900, isUseHBaseChecksum=true, checksumType=CRC32C, 
bytesPerChecksum=16384, onDiskDataSizeWithHeader=131105, 
getOnDiskSizeWithHeader=131141, totalChecksumBytes=36, isUnpacked=true, 
buf=[SingleByteBuff[pos=0, lim=131141, cap= 131174]], 
dataBeginsWith=%T\x18\xEC\x11\xB3\xFB\xCC\xC1\x0E\xEA\xCD\x11\x8E\xD67\xD1\xEB//lB\xB2aO}/\xFB\xEC\xF4\xB9\x1A,
 fileContext=[usesHBaseChecksum=true, checksumType=CRC32C, 
bytesPerChecksum=16384, blocksize=65536, encoding=NONE, includesMvcc=true, 
includesTags=false, compressAlgo=NONE, compressTags=false, 
cryptoContext=[cipher=NONE keyHash=NONE], 
name=123a8968c0d641038c678117c3948bd6], nextBlockOnDiskSize=65695]
[blockType=BLOOM_CHUNK, fileOffset=66477402, headerSize=33, 
onDiskSizeWithoutHeader=131108, uncompressedSizeWithoutHeader=131072, 
prevBlockOffset=49792391, isUseHBaseChecksum=true, checksumType=CRC32C, 
bytesPerChecksum=16384, onDiskDataSizeWithHeader=131105, 
getOnDiskSizeWithHeader=131141, totalChecksumBytes=36, isUnpacked=true, 
buf=[SingleByteBuff[pos=0, lim=131141, cap= 131174]], 
dataBeginsWith=\x0E\x00\xB0?>\xB5\xA8m\x0D\xAAY\x0EHc\xA1\xB4\xAD\xF9\xED\xC3\xB7gH\xB7\xCB&w\xFC\x8E\xA3\xE2\x0E,
 fileContext=[usesHBaseChecksum=true, checksumType=CRC32C, 
bytesPerChecksum=16384, blocksize=65536, encoding=NONE, includesMvcc=true, 
includesTags=false, compressAlgo=NONE, compressTags=false, 
cryptoContext=[cipher=NONE keyHash=NONE], 
name=123a8968c0d641038c678117c3948bd6], nextBlockOnDiskSize=65686]
{code}
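
As a sanity check on the numbers in the log above, here is a minimal sketch of the size arithmetic. The class and method names are hypothetical (not HBase APIs), and it assumes 4 bytes per CRC32C checksum and that a block larger than hbase.ipc.server.allocator.buffer.size cannot be held in one pooled buffer:

```java
// Sketch only: BloomChunkSizeSketch and its methods are hypothetical names,
// not part of HBase. Input values are copied from the log output above.
final class BloomChunkSizeSketch {
    // Assumption: each CRC32C checksum value occupies 4 bytes.
    static final int CHECKSUM_SIZE = 4;

    // One checksum per bytesPerChecksum chunk of (header + data), rounded up.
    static long checksumBytes(long dataSizeWithHeader, int bytesPerChecksum) {
        long chunks = (dataSizeWithHeader + bytesPerChecksum - 1) / bytesPerChecksum;
        return chunks * CHECKSUM_SIZE;
    }

    // Assumption: a pooled buffer holds at most bufferSize bytes, so a larger
    // block has to be spread across several buffers.
    static long buffersNeeded(long onDiskSizeWithHeader, int bufferSize) {
        return (onDiskSizeWithHeader + bufferSize - 1) / bufferSize;
    }

    public static void main(String[] args) {
        int headerSize = 33;                    // headerSize
        long uncompressedSize = 131072;         // uncompressedSizeWithoutHeader
        int bytesPerChecksum = 16384;           // bytesPerChecksum
        int bufferSize = 66560;                 // hbase.ipc.server.allocator.buffer.size

        long dataWithHeader = headerSize + uncompressedSize;              // 131105
        long checksums = checksumBytes(dataWithHeader, bytesPerChecksum); // 36
        long onDiskWithHeader = dataWithHeader + checksums;               // 131141

        System.out.println("onDiskDataSizeWithHeader=" + dataWithHeader);
        System.out.println("totalChecksumBytes=" + checksums);
        System.out.println("getOnDiskSizeWithHeader=" + onDiskWithHeader);
        System.out.println("buffersNeeded=" + buffersNeeded(onDiskWithHeader, bufferSize));
    }
}
```

Under these assumptions a 131141-byte BLOOM_CHUNK block needs two 66560-byte buffers, while a block of about 65690 bytes (the nextBlockOnDiskSize values above) would fit in one.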

> There's still too much cpu wasting on validating checksum even if 
> buffer.size=65KB
> ----------------------------------------------------------------------------------
>
>                 Key: HBASE-22532
>                 URL: https://issues.apache.org/jira/browse/HBASE-22532
>             Project: HBase
>          Issue Type: Sub-task
>            Reporter: Zheng Hu
>            Assignee: Zheng Hu
>            Priority: Major
>         Attachments: async-prof-pid-27827-cpu-3.svg, 
> async-prof-pid-64695-cpu-1.svg
>
>
> After disabling the block cache and applying the following config: 
> {code}
>     # Disable the block cache
>     hfile.block.cache.size=0
>     hbase.ipc.server.allocator.buffer.size=66560
>     hbase.ipc.server.reservoir.minimal.allocating.size=0
> {code}
> The ByteBuff for a block is expected to be a SingleByteBuff, which uses the 
> Hadoop native lib to validate the checksum. However, in the CPU flame 
> graph 
> [async-prof-pid-27827-cpu-3.svg|https://issues.apache.org/jira/secure/attachment/12970683/async-prof-pid-27827-cpu-3.svg],
> we can still see about 32% of CPU spent in PureJavaCrc32#update, which 
> means the faster Hadoop native lib is not being used.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
