[
https://issues.apache.org/jira/browse/HBASE-22532?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16855447#comment-16855447
]
Zheng Hu commented on HBASE-22532:
----------------------------------
OK, it seems that all of the blocks whose size is > 65KB are of the BLOOM_CHUNK type:
{code}
$ cat out.log | grep 'onDiskSizeWithoutHeader=1' | head -n 50
[blockType=BLOOM_CHUNK, fileOffset=16553869, headerSize=33,
onDiskSizeWithoutHeader=131108, uncompressedSizeWithoutHeader=131072,
prevBlockOffset=-1, isUseHBaseChecksum=true, checksumType=CRC32C,
bytesPerChecksum=16384, onDiskDataSizeWithHeader=131105,
getOnDiskSizeWithHeader=131141, totalChecksumBytes=36, isUnpacked=true,
buf=[SingleByteBuff[pos=0, lim=131141, cap= 131174]],
dataBeginsWith=\xEF\x0Eg\xFC.>[\x0FLEF\xD9\xBF\x155\xC3Cdq=\x8E\xDB\xBDn\x0A\x17\x1DS\x08\xB7\xAC\xE2,
fileContext=[usesHBaseChecksum=true, checksumType=CRC32C,
bytesPerChecksum=16384, blocksize=65536, encoding=NONE, includesMvcc=true,
includesTags=false, compressAlgo=NONE, compressTags=false,
cryptoContext=[cipher=NONE keyHash=NONE],
name=123a8968c0d641038c678117c3948bd6], nextBlockOnDiskSize=65690]
[blockType=BLOOM_CHUNK, fileOffset=33172900, headerSize=33,
onDiskSizeWithoutHeader=131108, uncompressedSizeWithoutHeader=131072,
prevBlockOffset=16553869, isUseHBaseChecksum=true, checksumType=CRC32C,
bytesPerChecksum=16384, onDiskDataSizeWithHeader=131105,
getOnDiskSizeWithHeader=131141, totalChecksumBytes=36, isUnpacked=true,
buf=[SingleByteBuff[pos=0, lim=131141, cap= 131174]],
dataBeginsWith=\xD7\xD9\x06\xF3\x16\xF4\x04\xAF5\xFD$\x0B\x8B\xFCC\xE8\xB1\xCE\x9Cb\x14?cDR|\xE9w\x9F\xCE4T,
fileContext=[usesHBaseChecksum=true, checksumType=CRC32C,
bytesPerChecksum=16384, blocksize=65536, encoding=NONE, includesMvcc=true,
includesTags=false, compressAlgo=NONE, compressTags=false,
cryptoContext=[cipher=NONE keyHash=NONE],
name=123a8968c0d641038c678117c3948bd6], nextBlockOnDiskSize=65692]
[blockType=BLOOM_CHUNK, fileOffset=49792391, headerSize=33,
onDiskSizeWithoutHeader=131108, uncompressedSizeWithoutHeader=131072,
prevBlockOffset=33172900, isUseHBaseChecksum=true, checksumType=CRC32C,
bytesPerChecksum=16384, onDiskDataSizeWithHeader=131105,
getOnDiskSizeWithHeader=131141, totalChecksumBytes=36, isUnpacked=true,
buf=[SingleByteBuff[pos=0, lim=131141, cap= 131174]],
dataBeginsWith=%T\x18\xEC\x11\xB3\xFB\xCC\xC1\x0E\xEA\xCD\x11\x8E\xD67\xD1\xEB//lB\xB2aO}/\xFB\xEC\xF4\xB9\x1A,
fileContext=[usesHBaseChecksum=true, checksumType=CRC32C,
bytesPerChecksum=16384, blocksize=65536, encoding=NONE, includesMvcc=true,
includesTags=false, compressAlgo=NONE, compressTags=false,
cryptoContext=[cipher=NONE keyHash=NONE],
name=123a8968c0d641038c678117c3948bd6], nextBlockOnDiskSize=65695]
[blockType=BLOOM_CHUNK, fileOffset=66477402, headerSize=33,
onDiskSizeWithoutHeader=131108, uncompressedSizeWithoutHeader=131072,
prevBlockOffset=49792391, isUseHBaseChecksum=true, checksumType=CRC32C,
bytesPerChecksum=16384, onDiskDataSizeWithHeader=131105,
getOnDiskSizeWithHeader=131141, totalChecksumBytes=36, isUnpacked=true,
buf=[SingleByteBuff[pos=0, lim=131141, cap= 131174]],
dataBeginsWith=\x0E\x00\xB0?>\xB5\xA8m\x0D\xAAY\x0EHc\xA1\xB4\xAD\xF9\xED\xC3\xB7gH\xB7\xCB&w\xFC\x8E\xA3\xE2\x0E,
fileContext=[usesHBaseChecksum=true, checksumType=CRC32C,
bytesPerChecksum=16384, blocksize=65536, encoding=NONE, includesMvcc=true,
includesTags=false, compressAlgo=NONE, compressTags=false,
cryptoContext=[cipher=NONE keyHash=NONE],
name=123a8968c0d641038c678117c3948bd6], nextBlockOnDiskSize=65686]
{code}
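The size fields in the dump above are consistent with each other if one 4-byte checksum is stored per bytesPerChecksum chunk of header plus data. The sketch below only redoes that arithmetic with the logged values. The class and method names are hypothetical, and the 4-byte checksum size is inferred from totalChecksumBytes=36 rather than taken from the HBase source.

```java
// Hypothetical helper that re-derives the size fields printed in the block dump.
class BlockChecksumMath {
    // Assumption: each CRC32C value occupies 4 bytes (inferred from totalChecksumBytes=36).
    static final int CHECKSUM_VALUE_SIZE = 4;

    /** Checksum bytes needed to cover dataSize bytes, one checksum per chunk. */
    static long checksumBytes(long dataSize, int bytesPerChecksum) {
        long chunks = (dataSize + bytesPerChecksum - 1) / bytesPerChecksum; // ceiling division
        return chunks * CHECKSUM_VALUE_SIZE;
    }

    public static void main(String[] args) {
        // Values copied from the log output above.
        int headerSize = 33;
        long uncompressedSizeWithoutHeader = 131072;
        int bytesPerChecksum = 16384;

        // compressAlgo=NONE, so the on-disk data size equals the uncompressed size.
        long onDiskDataSizeWithHeader = headerSize + uncompressedSizeWithoutHeader;
        long totalChecksumBytes = checksumBytes(onDiskDataSizeWithHeader, bytesPerChecksum);
        long onDiskSizeWithHeader = onDiskDataSizeWithHeader + totalChecksumBytes;
        long onDiskSizeWithoutHeader = onDiskSizeWithHeader - headerSize;

        System.out.println("onDiskDataSizeWithHeader=" + onDiskDataSizeWithHeader);
        System.out.println("totalChecksumBytes=" + totalChecksumBytes);
        System.out.println("onDiskSizeWithHeader=" + onDiskSizeWithHeader);
        System.out.println("onDiskSizeWithoutHeader=" + onDiskSizeWithoutHeader);
    }
}
```

The 33-byte header pushes the 128KB bloom chunk just past eight 16384-byte chunks, so a ninth checksum is needed.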
> There's still too much cpu wasting on validating checksum even if
> buffer.size=65KB
> ----------------------------------------------------------------------------------
>
> Key: HBASE-22532
> URL: https://issues.apache.org/jira/browse/HBASE-22532
> Project: HBase
> Issue Type: Sub-task
> Reporter: Zheng Hu
> Assignee: Zheng Hu
> Priority: Major
> Attachments: async-prof-pid-27827-cpu-3.svg,
> async-prof-pid-64695-cpu-1.svg
>
>
> After disabling the block cache and applying the following config:
> {code}
> # Disable the block cache
> hfile.block.cache.size=0
> hbase.ipc.server.allocator.buffer.size=66560
> hbase.ipc.server.reservoir.minimal.allocating.size=0
> {code}
> With this config, the ByteBuff for each block is expected to be a
> SingleByteBuff, which uses the Hadoop native lib to validate the checksum.
> However, in the CPU flame graph
> [async-prof-pid-27827-cpu-3.svg|https://issues.apache.org/jira/secure/attachment/12970683/async-prof-pid-27827-cpu-3.svg],
> we can still see about 32% of CPU spent in PureJavaCrc32#update, which
> means the faster Hadoop native lib is not being used for those blocks.
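As a rough check of why the BLOOM_CHUNK blocks stand out, the sketch below compares the logged on-disk block sizes with the configured hbase.ipc.server.allocator.buffer.size of 66560 bytes. It is plain arithmetic under an assumption: that a block needing more than one pooled buffer cannot be held in a single pooled SingleByteBuff. The class and method names are hypothetical and are not part of HBase.

```java
// Hypothetical helper: how many pooled buffers of a given size a block would span.
class BufferFitCheck {
    /** Number of fixed-size buffers needed to hold blockSize bytes (ceiling division). */
    static int buffersNeeded(long blockSize, int bufferSize) {
        return (int) ((blockSize + bufferSize - 1) / bufferSize);
    }

    public static void main(String[] args) {
        int bufferSize = 66560;         // hbase.ipc.server.allocator.buffer.size from the config above
        long bloomChunkSize = 131141;   // getOnDiskSizeWithHeader of the BLOOM_CHUNK blocks
        long nextBlockSize = 65690;     // nextBlockOnDiskSize of the first logged block

        // Assumption: a block spanning more than one buffer is not backed by one pooled buffer.
        System.out.println("BLOOM_CHUNK buffers: " + buffersNeeded(bloomChunkSize, bufferSize));
        System.out.println("next block buffers: " + buffersNeeded(nextBlockSize, bufferSize));
    }
}
```

With these values the bloom chunk spans two buffers while the following block fits in one, which matches the observation that only BLOOM_CHUNK blocks exceed 65KB.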