[ https://issues.apache.org/jira/browse/HBASE-22532?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16855447#comment-16855447 ]
Zheng Hu commented on HBASE-22532:
----------------------------------

OK, it seems that all of the blocks whose size is > 65KB are of the BLOOM_CHUNK type:
{code}
$ cat out.log | grep 'onDiskSizeWithoutHeader=1' | head -n 50
[blockType=BLOOM_CHUNK, fileOffset=16553869, headerSize=33, onDiskSizeWithoutHeader=131108, uncompressedSizeWithoutHeader=131072, prevBlockOffset=-1, isUseHBaseChecksum=true, checksumType=CRC32C, bytesPerChecksum=16384, onDiskDataSizeWithHeader=131105, getOnDiskSizeWithHeader=131141, totalChecksumBytes=36, isUnpacked=true, buf=[SingleByteBuff[pos=0, lim=131141, cap= 131174]], dataBeginsWith=\xEF\x0Eg\xFC.>[\x0FLEF\xD9\xBF\x155\xC3Cdq=\x8E\xDB\xBDn\x0A\x17\x1DS\x08\xB7\xAC\xE2, fileContext=[usesHBaseChecksum=true, checksumType=CRC32C, bytesPerChecksum=16384, blocksize=65536, encoding=NONE, includesMvcc=true, includesTags=false, compressAlgo=NONE, compressTags=false, cryptoContext=[cipher=NONE keyHash=NONE], name=123a8968c0d641038c678117c3948bd6], nextBlockOnDiskSize=65690]
[blockType=BLOOM_CHUNK, fileOffset=33172900, headerSize=33, onDiskSizeWithoutHeader=131108, uncompressedSizeWithoutHeader=131072, prevBlockOffset=16553869, isUseHBaseChecksum=true, checksumType=CRC32C, bytesPerChecksum=16384, onDiskDataSizeWithHeader=131105, getOnDiskSizeWithHeader=131141, totalChecksumBytes=36, isUnpacked=true, buf=[SingleByteBuff[pos=0, lim=131141, cap= 131174]], dataBeginsWith=\xD7\xD9\x06\xF3\x16\xF4\x04\xAF5\xFD$\x0B\x8B\xFCC\xE8\xB1\xCE\x9Cb\x14?cDR|\xE9w\x9F\xCE4T, fileContext=[usesHBaseChecksum=true, checksumType=CRC32C, bytesPerChecksum=16384, blocksize=65536, encoding=NONE, includesMvcc=true, includesTags=false, compressAlgo=NONE, compressTags=false, cryptoContext=[cipher=NONE keyHash=NONE], name=123a8968c0d641038c678117c3948bd6], nextBlockOnDiskSize=65692]
[blockType=BLOOM_CHUNK, fileOffset=49792391, headerSize=33, onDiskSizeWithoutHeader=131108, uncompressedSizeWithoutHeader=131072, prevBlockOffset=33172900, isUseHBaseChecksum=true, checksumType=CRC32C, bytesPerChecksum=16384, onDiskDataSizeWithHeader=131105, getOnDiskSizeWithHeader=131141, totalChecksumBytes=36, isUnpacked=true, buf=[SingleByteBuff[pos=0, lim=131141, cap= 131174]], dataBeginsWith=%T\x18\xEC\x11\xB3\xFB\xCC\xC1\x0E\xEA\xCD\x11\x8E\xD67\xD1\xEB//lB\xB2aO}/\xFB\xEC\xF4\xB9\x1A, fileContext=[usesHBaseChecksum=true, checksumType=CRC32C, bytesPerChecksum=16384, blocksize=65536, encoding=NONE, includesMvcc=true, includesTags=false, compressAlgo=NONE, compressTags=false, cryptoContext=[cipher=NONE keyHash=NONE], name=123a8968c0d641038c678117c3948bd6], nextBlockOnDiskSize=65695]
[blockType=BLOOM_CHUNK, fileOffset=66477402, headerSize=33, onDiskSizeWithoutHeader=131108, uncompressedSizeWithoutHeader=131072, prevBlockOffset=49792391, isUseHBaseChecksum=true, checksumType=CRC32C, bytesPerChecksum=16384, onDiskDataSizeWithHeader=131105, getOnDiskSizeWithHeader=131141, totalChecksumBytes=36, isUnpacked=true, buf=[SingleByteBuff[pos=0, lim=131141, cap= 131174]], dataBeginsWith=\x0E\x00\xB0?>\xB5\xA8m\x0D\xAAY\x0EHc\xA1\xB4\xAD\xF9\xED\xC3\xB7gH\xB7\xCB&w\xFC\x8E\xA3\xE2\x0E, fileContext=[usesHBaseChecksum=true, checksumType=CRC32C, bytesPerChecksum=16384, blocksize=65536, encoding=NONE, includesMvcc=true, includesTags=false, compressAlgo=NONE, compressTags=false, cryptoContext=[cipher=NONE keyHash=NONE], name=123a8968c0d641038c678117c3948bd6], nextBlockOnDiskSize=65686]
{code}

> There's still too much cpu wasting on validating checksum even if buffer.size=65KB
> ----------------------------------------------------------------------------------
>
>                 Key: HBASE-22532
>                 URL: https://issues.apache.org/jira/browse/HBASE-22532
>             Project: HBase
>          Issue Type: Sub-task
>            Reporter: Zheng Hu
>            Assignee: Zheng Hu
>            Priority: Major
>         Attachments: async-prof-pid-27827-cpu-3.svg, async-prof-pid-64695-cpu-1.svg
>
>
> After disabling the block cache, and with the following config:
> {code}
> # Disable the block cache
> hfile.block.cache.size=0
> hbase.ipc.server.allocator.buffer.size=66560
> hbase.ipc.server.reservoir.minimal.allocating.size=0
> {code}
> The ByteBuff for a block is expected to be a SingleByteBuff, which will use the hadoop native lib to validate the checksum. However, in the cpu flame graph [async-prof-pid-27827-cpu-3.svg|https://issues.apache.org/jira/secure/attachment/12970683/async-prof-pid-27827-cpu-3.svg], we can still see that about 32% of CPU is spent in PureJavaCrc32#update, which means it's not using the faster hadoop native lib.

--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
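
The sizes printed for the BLOOM_CHUNK blocks are consistent with one another, and they exceed the configured hbase.ipc.server.allocator.buffer.size of 66560 bytes. A minimal Java sketch of that arithmetic follows. The class and method names are hypothetical and not part of HBase, and it assumes one 4-byte CRC32C value per bytesPerChecksum chunk of header plus data, which agrees with totalChecksumBytes=36 in the log.

```java
// Hypothetical helper, not HBase code: re-derives the sizes printed in the
// BLOOM_CHUNK dump and compares them with the configured allocator buffer size.
public class BloomChunkSizeCheck {
    // Assumption: each checksum chunk stores one 4-byte CRC32C value.
    static final int CHECKSUM_VALUE_BYTES = 4;

    // Checksum bytes needed to cover dataSizeWithHeader bytes (ceiling division).
    static int checksumBytes(int dataSizeWithHeader, int bytesPerChecksum) {
        int chunks = (dataSizeWithHeader + bytesPerChecksum - 1) / bytesPerChecksum;
        return chunks * CHECKSUM_VALUE_BYTES;
    }

    // Total on-disk block size: header + data + checksums.
    static int onDiskSizeWithHeader(int headerSize, int dataSize, int bytesPerChecksum) {
        int dataWithHeader = headerSize + dataSize;
        return dataWithHeader + checksumBytes(dataWithHeader, bytesPerChecksum);
    }

    // Number of fixed-size buffers needed to hold a block of the given size.
    static int buffersNeeded(int blockSize, int bufferSize) {
        return (blockSize + bufferSize - 1) / bufferSize;
    }

    public static void main(String[] args) {
        // Values copied from the log and the config in the issue description.
        int headerSize = 33, dataSize = 131072, bytesPerChecksum = 16384, bufferSize = 66560;
        int total = onDiskSizeWithHeader(headerSize, dataSize, bytesPerChecksum);
        System.out.println("checksumBytes=" + checksumBytes(headerSize + dataSize, bytesPerChecksum));
        System.out.println("onDiskSizeWithHeader=" + total);
        System.out.println("buffersNeeded=" + buffersNeeded(total, bufferSize));
    }
}
```

With the logged values this gives 36 checksum bytes and an on-disk size of 131141 bytes, which needs two 66560-byte buffers. That such multi-buffer blocks are the ones still validated through PureJavaCrc32 is an inference from the issue description, not something the log alone proves.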