[jira] [Commented] (HBASE-26005) Update ref guide about the EOL for 2.2.x
[ https://issues.apache.org/jira/browse/HBASE-26005?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17365848#comment-17365848 ] Hudson commented on HBASE-26005: Results for branch master [build #327 on builds.a.o|https://ci-hadoop.apache.org/job/HBase/job/HBase%20Nightly/job/master/327/]: (x) *{color:red}-1 overall{color}* details (if available): (/) {color:green}+1 general checks{color} -- For more information [see general report|https://ci-hadoop.apache.org/job/HBase/job/HBase%20Nightly/job/master/327/General_20Nightly_20Build_20Report/] (x) {color:red}-1 jdk8 hadoop3 checks{color} -- For more information [see jdk8 (hadoop3) report|https://ci-hadoop.apache.org/job/HBase/job/HBase%20Nightly/job/master/326/JDK8_20Nightly_20Build_20Report_20_28Hadoop3_29/] (x) {color:red}-1 jdk11 hadoop3 checks{color} -- For more information [see jdk11 report|https://ci-hadoop.apache.org/job/HBase/job/HBase%20Nightly/job/master/326/JDK11_20Nightly_20Build_20Report_20_28Hadoop3_29/] (/) {color:green}+1 source release artifact{color} -- See build output for details. (/) {color:green}+1 client integration test{color} > Update ref guide about the EOL for 2.2.x > > > Key: HBASE-26005 > URL: https://issues.apache.org/jira/browse/HBASE-26005 > Project: HBase > Issue Type: Sub-task > Components: documentation >Reporter: Duo Zhang >Assignee: Zhuoyue Huang >Priority: Major > Fix For: 3.0.0-alpha-1 > > > For example, remove the release manager for 2.2.x, and also update the > compatibility matrix with hadoop, etc. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (HBASE-26016) HFilePrettyPrinter tool can not print the last LEAF_INDEX block or BLOOM_CHUNK.
[ https://issues.apache.org/jira/browse/HBASE-26016?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] dingwei2019 updated HBASE-26016: Attachment: HBASE-26016-prettyPrintTool-1.patch > HFilePrettyPrinter tool can not print the last LEAF_INDEX block or > BLOOM_CHUNK. > --- > > Key: HBASE-26016 > URL: https://issues.apache.org/jira/browse/HBASE-26016 > Project: HBase > Issue Type: Bug > Components: regionserver >Affects Versions: 2.1.0, 2.3.2, 2.4.4 >Reporter: dingwei2019 >Assignee: dingwei2019 >Priority: Minor > Attachments: HBASE-26016-prettyPrintTool-1.patch > > > When I use the pretty printer tool to print the block headers, I can not get > the last LEAF_INDEX block and BLOOM_CHUNK. The last output of the tool is below: > -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (HBASE-26016) HFilePrettyPrinter tool can not print the last LEAF_INDEX block or BLOOM_CHUNK.
[ https://issues.apache.org/jira/browse/HBASE-26016?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] dingwei2019 updated HBASE-26016: Description: when i use pretty printer tool to print the headers of block, i can not get the last LEAF_INDEX block and BLOOM_CHUNK. the last info of the tools is blow: was: when i use pretty printer tool to print the headers of block, i can not get the last LEAF_INDEX block and BLOOM_CHUNK. the last info of the tools is blow: _[blockType=DATA, fileOffset=246617457, headerSize=33, onDiskSizeWithoutHeader=66485, uncompressedSizeWithoutHeader=66465, prevBlockOffset=246550939, isUseHBaseChecksum=true, checksumType=CRC32C, bytesPerChecksum=16384, onDiskDataSizeWithHeader=66498, getOnDiskSizeWithHeader=66518, totalChecksumBytes=20, isUnpacked=true, buf=[SingleByteBuff[pos=0, lim=66518, cap= 66551]], dataBeginsWith=\x00\x00\x00,\x00\x00\x03\xE8\x00\x1A000816, fileContext=[usesHBaseChecksum=true, checksumType=CRC32C, bytesPerChecksum=16384, blocksize=65536, encoding=NONE, includesMvcc=true, includesTags=false, compressAlgo=NONE, compressTags=false, cryptoContext=[cipher=NONE keyHash=NONE], name=0d519e7318414362a56e4f41bf63ccd4, cellComparator=org.apache.hadoop.hbase.CellComparatorImpl@38b972d7], nextBlockOnDiskSize=66518]_ _[blockType=DATA, fileOffset=246683975, headerSize=33, onDiskSizeWithoutHeader=66485, uncompressedSizeWithoutHeader=66465, prevBlockOffset=246617457, isUseHBaseChecksum=true, checksumType=CRC32C, bytesPerChecksum=16384, onDiskDataSizeWithHeader=66498, getOnDiskSizeWithHeader=66518, totalChecksumBytes=20, isUnpacked=true, buf=[SingleByteBuff[pos=0, lim=66518, cap= 66551]], dataBeginsWith=\x00\x00\x00,\x00\x00\x03\xE8\x00\x1A000816, fileContext=[usesHBaseChecksum=true, checksumType=CRC32C, bytesPerChecksum=16384, blocksize=65536, encoding=NONE, includesMvcc=true, includesTags=false, compressAlgo=NONE, compressTags=false, cryptoContext=[cipher=NONE keyHash=NONE], name=0d519e7318414362a56e4f41bf63ccd4, 
cellComparator=org.apache.hadoop.hbase.CellComparatorImpl@38b972d7], nextBlockOnDiskSize=51744]_ _[blockType=DATA, fileOffset=246750493, headerSize=33, onDiskSizeWithoutHeader=51711, uncompressedSizeWithoutHeader=51695, prevBlockOffset=246683975, isUseHBaseChecksum=true, checksumType=CRC32C, bytesPerChecksum=16384, onDiskDataSizeWithHeader=51728, getOnDiskSizeWithHeader=51744, totalChecksumBytes=16, isUnpacked=true, buf=[SingleByteBuff[pos=0, lim=51744, cap= 51777]], dataBeginsWith=\x00\x00\x00,\x00\x00\x03\xE8\x00\x1A000816, fileContext=[usesHBaseChecksum=true, checksumType=CRC32C, bytesPerChecksum=16384, blocksize=65536, encoding=NONE, includesMvcc=true, includesTags=false, compressAlgo=NONE, compressTags=false, cryptoContext=[cipher=NONE keyHash=NONE], name=0d519e7318414362a56e4f41bf63ccd4, cellComparator=org.apache.hadoop.hbase.CellComparatorImpl@38b972d7], nextBlockOnDiskSize=20585]_ > HFilePrettyPrinter tool can not print the last LEAF_INDEX block or > BLOOM_CHUNK. > --- > > Key: HBASE-26016 > URL: https://issues.apache.org/jira/browse/HBASE-26016 > Project: HBase > Issue Type: Bug > Components: regionserver >Affects Versions: 2.1.0, 2.3.2, 2.4.4 >Reporter: dingwei2019 >Assignee: dingwei2019 >Priority: Minor > > when i use pretty printer tool to print the headers of block, i can not get > the last LEAF_INDEX block and BLOOM_CHUNK. the last info of the tools is blow: > -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Comment Edited] (HBASE-26016) HFilePrettyPrinter tool can not print the last LEAF_INDEX block or BLOOM_CHUNK.
[ https://issues.apache.org/jira/browse/HBASE-26016?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17365842#comment-17365842 ] dingwei2019 edited comment on HBASE-26016 at 6/19/21, 3:51 AM: --- The problem is that max is set to the offset of the last DATA block, but the last LEAF_INDEX and BLOOM_CHUNK blocks sit just after the last DATA block:

{code:java}
long offset = trailer.getFirstDataBlockOffset(),
  max = trailer.getLastDataBlockOffset();
HFileBlock block;
while (offset <= max) {
  block = reader.readBlock(offset, -1, /* cacheBlock */ false, /* pread */ false,
    /* isCompaction */ false, /* updateCacheMetrics */ false, null, null);
  offset += block.getOnDiskSizeWithHeader();
  out.println(block);
}
{code}

So it is better to adjust max to the beginning of the load-on-open section. See below:

{code:java}
long offset = trailer.getFirstDataBlockOffset(),
  max = trailer.getLoadOnOpenDataOffset();
HFileBlock block;
while (offset < max) {
  block = reader.readBlock(offset, -1, /* cacheBlock */ false, /* pread */ false,
    /* isCompaction */ false, /* updateCacheMetrics */ false, null, null);
  offset += block.getOnDiskSizeWithHeader();
  out.println(block);
}
{code}

was (Author: dingwei2019): The problem is that max is set to the offset of the last DATA block, but the last LEAF_INDEX and BLOOM_CHUNK blocks sit just after the last DATA block:

{code:java}
long offset = trailer.getFirstDataBlockOffset(),
  max = trailer.getLastDataBlockOffset();
HFileBlock block;
while (offset <= max) {
  block = reader.readBlock(offset, -1, /* cacheBlock */ false, /* pread */ false,
    /* isCompaction */ false, /* updateCacheMetrics */ false, null, null);
  offset += block.getOnDiskSizeWithHeader();
  out.println(block);
}
{code}

> HFilePrettyPrinter tool can not print the last LEAF_INDEX block or > BLOOM_CHUNK. > --- > > Key: HBASE-26016 > URL: https://issues.apache.org/jira/browse/HBASE-26016 > Project: HBase > Issue Type: Bug > Components: regionserver >Affects Versions: 2.1.0, 2.3.2, 2.4.4 >Reporter: dingwei2019 >Assignee: dingwei2019 >Priority: Minor > > When I use the pretty printer tool to print the block headers, I can not get > the last LEAF_INDEX block and BLOOM_CHUNK. 
the last info of the tools is blow: > > _[blockType=DATA, fileOffset=246617457, headerSize=33, > onDiskSizeWithoutHeader=66485, uncompressedSizeWithoutHeader=66465, > prevBlockOffset=246550939, isUseHBaseChecksum=true, checksumType=CRC32C, > bytesPerChecksum=16384, onDiskDataSizeWithHeader=66498, > getOnDiskSizeWithHeader=66518, totalChecksumBytes=20, isUnpacked=true, > buf=[SingleByteBuff[pos=0, lim=66518, cap= 66551]], > dataBeginsWith=\x00\x00\x00,\x00\x00\x03\xE8\x00\x1A000816, > fileContext=[usesHBaseChecksum=true, checksumType=CRC32C, > bytesPerChecksum=16384, blocksize=65536, encoding=NONE, includesMvcc=true, > includesTags=false, compressAlgo=NONE, compressTags=false, > cryptoContext=[cipher=NONE keyHash=NONE], > name=0d519e7318414362a56e4f41bf63ccd4, > cellComparator=org.apache.hadoop.hbase.CellComparatorImpl@38b972d7], > nextBlockOnDiskSize=66518]_ > _[blockType=DATA, fileOffset=246683975, headerSize=33, > onDiskSizeWithoutHeader=66485, uncompressedSizeWithoutHeader=66465, > prevBlockOffset=246617457, isUseHBaseChecksum=true, checksumType=CRC32C, > bytesPerChecksum=16384, onDiskDataSizeWithHeader=66498, > getOnDiskSizeWithHeader=66518, totalChecksumBytes=20, isUnpacked=true, > buf=[SingleByteBuff[pos=0, lim=66518, cap= 66551]], > dataBeginsWith=\x00\x00\x00,\x00\x00\x03\xE8\x00\x1A000816, > fileContext=[usesHBaseChecksum=true, checksumType=CRC32C, > bytesPerChecksum=16384, blocksize=65536, encoding=NONE, includesMvcc=true, > includesTags=false, compressAlgo=NONE, compressTags=false, > cryptoContext=[cipher=NONE keyHash=NONE], > name=0d519e7318414362a56e4f41bf63ccd4, > cellComparator=org.apache.hadoop.hbase.CellComparatorImpl@38b972d7], > nextBlockOnDiskSize=51744]_ > _[blockType=DATA, fileOffset=246750493, headerSize=33, > onDiskSizeWithoutHeader=51711, uncompressedSizeWithoutHeader=51695, > prevBlockOffset=246683975, isUseHBaseChecksum=true, checksumType=CRC32C, > bytesPerChecksum=16384, onDiskDataSizeWithHeader=51728, >
[jira] [Commented] (HBASE-26016) HFilePrettyPrinter tool can not print the last LEAF_INDEX block or BLOOM_CHUNK.
[ https://issues.apache.org/jira/browse/HBASE-26016?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17365842#comment-17365842 ] dingwei2019 commented on HBASE-26016: - The problem is that max is set to the offset of the last DATA block, but the last LEAF_INDEX and BLOOM_CHUNK blocks sit just after the last DATA block:

{code:java}
long offset = trailer.getFirstDataBlockOffset(),
  max = trailer.getLastDataBlockOffset();
HFileBlock block;
while (offset <= max) {
  block = reader.readBlock(offset, -1, /* cacheBlock */ false, /* pread */ false,
    /* isCompaction */ false, /* updateCacheMetrics */ false, null, null);
  offset += block.getOnDiskSizeWithHeader();
  out.println(block);
}
{code}

> HFilePrettyPrinter tool can not print the last LEAF_INDEX block or > BLOOM_CHUNK. > --- > > Key: HBASE-26016 > URL: https://issues.apache.org/jira/browse/HBASE-26016 > Project: HBase > Issue Type: Bug > Components: regionserver >Affects Versions: 2.1.0, 2.3.2, 2.4.4 >Reporter: dingwei2019 >Assignee: dingwei2019 >Priority: Minor > > When I use the pretty printer tool to print the block headers, I can not get > the last LEAF_INDEX block and BLOOM_CHUNK. 
the last info of the tools is blow: > > _[blockType=DATA, fileOffset=246617457, headerSize=33, > onDiskSizeWithoutHeader=66485, uncompressedSizeWithoutHeader=66465, > prevBlockOffset=246550939, isUseHBaseChecksum=true, checksumType=CRC32C, > bytesPerChecksum=16384, onDiskDataSizeWithHeader=66498, > getOnDiskSizeWithHeader=66518, totalChecksumBytes=20, isUnpacked=true, > buf=[SingleByteBuff[pos=0, lim=66518, cap= 66551]], > dataBeginsWith=\x00\x00\x00,\x00\x00\x03\xE8\x00\x1A000816, > fileContext=[usesHBaseChecksum=true, checksumType=CRC32C, > bytesPerChecksum=16384, blocksize=65536, encoding=NONE, includesMvcc=true, > includesTags=false, compressAlgo=NONE, compressTags=false, > cryptoContext=[cipher=NONE keyHash=NONE], > name=0d519e7318414362a56e4f41bf63ccd4, > cellComparator=org.apache.hadoop.hbase.CellComparatorImpl@38b972d7], > nextBlockOnDiskSize=66518]_ > _[blockType=DATA, fileOffset=246683975, headerSize=33, > onDiskSizeWithoutHeader=66485, uncompressedSizeWithoutHeader=66465, > prevBlockOffset=246617457, isUseHBaseChecksum=true, checksumType=CRC32C, > bytesPerChecksum=16384, onDiskDataSizeWithHeader=66498, > getOnDiskSizeWithHeader=66518, totalChecksumBytes=20, isUnpacked=true, > buf=[SingleByteBuff[pos=0, lim=66518, cap= 66551]], > dataBeginsWith=\x00\x00\x00,\x00\x00\x03\xE8\x00\x1A000816, > fileContext=[usesHBaseChecksum=true, checksumType=CRC32C, > bytesPerChecksum=16384, blocksize=65536, encoding=NONE, includesMvcc=true, > includesTags=false, compressAlgo=NONE, compressTags=false, > cryptoContext=[cipher=NONE keyHash=NONE], > name=0d519e7318414362a56e4f41bf63ccd4, > cellComparator=org.apache.hadoop.hbase.CellComparatorImpl@38b972d7], > nextBlockOnDiskSize=51744]_ > _[blockType=DATA, fileOffset=246750493, headerSize=33, > onDiskSizeWithoutHeader=51711, uncompressedSizeWithoutHeader=51695, > prevBlockOffset=246683975, isUseHBaseChecksum=true, checksumType=CRC32C, > bytesPerChecksum=16384, onDiskDataSizeWithHeader=51728, > getOnDiskSizeWithHeader=51744, 
totalChecksumBytes=16, isUnpacked=true, > buf=[SingleByteBuff[pos=0, lim=51744, cap= 51777]], > dataBeginsWith=\x00\x00\x00,\x00\x00\x03\xE8\x00\x1A000816, > fileContext=[usesHBaseChecksum=true, checksumType=CRC32C, > bytesPerChecksum=16384, blocksize=65536, encoding=NONE, includesMvcc=true, > includesTags=false, compressAlgo=NONE, compressTags=false, > cryptoContext=[cipher=NONE keyHash=NONE], > name=0d519e7318414362a56e4f41bf63ccd4, > cellComparator=org.apache.hadoop.hbase.CellComparatorImpl@38b972d7], > nextBlockOnDiskSize=20585]_ -- This message was sent by Atlassian Jira (v8.3.4#803005)
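The off-by-one in the loop bound above can be illustrated with a toy model. The offsets and sizes here are hypothetical (no real HFile parsing): blocks are laid out back to back, and the last entry stands for the trailing LEAF_INDEX/BLOOM_CHUNK written after the last DATA block but before the load-on-open section.

```java
// Toy model of the HFilePrettyPrinter scan loop (hypothetical offsets and
// sizes, not real HFile parsing).
public class BlockScanModel {
    // Counts how many blocks the loop visits for a given bound strategy.
    static int scan(long[] blockSizes, long firstDataBlockOffset,
                    long max, boolean inclusive) {
        long offset = firstDataBlockOffset;
        int i = 0;
        int visited = 0;
        while (inclusive ? offset <= max : offset < max) {
            offset += blockSizes[i++]; // block.getOnDiskSizeWithHeader()
            visited++;
        }
        return visited;
    }

    public static void main(String[] args) {
        long[] sizes = {100, 100, 100, 40}; // three DATA blocks + one LEAF_INDEX
        long lastDataBlockOffset = 200;     // trailer.getLastDataBlockOffset()
        long loadOnOpenOffset = 340;        // trailer.getLoadOnOpenDataOffset()
        // Old bound: stops after the last DATA block, missing the index chunk.
        System.out.println(scan(sizes, 0L, lastDataBlockOffset, true));  // prints 3
        // Fixed bound: also walks the trailing LEAF_INDEX/BLOOM_CHUNK.
        System.out.println(scan(sizes, 0L, loadOnOpenOffset, false));    // prints 4
    }
}
```

With `offset <= lastDataBlockOffset` the walk ends after the last DATA block; bounding by the load-on-open offset (exclusive) covers the trailing inline blocks as well.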
[jira] [Updated] (HBASE-26016) HFilePrettyPrinter tool can not print the last LEAF_INDEX block or BLOOM_CHUNK.
[ https://issues.apache.org/jira/browse/HBASE-26016?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] dingwei2019 updated HBASE-26016: Description: when i use pretty printer tool to print the headers of block, i can not get the last LEAF_INDEX block and BLOOM_CHUNK. the last info of the tools is blow: _[blockType=DATA, fileOffset=246617457, headerSize=33, onDiskSizeWithoutHeader=66485, uncompressedSizeWithoutHeader=66465, prevBlockOffset=246550939, isUseHBaseChecksum=true, checksumType=CRC32C, bytesPerChecksum=16384, onDiskDataSizeWithHeader=66498, getOnDiskSizeWithHeader=66518, totalChecksumBytes=20, isUnpacked=true, buf=[SingleByteBuff[pos=0, lim=66518, cap= 66551]], dataBeginsWith=\x00\x00\x00,\x00\x00\x03\xE8\x00\x1A000816, fileContext=[usesHBaseChecksum=true, checksumType=CRC32C, bytesPerChecksum=16384, blocksize=65536, encoding=NONE, includesMvcc=true, includesTags=false, compressAlgo=NONE, compressTags=false, cryptoContext=[cipher=NONE keyHash=NONE], name=0d519e7318414362a56e4f41bf63ccd4, cellComparator=org.apache.hadoop.hbase.CellComparatorImpl@38b972d7], nextBlockOnDiskSize=66518]_ _[blockType=DATA, fileOffset=246683975, headerSize=33, onDiskSizeWithoutHeader=66485, uncompressedSizeWithoutHeader=66465, prevBlockOffset=246617457, isUseHBaseChecksum=true, checksumType=CRC32C, bytesPerChecksum=16384, onDiskDataSizeWithHeader=66498, getOnDiskSizeWithHeader=66518, totalChecksumBytes=20, isUnpacked=true, buf=[SingleByteBuff[pos=0, lim=66518, cap= 66551]], dataBeginsWith=\x00\x00\x00,\x00\x00\x03\xE8\x00\x1A000816, fileContext=[usesHBaseChecksum=true, checksumType=CRC32C, bytesPerChecksum=16384, blocksize=65536, encoding=NONE, includesMvcc=true, includesTags=false, compressAlgo=NONE, compressTags=false, cryptoContext=[cipher=NONE keyHash=NONE], name=0d519e7318414362a56e4f41bf63ccd4, cellComparator=org.apache.hadoop.hbase.CellComparatorImpl@38b972d7], nextBlockOnDiskSize=51744]_ _[blockType=DATA, fileOffset=246750493, headerSize=33, 
onDiskSizeWithoutHeader=51711, uncompressedSizeWithoutHeader=51695, prevBlockOffset=246683975, isUseHBaseChecksum=true, checksumType=CRC32C, bytesPerChecksum=16384, onDiskDataSizeWithHeader=51728, getOnDiskSizeWithHeader=51744, totalChecksumBytes=16, isUnpacked=true, buf=[SingleByteBuff[pos=0, lim=51744, cap= 51777]], dataBeginsWith=\x00\x00\x00,\x00\x00\x03\xE8\x00\x1A000816, fileContext=[usesHBaseChecksum=true, checksumType=CRC32C, bytesPerChecksum=16384, blocksize=65536, encoding=NONE, includesMvcc=true, includesTags=false, compressAlgo=NONE, compressTags=false, cryptoContext=[cipher=NONE keyHash=NONE], name=0d519e7318414362a56e4f41bf63ccd4, cellComparator=org.apache.hadoop.hbase.CellComparatorImpl@38b972d7], nextBlockOnDiskSize=20585]_ was: when i use pretty printer tool to print the headers of block, i can not get the last LEAF_INDEX block and BLOOM_CHUNK. the last info of the tools is blow: > HFilePrettyPrinter tool can not print the last LEAF_INDEX block or > BLOOM_CHUNK. > --- > > Key: HBASE-26016 > URL: https://issues.apache.org/jira/browse/HBASE-26016 > Project: HBase > Issue Type: Bug > Components: regionserver >Affects Versions: 2.1.0, 2.3.2, 2.4.4 >Reporter: dingwei2019 >Assignee: dingwei2019 >Priority: Minor > > when i use pretty printer tool to print the headers of block, i can not get > the last LEAF_INDEX block and BLOOM_CHUNK. 
the last info of the tools is blow: > > _[blockType=DATA, fileOffset=246617457, headerSize=33, > onDiskSizeWithoutHeader=66485, uncompressedSizeWithoutHeader=66465, > prevBlockOffset=246550939, isUseHBaseChecksum=true, checksumType=CRC32C, > bytesPerChecksum=16384, onDiskDataSizeWithHeader=66498, > getOnDiskSizeWithHeader=66518, totalChecksumBytes=20, isUnpacked=true, > buf=[SingleByteBuff[pos=0, lim=66518, cap= 66551]], > dataBeginsWith=\x00\x00\x00,\x00\x00\x03\xE8\x00\x1A000816, > fileContext=[usesHBaseChecksum=true, checksumType=CRC32C, > bytesPerChecksum=16384, blocksize=65536, encoding=NONE, includesMvcc=true, > includesTags=false, compressAlgo=NONE, compressTags=false, > cryptoContext=[cipher=NONE keyHash=NONE], > name=0d519e7318414362a56e4f41bf63ccd4, > cellComparator=org.apache.hadoop.hbase.CellComparatorImpl@38b972d7], > nextBlockOnDiskSize=66518]_ > _[blockType=DATA, fileOffset=246683975, headerSize=33, > onDiskSizeWithoutHeader=66485, uncompressedSizeWithoutHeader=66465, > prevBlockOffset=246617457, isUseHBaseChecksum=true, checksumType=CRC32C, > bytesPerChecksum=16384, onDiskDataSizeWithHeader=66498, > getOnDiskSizeWithHeader=66518, totalChecksumBytes=20, isUnpacked=true, > buf=[SingleByteBuff[pos=0, lim=66518, cap= 66551]], >
[jira] [Updated] (HBASE-26016) HFilePrettyPrinter tool can not print the last LEAF_INDEX block or BLOOM_CHUNK.
[ https://issues.apache.org/jira/browse/HBASE-26016?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] dingwei2019 updated HBASE-26016: Description: When I use the pretty printer tool to print the block headers, I can not get the last LEAF_INDEX block and BLOOM_CHUNK. The last output of the tool is below: was: When I use the pretty printer tool to print the block headers, I can not get the last LEAF_INDEX block and BLOOM_CHUNK. The last output of the tool is below: !errPrint.PNG! > HFilePrettyPrinter tool can not print the last LEAF_INDEX block or > BLOOM_CHUNK. > --- > > Key: HBASE-26016 > URL: https://issues.apache.org/jira/browse/HBASE-26016 > Project: HBase > Issue Type: Bug > Components: regionserver >Affects Versions: 2.1.0, 2.3.2, 2.4.4 >Reporter: dingwei2019 >Assignee: dingwei2019 >Priority: Minor > > When I use the pretty printer tool to print the block headers, I can not get > the last LEAF_INDEX block and BLOOM_CHUNK. The last output of the tool is below: > > -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (HBASE-26016) HFilePrettyPrinter tool can not print the last LEAF_INDEX block or BLOOM_CHUNK.
[ https://issues.apache.org/jira/browse/HBASE-26016?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] dingwei2019 updated HBASE-26016: Description: When I use the pretty printer tool to print the block headers, I can not get the last LEAF_INDEX block and BLOOM_CHUNK. The last output of the tool is below: !errPrint.PNG! was: When I use the pretty printer tool to print the block headers, I can not get the last LEAF_INDEX block and BLOOM_CHUNK. The last output of the tool is below: !errPrint.PNG! > HFilePrettyPrinter tool can not print the last LEAF_INDEX block or > BLOOM_CHUNK. > --- > > Key: HBASE-26016 > URL: https://issues.apache.org/jira/browse/HBASE-26016 > Project: HBase > Issue Type: Bug > Components: regionserver >Affects Versions: 2.1.0, 2.3.2, 2.4.4 >Reporter: dingwei2019 >Assignee: dingwei2019 >Priority: Minor > > When I use the pretty printer tool to print the block headers, I can not get > the last LEAF_INDEX block and BLOOM_CHUNK. The last output of the tool is below: > !errPrint.PNG! -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (HBASE-26016) HFilePrettyPrinter tool can not print the last LEAF_INDEX block or BLOOM_CHUNK.
[ https://issues.apache.org/jira/browse/HBASE-26016?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] dingwei2019 updated HBASE-26016: Description: When I use the pretty printer tool to print the block headers, I can not get the last LEAF_INDEX block and BLOOM_CHUNK. The last output of the tool is below: !errPrint.PNG! was: When I use the pretty printer tool to print the block headers, I can not get the last LEAF_INDEX block and BLOOM_CHUNK. The attachment is named errorPrint.jpg. > HFilePrettyPrinter tool can not print the last LEAF_INDEX block or > BLOOM_CHUNK. > --- > > Key: HBASE-26016 > URL: https://issues.apache.org/jira/browse/HBASE-26016 > Project: HBase > Issue Type: Bug > Components: regionserver >Affects Versions: 2.1.0, 2.3.2, 2.4.4 >Reporter: dingwei2019 >Assignee: dingwei2019 >Priority: Minor > > When I use the pretty printer tool to print the block headers, I can not get > the last LEAF_INDEX block and BLOOM_CHUNK. The last output of the tool is below: > !errPrint.PNG! > -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Created] (HBASE-26016) HFilePrettyPrinter tool can not print the last LEAF_INDEX block or BLOOM_CHUNK.
dingwei2019 created HBASE-26016: --- Summary: HFilePrettyPrinter tool can not print the last LEAF_INDEX block or BLOOM_CHUNK. Key: HBASE-26016 URL: https://issues.apache.org/jira/browse/HBASE-26016 Project: HBase Issue Type: Bug Components: regionserver Affects Versions: 2.4.4, 2.3.2, 2.1.0 Reporter: dingwei2019 Assignee: dingwei2019 When I use the pretty printer tool to print the block headers, I can not get the last LEAF_INDEX block and BLOOM_CHUNK. The attachment is named errorPrint.jpg. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (HBASE-25984) FSHLog WAL lockup with sync future reuse [RS deadlock]
[ https://issues.apache.org/jira/browse/HBASE-25984?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17365837#comment-17365837 ] Hudson commented on HBASE-25984: Results for branch branch-2.4 [build #144 on builds.a.o|https://ci-hadoop.apache.org/job/HBase/job/HBase%20Nightly/job/branch-2.4/144/]: (/) *{color:green}+1 overall{color}* details (if available): (/) {color:green}+1 general checks{color} -- For more information [see general report|https://ci-hadoop.apache.org/job/HBase/job/HBase%20Nightly/job/branch-2.4/144/General_20Nightly_20Build_20Report/] (/) {color:green}+1 jdk8 hadoop2 checks{color} -- For more information [see jdk8 (hadoop2) report|https://ci-hadoop.apache.org/job/HBase/job/HBase%20Nightly/job/branch-2.4/144/JDK8_20Nightly_20Build_20Report_20_28Hadoop2_29/] (/) {color:green}+1 jdk8 hadoop3 checks{color} -- For more information [see jdk8 (hadoop3) report|https://ci-hadoop.apache.org/job/HBase/job/HBase%20Nightly/job/branch-2.4/144/JDK8_20Nightly_20Build_20Report_20_28Hadoop3_29/] (/) {color:green}+1 jdk11 hadoop3 checks{color} -- For more information [see jdk11 report|https://ci-hadoop.apache.org/job/HBase/job/HBase%20Nightly/job/branch-2.4/144/JDK11_20Nightly_20Build_20Report_20_28Hadoop3_29/] (/) {color:green}+1 source release artifact{color} -- See build output for details. 
(/) {color:green}+1 client integration test{color} > FSHLog WAL lockup with sync future reuse [RS deadlock] > -- > > Key: HBASE-25984 > URL: https://issues.apache.org/jira/browse/HBASE-25984 > Project: HBase > Issue Type: Bug > Components: regionserver, wal >Affects Versions: 3.0.0-alpha-1, 1.7.0, 2.5.0, 2.4.5 >Reporter: Bharath Vissapragada >Assignee: Bharath Vissapragada >Priority: Critical > Labels: deadlock, hang > Fix For: 3.0.0-alpha-1, 2.5.0, 2.3.6, 1.7.1, 2.4.5 > > Attachments: HBASE-25984-unit-test.patch > > > We use FSHLog as the WAL implementation (branch-1 based) and under heavy load > we noticed the WAL system gets locked up due to a subtle bug involving racy > code with sync future reuse. This bug applies to all FSHLog implementations > across branches. > Symptoms: > On heavily loaded clusters with large write load we noticed that the region > servers are hanging abruptly with filled up handler queues and stuck MVCC > indicating appends/syncs not making any progress. > {noformat} > WARN [8,queue=9,port=60020] regionserver.MultiVersionConcurrencyControl - > STUCK for : 296000 millis. > MultiVersionConcurrencyControl{readPoint=172383686, writePoint=172383690, > regionName=1ce4003ab60120057734ffe367667dca} > WARN [6,queue=2,port=60020] regionserver.MultiVersionConcurrencyControl - > STUCK for : 296000 millis. > MultiVersionConcurrencyControl{readPoint=171504376, writePoint=171504381, > regionName=7c441d7243f9f504194dae6bf2622631} > {noformat} > All the handlers are stuck waiting for the sync futures and timing out. > {noformat} > java.lang.Object.wait(Native Method) > > org.apache.hadoop.hbase.regionserver.wal.SyncFuture.get(SyncFuture.java:183) > > org.apache.hadoop.hbase.regionserver.wal.FSHLog.blockOnSync(FSHLog.java:1509) > . 
> {noformat} > Log rolling is stuck because it was unable to attain a safe point > {noformat} >java.util.concurrent.CountDownLatch.await(CountDownLatch.java:277) > org.apache.hadoop.hbase.regionserver.wal.FSHLog$SafePointZigZagLatch.waitSafePoint(FSHLog.java:1799) > > org.apache.hadoop.hbase.regionserver.wal.FSHLog.replaceWriter(FSHLog.java:900) > {noformat} > and the Ring buffer consumer thinks that there are some outstanding syncs > that need to finish.. > {noformat} > > org.apache.hadoop.hbase.regionserver.wal.FSHLog$RingBufferEventHandler.attainSafePoint(FSHLog.java:2031) > > org.apache.hadoop.hbase.regionserver.wal.FSHLog$RingBufferEventHandler.onEvent(FSHLog.java:1999) > > org.apache.hadoop.hbase.regionserver.wal.FSHLog$RingBufferEventHandler.onEvent(FSHLog.java:1857) > {noformat} > On the other hand, SyncRunner threads are idle and just waiting for work > implying that there are no pending SyncFutures that need to be run > {noformat} >sun.misc.Unsafe.park(Native Method) > java.util.concurrent.locks.LockSupport.park(LockSupport.java:175) > > java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.await(AbstractQueuedSynchronizer.java:2039) > > java.util.concurrent.LinkedBlockingQueue.take(LinkedBlockingQueue.java:442) > > org.apache.hadoop.hbase.regionserver.wal.FSHLog$SyncRunner.run(FSHLog.java:1297) > java.lang.Thread.run(Thread.java:748) > {noformat} > Overall the WAL system is dead locked and could make no progress until it was > aborted. I got to the bottom of this issue and have a patch that can fix it (more details in the comments due to word limit in the description). -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Assigned] (HBASE-26015) Should implement getRegionServers(boolean) method in AsyncAdmin
[ https://issues.apache.org/jira/browse/HBASE-26015?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zhuoyue Huang reassigned HBASE-26015: - Assignee: Zhuoyue Huang > Should implement getRegionServers(boolean) method in AsyncAdmin > --- > > Key: HBASE-26015 > URL: https://issues.apache.org/jira/browse/HBASE-26015 > Project: HBase > Issue Type: Task > Components: Admin, Client >Reporter: Duo Zhang >Assignee: Zhuoyue Huang >Priority: Major > > We have this method in Admin but not in AsyncAdmin, we should align these two > interfaces. -- This message was sent by Atlassian Jira (v8.3.4#803005)
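As a sketch of what the alignment could look like: the overload below mirrors the synchronous Admin method's boolean flag. The interface name, the String server names, and the filter-against-decommissioned-list strategy are illustrative assumptions for this example, not the actual HBase AsyncAdmin implementation.

```java
import java.util.Collection;
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.stream.Collectors;

public class AsyncAdminAlignDemo {
    interface MiniAsyncAdmin {
        CompletableFuture<Collection<String>> getRegionServers();
        CompletableFuture<Collection<String>> listDecommissionedRegionServers();

        // The new overload: optionally exclude decommissioned servers,
        // matching the synchronous Admin semantics.
        default CompletableFuture<Collection<String>> getRegionServers(
                boolean excludeDecommissioned) {
            if (!excludeDecommissioned) {
                return getRegionServers();
            }
            return getRegionServers().thenCombine(
                listDecommissionedRegionServers(),
                (all, dec) -> all.stream()
                    .filter(s -> !dec.contains(s))
                    .collect(Collectors.toList()));
        }
    }

    // A stub backed by fixed data, standing in for a live cluster.
    static MiniAsyncAdmin demo() {
        return new MiniAsyncAdmin() {
            @Override
            public CompletableFuture<Collection<String>> getRegionServers() {
                return CompletableFuture.completedFuture(List.of("rs1", "rs2", "rs3"));
            }
            @Override
            public CompletableFuture<Collection<String>> listDecommissionedRegionServers() {
                return CompletableFuture.completedFuture(List.of("rs2"));
            }
        };
    }

    public static void main(String[] args) {
        System.out.println(demo().getRegionServers(true).join()); // [rs1, rs3]
    }
}
```

A default method keeps the two interfaces aligned without forcing every existing AsyncAdmin implementation to add the overload at once.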
[jira] [Updated] (HBASE-25998) Revisit synchronization in SyncFuture
[ https://issues.apache.org/jira/browse/HBASE-25998?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bharath Vissapragada updated HBASE-25998: - Fix Version/s: 2.4.5 2.3.6 2.5.0 3.0.0-alpha-1 Resolution: Fixed Status: Resolved (was: Patch Available) > Revisit synchronization in SyncFuture > - > > Key: HBASE-25998 > URL: https://issues.apache.org/jira/browse/HBASE-25998 > Project: HBase > Issue Type: Improvement > Components: Performance, regionserver, wal >Affects Versions: 3.0.0-alpha-1, 1.7.0, 2.5.0 >Reporter: Bharath Vissapragada >Assignee: Bharath Vissapragada >Priority: Major > Fix For: 3.0.0-alpha-1, 2.5.0, 2.3.6, 2.4.5 > > Attachments: monitor-overhead-1.png, monitor-overhead-2.png > > > While working on HBASE-25984, I noticed some weird frames in the flame graphs > around monitor entry exit consuming a lot of CPU cycles (see attached > images). Noticed that the synchronization there is too coarse grained and > sometimes unnecessary. I did a simple patch that switched to a reentrant lock > based synchronization with condition variable rather than a busy wait and > that showed 70-80% increased throughput in WAL PE. Seems too good to be > true.. (more details in the comments). -- This message was sent by Atlassian Jira (v8.3.4#803005)
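The change described above (a ReentrantLock plus condition variable instead of coarse synchronized sections with a busy wait) can be sketched roughly as follows. The class and method names are illustrative and heavily simplified, not the actual org.apache.hadoop.hbase.regionserver.wal.SyncFuture API.

```java
import java.util.concurrent.TimeUnit;
import java.util.concurrent.locks.Condition;
import java.util.concurrent.locks.ReentrantLock;

// Simplified sync-future: waiters park on a Condition until the sync
// runner marks the future done, rather than spinning under a monitor.
public class SimpleSyncFuture {
    private final ReentrantLock lock = new ReentrantLock();
    private final Condition completed = lock.newCondition();
    private long doneTxid = -1; // -1 means the sync has not finished yet

    // Called by the sync runner thread once the WAL sync reaches txid.
    public void done(long txid) {
        lock.lock();
        try {
            doneTxid = txid;
            completed.signalAll();
        } finally {
            lock.unlock();
        }
    }

    // Called by handler threads; parks until done() fires or the timeout hits.
    public long get(long timeoutMs) throws InterruptedException {
        lock.lock();
        try {
            while (doneTxid < 0) {
                if (!completed.await(timeoutMs, TimeUnit.MILLISECONDS)) {
                    throw new RuntimeException("timed out waiting for sync");
                }
            }
            return doneTxid;
        } finally {
            lock.unlock();
        }
    }

    public static void main(String[] args) throws InterruptedException {
        SimpleSyncFuture f = new SimpleSyncFuture();
        new Thread(() -> f.done(42L)).start();
        System.out.println(f.get(5000L)); // prints 42
    }
}
```

The lock is held only across the state check and update, so waiters consume no CPU while parked, which is consistent with the throughput gain reported above.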
[jira] [Updated] (HBASE-25984) FSHLog WAL lockup with sync future reuse [RS deadlock]
[ https://issues.apache.org/jira/browse/HBASE-25984?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bharath Vissapragada updated HBASE-25984: - Resolution: Fixed Status: Resolved (was: Patch Available) > FSHLog WAL lockup with sync future reuse [RS deadlock] > -- > > Key: HBASE-25984 > URL: https://issues.apache.org/jira/browse/HBASE-25984 > Project: HBase > Issue Type: Bug > Components: regionserver, wal >Affects Versions: 3.0.0-alpha-1, 1.7.0, 2.5.0, 2.4.5 >Reporter: Bharath Vissapragada >Assignee: Bharath Vissapragada >Priority: Critical > Labels: deadlock, hang > Fix For: 3.0.0-alpha-1, 2.5.0, 2.3.6, 1.7.1, 2.4.5 > > Attachments: HBASE-25984-unit-test.patch > > > We use FSHLog as the WAL implementation (branch-1 based) and under heavy load > we noticed the WAL system gets locked up due to a subtle bug involving racy > code with sync future reuse. This bug applies to all FSHLog implementations > across branches. > Symptoms: > On heavily loaded clusters with large write load we noticed that the region > servers are hanging abruptly with filled up handler queues and stuck MVCC > indicating appends/syncs not making any progress. > {noformat} > WARN [8,queue=9,port=60020] regionserver.MultiVersionConcurrencyControl - > STUCK for : 296000 millis. > MultiVersionConcurrencyControl{readPoint=172383686, writePoint=172383690, > regionName=1ce4003ab60120057734ffe367667dca} > WARN [6,queue=2,port=60020] regionserver.MultiVersionConcurrencyControl - > STUCK for : 296000 millis. > MultiVersionConcurrencyControl{readPoint=171504376, writePoint=171504381, > regionName=7c441d7243f9f504194dae6bf2622631} > {noformat} > All the handlers are stuck waiting for the sync futures and timing out. > {noformat} > java.lang.Object.wait(Native Method) > > org.apache.hadoop.hbase.regionserver.wal.SyncFuture.get(SyncFuture.java:183) > > org.apache.hadoop.hbase.regionserver.wal.FSHLog.blockOnSync(FSHLog.java:1509) > . 
> {noformat} > Log rolling is stuck because it was unable to attain a safe point > {noformat} >java.util.concurrent.CountDownLatch.await(CountDownLatch.java:277) > org.apache.hadoop.hbase.regionserver.wal.FSHLog$SafePointZigZagLatch.waitSafePoint(FSHLog.java:1799) > > org.apache.hadoop.hbase.regionserver.wal.FSHLog.replaceWriter(FSHLog.java:900) > {noformat} > and the Ring buffer consumer thinks that there are some outstanding syncs > that need to finish. > {noformat} > > org.apache.hadoop.hbase.regionserver.wal.FSHLog$RingBufferEventHandler.attainSafePoint(FSHLog.java:2031) > > org.apache.hadoop.hbase.regionserver.wal.FSHLog$RingBufferEventHandler.onEvent(FSHLog.java:1999) > > org.apache.hadoop.hbase.regionserver.wal.FSHLog$RingBufferEventHandler.onEvent(FSHLog.java:1857) > {noformat} > On the other hand, SyncRunner threads are idle and just waiting for work, > implying that there are no pending SyncFutures that need to be run > {noformat} >sun.misc.Unsafe.park(Native Method) > java.util.concurrent.locks.LockSupport.park(LockSupport.java:175) > > java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.await(AbstractQueuedSynchronizer.java:2039) > > java.util.concurrent.LinkedBlockingQueue.take(LinkedBlockingQueue.java:442) > > org.apache.hadoop.hbase.regionserver.wal.FSHLog$SyncRunner.run(FSHLog.java:1297) > java.lang.Thread.run(Thread.java:748) > {noformat} > Overall the WAL system is deadlocked and could make no progress until it was > aborted. I got to the bottom of this issue and have a patch that can fix it > (more details in the comments due to word limit in the description). -- This message was sent by Atlassian Jira (v8.3.4#803005)
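The failure mode described above (handlers waiting on sync futures while the SyncRunner threads sit idle) comes down to a future being recycled for a new transaction without the new sync ever being handed to a runner. The following is a deliberately simplified, hypothetical sketch of that hazard; the `SketchSyncFuture` class and its txid bookkeeping are illustrative stand-ins, not HBase's actual SyncFuture code:

```java
public class PrematureReuse {
    // Hypothetical stand-in for a recyclable sync future keyed by txid.
    static class SketchSyncFuture {
        private long txid;     // the sync this future currently represents
        private long doneTxid; // highest txid observed complete

        void reset(long newTxid) { txid = newTxid; }   // recycle for a new sync
        void done(long completedTxid) { doneTxid = Math.max(doneTxid, completedTxid); }
        boolean isDone() { return doneTxid >= txid; }
    }

    public static void main(String[] args) {
        SketchSyncFuture f = new SketchSyncFuture();
        f.reset(1);  // a handler issues a sync for txid 1
        f.done(1);   // a SyncRunner completes it
        System.out.println("after txid 1: done = " + f.isDone());

        // Premature reuse: the future is recycled for txid 2, but no
        // SyncRunner is ever handed the recycled future, so done(2) never
        // arrives. Waiters block indefinitely and log roll cannot reach a
        // safe point, matching the stack traces above.
        f.reset(2);
        System.out.println("after reuse for txid 2: done = " + f.isDone());
    }
}
```

Running this prints `done = true` for txid 1 and `done = false` after the reuse, mirroring the stuck-waiter symptom.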
[jira] [Updated] (HBASE-25984) FSHLog WAL lockup with sync future reuse [RS deadlock]
[ https://issues.apache.org/jira/browse/HBASE-25984?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bharath Vissapragada updated HBASE-25984: - Fix Version/s: 2.4.5 1.7.1 2.3.6 2.5.0 3.0.0-alpha-1
[GitHub] [hbase] bharathv merged pull request #3398: HBASE-25984: Avoid premature reuse of sync futures in FSHLog (#3371)
bharathv merged pull request #3398: URL: https://github.com/apache/hbase/pull/3398 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [hbase] Apache-HBase commented on pull request #3398: HBASE-25984: Avoid premature reuse of sync futures in FSHLog (#3371)
Apache-HBase commented on pull request #3398: URL: https://github.com/apache/hbase/pull/3398#issuecomment-864327048 :confetti_ball: **+1 overall** | Vote | Subsystem | Runtime | Comment | |::|--:|:|:| | +0 :ok: | reexec | 0m 32s | Docker mode activated. | ||| _ Prechecks _ | | +1 :green_heart: | dupname | 0m 0s | No case conflicting files found. | | +1 :green_heart: | hbaseanti | 0m 0s | Patch does not have any anti-patterns. | | +1 :green_heart: | @author | 0m 0s | The patch does not contain any @author tags. | | +1 :green_heart: | test4tests | 0m 0s | The patch appears to include 2 new or modified test files. | ||| _ branch-1 Compile Tests _ | | +1 :green_heart: | mvninstall | 9m 50s | branch-1 passed | | +1 :green_heart: | compile | 0m 40s | branch-1 passed with JDK Azul Systems, Inc.-1.8.0_262-b19 | | +1 :green_heart: | compile | 0m 43s | branch-1 passed with JDK Azul Systems, Inc.-1.7.0_272-b10 | | +1 :green_heart: | checkstyle | 1m 43s | branch-1 passed | | +1 :green_heart: | shadedjars | 3m 5s | branch has no errors when building our shaded downstream artifacts. | | +1 :green_heart: | javadoc | 0m 48s | branch-1 passed with JDK Azul Systems, Inc.-1.8.0_262-b19 | | +1 :green_heart: | javadoc | 0m 41s | branch-1 passed with JDK Azul Systems, Inc.-1.7.0_272-b10 | | +0 :ok: | spotbugs | 3m 1s | Used deprecated FindBugs config; considering switching to SpotBugs. | | +1 :green_heart: | findbugs | 2m 58s | branch-1 passed | ||| _ Patch Compile Tests _ | | +1 :green_heart: | mvninstall | 1m 53s | the patch passed | | +1 :green_heart: | compile | 0m 41s | the patch passed with JDK Azul Systems, Inc.-1.8.0_262-b19 | | +1 :green_heart: | javac | 0m 41s | the patch passed | | +1 :green_heart: | compile | 0m 44s | the patch passed with JDK Azul Systems, Inc.-1.7.0_272-b10 | | +1 :green_heart: | javac | 0m 44s | the patch passed | | +1 :green_heart: | checkstyle | 1m 32s | the patch passed | | +1 :green_heart: | whitespace | 0m 0s | The patch has no whitespace issues. 
| | +1 :green_heart: | shadedjars | 2m 49s | patch has no errors when building our shaded downstream artifacts. | | +1 :green_heart: | hadoopcheck | 4m 29s | Patch does not cause any errors with Hadoop 2.8.5 2.9.2. | | +1 :green_heart: | javadoc | 0m 31s | the patch passed with JDK Azul Systems, Inc.-1.8.0_262-b19 | | +1 :green_heart: | javadoc | 0m 41s | the patch passed with JDK Azul Systems, Inc.-1.7.0_272-b10 | | +1 :green_heart: | findbugs | 2m 52s | the patch passed | ||| _ Other Tests _ | | +1 :green_heart: | unit | 141m 52s | hbase-server in the patch passed. | | +1 :green_heart: | asflicense | 0m 39s | The patch does not generate ASF License warnings. | | | | 183m 5s | | | Subsystem | Report/Notes | |--:|:-| | Docker | ClientAPI=1.41 ServerAPI=1.41 base: https://ci-hadoop.apache.org/job/HBase/job/HBase-PreCommit-GitHub-PR/job/PR-3398/3/artifact/out/Dockerfile | | GITHUB PR | https://github.com/apache/hbase/pull/3398 | | JIRA Issue | HBASE-25984 | | Optional Tests | dupname asflicense javac javadoc unit spotbugs findbugs shadedjars hadoopcheck hbaseanti checkstyle compile | | uname | Linux 68f3d4528118 4.15.0-136-generic #140-Ubuntu SMP Thu Jan 28 05:20:47 UTC 2021 x86_64 x86_64 x86_64 GNU/Linux | | Build tool | maven | | Personality | /home/jenkins/jenkins-agent/workspace/Base-PreCommit-GitHub-PR_PR-3398/out/precommit/personality/provided.sh | | git revision | branch-1 / a40f458 | | Default Java | Azul Systems, Inc.-1.7.0_272-b10 | | Multi-JDK versions | /usr/lib/jvm/zulu-8-amd64:Azul Systems, Inc.-1.8.0_262-b19 /usr/lib/jvm/zulu-7-amd64:Azul Systems, Inc.-1.7.0_272-b10 | | Test Results | https://ci-hadoop.apache.org/job/HBase/job/HBase-PreCommit-GitHub-PR/job/PR-3398/3/testReport/ | | Max. process+thread count | 4305 (vs. 
ulimit of 1) | | modules | C: hbase-server U: hbase-server | | Console output | https://ci-hadoop.apache.org/job/HBase/job/HBase-PreCommit-GitHub-PR/job/PR-3398/3/console | | versions | git=1.9.1 maven=3.0.5 findbugs=3.0.1 | | Powered by | Apache Yetus 0.12.0 https://yetus.apache.org | This message was automatically generated.
[GitHub] [hbase] Apache-HBase commented on pull request #3398: HBASE-25984: Avoid premature reuse of sync futures in FSHLog (#3371)
Apache-HBase commented on pull request #3398: URL: https://github.com/apache/hbase/pull/3398#issuecomment-863526314
[GitHub] [hbase] Apache9 commented on a change in pull request #3397: HBASE-26012 Improve logging and dequeue logic in DelayQueue
Apache9 commented on a change in pull request #3397: URL: https://github.com/apache/hbase/pull/3397#discussion_r654542588 ## File path: hbase-procedure/src/main/java/org/apache/hadoop/hbase/procedure2/util/DelayedUtil.java ## @@ -79,7 +84,13 @@ public String toString() { */ public static E takeWithoutInterrupt(final DelayQueue queue) { try { - return queue.take(); + E element = queue.poll(10, TimeUnit.SECONDS); + if (element == null && queue.size() > 0) { +LOG.error("DelayQueue is not empty when timed waiting elapsed. If this is repeated for" Review comment: This may be too aggressive? Why choose 10 seconds as the timeout here?
[GitHub] [hbase] bharathv merged pull request #3400: HBASE-25998: Redo synchronization in SyncFuture
bharathv merged pull request #3400: URL: https://github.com/apache/hbase/pull/3400
[GitHub] [hbase] Apache-HBase commented on pull request #3397: HBASE-26012 Improve logging and dequeue logic in DelayQueue
Apache-HBase commented on pull request #3397: URL: https://github.com/apache/hbase/pull/3397#issuecomment-863442939
[GitHub] [hbase] tomscut removed a comment on pull request #3325: HBASE-25934 Add username for RegionScannerHolder
tomscut removed a comment on pull request #3325: URL: https://github.com/apache/hbase/pull/3325#issuecomment-856481326 Hi @saintstack, could you please take a look and merge the code? Thank you.
[GitHub] [hbase] bharathv opened a new pull request #3401: HBASE-25998: Redo synchronization in SyncFuture
bharathv opened a new pull request #3401: URL: https://github.com/apache/hbase/pull/3401 SyncFuture currently uses a coarse-grained synchronized approach that seems to create a lot of contention. This patch: - Uses a reentrant lock instead of a synchronized monitor - Switches to condition-variable-based waiting rather than a busy wait - Removes synchronization for unnecessary fields Signed-off-by: Michael Stack Signed-off-by: Andrew Purtell Signed-off-by: Duo Zhang Signed-off-by: Viraj Jasani (cherry picked from commit 6bafb596421974717697b28d0856453245759c15)
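The pattern the PR describes (a ReentrantLock plus a condition variable in place of a synchronized monitor with busy waiting) can be sketched as follows. This is an illustrative minimal sketch, not the actual SyncFuture code; the class and method names are assumptions:

```java
import java.util.concurrent.TimeUnit;
import java.util.concurrent.locks.Condition;
import java.util.concurrent.locks.ReentrantLock;

public class CondVarSketch {
    private final ReentrantLock lock = new ReentrantLock();
    private final Condition done = lock.newCondition();
    private boolean completed;

    public void markCompleted() {
        lock.lock();
        try {
            completed = true;
            done.signalAll(); // wake every waiter exactly once
        } finally {
            lock.unlock();
        }
    }

    // Waits until markCompleted() is called or the timeout elapses;
    // returns true if completion was observed in time.
    public boolean await(long timeoutMs) {
        lock.lock();
        try {
            long nanos = TimeUnit.MILLISECONDS.toNanos(timeoutMs);
            while (!completed) {                 // guard against spurious wakeups
                if (nanos <= 0) {
                    return false;
                }
                nanos = done.awaitNanos(nanos);  // releases the lock while parked
            }
            return true;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        } finally {
            lock.unlock();
        }
    }

    public static void main(String[] args) throws InterruptedException {
        CondVarSketch s = new CondVarSketch();
        Thread t = new Thread(s::markCompleted);
        t.start();
        // The waiter parks on the condition instead of spinning, and is
        // woken promptly by the signal.
        System.out.println("completed in time = " + s.await(5000));
        t.join();
    }
}
```

Unlike a busy-wait loop under `synchronized`, the waiter here parks on the condition and consumes no CPU until signalled, which is where the contention reduction would come from.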
[GitHub] [hbase] bharathv merged pull request #3394: HBASE-25984: Avoid premature reuse of sync futures in FSHLog (#3371)
bharathv merged pull request #3394: URL: https://github.com/apache/hbase/pull/3394
[GitHub] [hbase] vli02 opened a new pull request #3402: HBASE-25130 - Fix master in-memory server holding map after:
vli02 opened a new pull request #3402: URL: https://github.com/apache/hbase/pull/3402 hbck fixes and offlines some regions.
[GitHub] [hbase] bharathv merged pull request #3382: HBASE-25998: Redo synchronization in SyncFuture
bharathv merged pull request #3382: URL: https://github.com/apache/hbase/pull/3382
[GitHub] [hbase] Apache-HBase commented on pull request #3401: HBASE-25998: Redo synchronization in SyncFuture
Apache-HBase commented on pull request #3401: URL: https://github.com/apache/hbase/pull/3401#issuecomment-863548708
[GitHub] [hbase] Apache-HBase commented on pull request #3385: HBASE-26001 When turn on access control, the cell level TTL of Increment and Append operations is invalid
Apache-HBase commented on pull request #3385: URL: https://github.com/apache/hbase/pull/3385#issuecomment-863451788
[GitHub] [hbase] Apache9 merged pull request #3388: HBASE-26005 Update ref guide about the EOL for 2.2.x
Apache9 merged pull request #3388: URL: https://github.com/apache/hbase/pull/3388
[GitHub] [hbase] bharathv opened a new pull request #3398: HBASE-25984: Avoid premature reuse of sync futures in FSHLog (#3371)
bharathv opened a new pull request #3398: URL: https://github.com/apache/hbase/pull/3398 Signed-off-by: Viraj Jasani vjas...@apache.org (cherry picked from commit 5a19bcf)
[GitHub] [hbase] Apache-HBase commented on pull request #3394: HBASE-25984: Avoid premature reuse of sync futures in FSHLog (#3371)
Apache-HBase commented on pull request #3394: URL: https://github.com/apache/hbase/pull/3394#issuecomment-863507742
[GitHub] [hbase] virajjasani opened a new pull request #3397: HBASE-26012 Improve logging and dequeue logic in DelayQueue
virajjasani opened a new pull request #3397: URL: https://github.com/apache/hbase/pull/3397
[GitHub] [hbase] Apache9 commented on pull request #3388: HBASE-26005 Update ref guide about the EOL for 2.2.x
Apache9 commented on pull request #3388: URL: https://github.com/apache/hbase/pull/3388#issuecomment-864139944 Oh, seems something wrong with the jenkins website, the table in the original ref guide is also empty. Let me merge.
[GitHub] [hbase] Reidddddd commented on pull request #3387: HBASE-26004: port HBASE-26001 (cell level tags invisible in atomic operations when access control is on) to branch-1
Reidddddd commented on pull request #3387: URL: https://github.com/apache/hbase/pull/3387#issuecomment-863986392 Please fix the findbugs and checkstyle warnings @YutSean, thx
[GitHub] [hbase] Apache-HBase commented on pull request #3215: HBASE-25698 Fixing IllegalReferenceCountException when using TinyLfuBlockCache
Apache-HBase commented on pull request #3215: URL: https://github.com/apache/hbase/pull/3215#issuecomment-863959397
[GitHub] [hbase] Apache-HBase commented on pull request #3392: HBASE-25984: Avoid premature reuse of sync futures in FSHLog (#3371)
Apache-HBase commented on pull request #3392: URL: https://github.com/apache/hbase/pull/3392#issuecomment-863506222
[GitHub] [hbase] YutSean opened a new pull request #3403: HBASE-26001 When turn on access control, the cell level TTL of Increment and Append operations is invalid
YutSean opened a new pull request #3403: URL: https://github.com/apache/hbase/pull/3403 https://issues.apache.org/jira/browse/HBASE-26001
[GitHub] [hbase] virajjasani commented on a change in pull request #3397: HBASE-26012 Improve logging and dequeue logic in DelayQueue
virajjasani commented on a change in pull request #3397: URL: https://github.com/apache/hbase/pull/3397#discussion_r654562565 ## File path: hbase-procedure/src/main/java/org/apache/hadoop/hbase/procedure2/util/DelayedUtil.java ## @@ -79,7 +84,13 @@ public String toString() { */ public static E takeWithoutInterrupt(final DelayQueue queue) { try { - return queue.take(); + E element = queue.poll(10, TimeUnit.SECONDS); + if (element == null && queue.size() > 0) { +LOG.error("DelayQueue is not empty when timed waiting elapsed. If this is repeated for" Review comment: Since the default value of `hbase.procedure.remote.dispatcher.delay.msec` is just 150, I thought 10s might be enough. But I am open to keeping it higher. What do you think would be a better value? Maybe 25/30s or 60s?
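The dequeue pattern under review (a timed `poll()` replacing a blocking `take()`, with a warning when elements sit in the queue past the timeout) can be sketched like this. The timeout value and log wording below are placeholders, since those are exactly what the review is still settling, and the `Task` class is an illustrative stand-in for the queued elements:

```java
import java.util.concurrent.DelayQueue;
import java.util.concurrent.Delayed;
import java.util.concurrent.TimeUnit;

public class TimedDequeue {
    // Minimal Delayed element: becomes ready after delayMs milliseconds.
    static class Task implements Delayed {
        final long fireAtNanos;
        Task(long delayMs) {
            fireAtNanos = System.nanoTime() + TimeUnit.MILLISECONDS.toNanos(delayMs);
        }
        @Override public long getDelay(TimeUnit unit) {
            return unit.convert(fireAtNanos - System.nanoTime(), TimeUnit.NANOSECONDS);
        }
        @Override public int compareTo(Delayed o) {
            return Long.compare(getDelay(TimeUnit.NANOSECONDS), o.getDelay(TimeUnit.NANOSECONDS));
        }
    }

    // Returns the next expired task, or null if the timed wait elapsed.
    static Task pollWithStuckCheck(DelayQueue<Task> queue, long timeoutMs) {
        try {
            Task t = queue.poll(timeoutMs, TimeUnit.MILLISECONDS);
            if (t == null && !queue.isEmpty()) {
                // Elements are present but none became ready within the
                // timeout; repeated hits here would suggest a stuck queue.
                System.err.println("DelayQueue not empty after timed wait elapsed");
            }
            return t;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return null;
        }
    }

    public static void main(String[] args) {
        DelayQueue<Task> queue = new DelayQueue<>();
        queue.add(new Task(10)); // ready after 10 ms
        Task t = pollWithStuckCheck(queue, 500);
        System.out.println("dequeued = " + (t != null));
    }
}
```

The trade-off being debated is the timeout length: a shorter one detects a stuck queue sooner but wakes the dispatcher thread more often when the queue is legitimately idle.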
[GitHub] [hbase] Apache-HBase commented on pull request #3403: HBASE-26001 When turn on access control, the cell level TTL of Increment and Append operations is invalid
Apache-HBase commented on pull request #3403: URL: https://github.com/apache/hbase/pull/3403#issuecomment-863825776
[GitHub] [hbase] Apache-HBase commented on pull request #3400: HBASE-25998: Redo synchronization in SyncFuture
Apache-HBase commented on pull request #3400: URL: https://github.com/apache/hbase/pull/3400#issuecomment-863553502
[GitHub] [hbase] rda3mon commented on pull request #3359: HBASE-25891 remove dependence storing wal filenames for backup
rda3mon commented on pull request #3359: URL: https://github.com/apache/hbase/pull/3359#issuecomment-863812182 @Apache9 Can you review this?
[GitHub] [hbase] bharathv merged pull request #3401: HBASE-25998: Redo synchronization in SyncFuture
bharathv merged pull request #3401: URL: https://github.com/apache/hbase/pull/3401
[GitHub] [hbase] bharathv merged pull request #3393: HBASE-25984: Avoid premature reuse of sync futures in FSHLog (#3371)
bharathv merged pull request #3393: URL: https://github.com/apache/hbase/pull/3393
[GitHub] [hbase] Apache-HBase commented on pull request #3402: HBASE-25130 - Fix master in-memory server holding map after:
Apache-HBase commented on pull request #3402: URL: https://github.com/apache/hbase/pull/3402#issuecomment-863654404
[GitHub] [hbase] Apache-HBase commented on pull request #3387: HBASE-26004: port HBASE-26001 (cell level tags invisible in atomic operations when access control is on) to branch-1
Apache-HBase commented on pull request #3387: URL: https://github.com/apache/hbase/pull/3387#issuecomment-863981047
[GitHub] [hbase] Apache-HBase commented on pull request #3360: HBASE-25975 Row Commit Sequencer
Apache-HBase commented on pull request #3360: URL: https://github.com/apache/hbase/pull/3360#issuecomment-863632737
[GitHub] [hbase] virajjasani commented on a change in pull request #3215: HBASE-25698 Fixing IllegalReferenceCountException when using TinyLfuBlockCache
virajjasani commented on a change in pull request #3215: URL: https://github.com/apache/hbase/pull/3215#discussion_r654285359 ## File path: hbase-server/src/main/java/org/apache/hadoop/hbase/io/hfile/TinyLfuBlockCache.java ## @@ -171,8 +177,10 @@ public Cacheable getBlock(BlockCacheKey cacheKey, if ((value != null) && caching) { if ((value instanceof HFileBlock) && ((HFileBlock) value).isSharedMem()) { value = HFileBlock.deepCloneOnHeap((HFileBlock) value); +cacheBlockUtil(cacheKey, value, true); Review comment: > U can do the deepclone in asReferencedHeapBlock() only based on isSharedMem right? retain() call is anyways needed LRUBlockCache does not perform block.retain() if block is cloned: ``` * 1. if cache the cloned heap block, its refCnt is an totally new one, it's easy to handle; * 2. if cache the original heap block, we're sure that it won't be tracked in ByteBuffAllocator's * reservoir, if both RPC and LRUBlockCache release the block, then it can be garbage collected by * JVM, so need a retain here. ``` ``` private Cacheable asReferencedHeapBlock(Cacheable buf) { if (buf instanceof HFileBlock) { HFileBlock blk = ((HFileBlock) buf); if (blk.isSharedMem()) { return HFileBlock.deepCloneOnHeap(blk); } } // The block will be referenced by this LRUBlockCache, so should increase its refCnt here. return buf.retain(); } ``` ## File path: hbase-server/src/main/java/org/apache/hadoop/hbase/io/hfile/TinyLfuBlockCache.java ## @@ -171,8 +177,10 @@ public Cacheable getBlock(BlockCacheKey cacheKey, if ((value != null) && caching) { if ((value instanceof HFileBlock) && ((HFileBlock) value).isSharedMem()) { value = HFileBlock.deepCloneOnHeap((HFileBlock) value); +cacheBlockUtil(cacheKey, value, true); Review comment: @anoopsjohn This is simplified now. 
## File path: hbase-server/src/main/java/org/apache/hadoop/hbase/io/hfile/TinyLfuBlockCache.java ## @@ -188,21 +196,58 @@ public void cacheBlock(BlockCacheKey cacheKey, Cacheable value, boolean inMemory @Override public void cacheBlock(BlockCacheKey key, Cacheable value) { +cacheBlockUtil(key, value, false); + } + + private void cacheBlockUtil(BlockCacheKey key, Cacheable value, boolean deepClonedOnHeap) { Review comment: Done
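The rule in the quoted `asReferencedHeapBlock()` snippet (deep-clone blocks backed by shared pooled memory so the cache owns an independent copy; otherwise just bump the reference count) can be illustrated with a toy model. The `Block` class below is a hypothetical stand-in, not HBase's real `HFileBlock`/`Cacheable` types:

```java
public class CacheRetainSketch {
    // Toy ref-counted block; a real HFileBlock tracks this via its RefCnt.
    static class Block {
        final boolean sharedMem;
        int refCnt = 1;
        Block(boolean sharedMem) { this.sharedMem = sharedMem; }
        Block retain() { refCnt++; return this; }
        Block deepCloneOnHeap() { return new Block(false); } // fresh copy, refCnt = 1
    }

    // What the cache should hold on to for a block being inserted.
    static Block asReferencedHeapBlock(Block b) {
        if (b.sharedMem) {
            // Cloned copy: a brand-new refCnt, independent of the RPC path's
            // reservoir-backed buffer.
            return b.deepCloneOnHeap();
        }
        // Original heap block: the cache becomes an extra referent, so it
        // must retain; when both RPC and cache release, GC reclaims it.
        return b.retain();
    }

    public static void main(String[] args) {
        Block shared = new Block(true);
        Block cached = asReferencedHeapBlock(shared);
        System.out.println("clone? " + (cached != shared)
            + ", orig refCnt = " + shared.refCnt);

        Block heap = new Block(false);
        Block cached2 = asReferencedHeapBlock(heap);
        System.out.println("same? " + (cached2 == heap)
            + ", refCnt = " + heap.refCnt);
    }
}
```

This prints `clone? true, orig refCnt = 1` for the shared-memory block and `same? true, refCnt = 2` for the heap block: the clone path never touches the original's count, while the retain path deliberately adds a reference for the cache.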
[GitHub] [hbase] d-c-manning commented on a change in pull request #3402: HBASE-25130 - Fix master in-memory server holding map after:
d-c-manning commented on a change in pull request #3402: URL: https://github.com/apache/hbase/pull/3402#discussion_r654022634 ## File path: hbase-server/src/main/java/org/apache/hadoop/hbase/master/AssignmentManager.java ## @@ -1661,6 +1661,17 @@ public void regionOffline(final HRegionInfo regionInfo) { regionOffline(regionInfo, null); } + /** + * Marks the region as offline. In addition whether removing it from + * replicas and master in-memory server holding map. + * + * @param regionInfo + * @param force Review comment: let's add some descriptive text about why `force` would be used. Specifically that no known use case should have to use it except hbck, which desires to force a region offline and not have it ever be reopened on another server. ## File path: hbase-server/src/test/java/org/apache/hadoop/hbase/util/TestHBaseFsck.java ## @@ -882,6 +882,80 @@ public void testDupeStartKey() throws Exception { assertNoErrors(hbck2); assertEquals(0, hbck2.getOverlapGroups(table).size()); assertEquals(ROWKEYS.length, countRows()); + + MiniHBaseCluster cluster = TEST_UTIL.getHBaseCluster(); + long totalRegions = cluster.countServedRegions(); + + // stop a region servers and run fsck again + cluster.stopRegionServer(server); + cluster.waitForRegionServerToStop(server, 60); + + // wait for all regions to come online. + while (cluster.countServedRegions() < totalRegions) { +try { + Thread.sleep(100); +} catch (InterruptedException e) {} Review comment: does this `InterruptedException` need to be caught? Can't the test method throw it?
[GitHub] [hbase] Reidddddd merged pull request #3385: HBASE-26001 When turn on access control, the cell level TTL of Increment and Append operations is invalid
Reidddddd merged pull request #3385: URL: https://github.com/apache/hbase/pull/3385
[GitHub] [hbase] anoopsjohn commented on a change in pull request #3215: HBASE-25698 Fixing IllegalReferenceCountException when using TinyLfuBlockCache
anoopsjohn commented on a change in pull request #3215: URL: https://github.com/apache/hbase/pull/3215#discussion_r654194628 ## File path: hbase-server/src/main/java/org/apache/hadoop/hbase/io/hfile/TinyLfuBlockCache.java ## @@ -171,8 +177,10 @@ public Cacheable getBlock(BlockCacheKey cacheKey, if ((value != null) && caching) { if ((value instanceof HFileBlock) && ((HFileBlock) value).isSharedMem()) { value = HFileBlock.deepCloneOnHeap((HFileBlock) value); +cacheBlockUtil(cacheKey, value, true); Review comment: Please refer to the code in LRUBlockCache. You can do the deep clone in asReferencedHeapBlock() only based on isSharedMem, right? The retain() call is needed anyway.
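The conversion the reviewer suggests can be sketched as follows. This is a hypothetical miniature, not the real HBase `HFileBlock`/`LruBlockCache` API: the point is only that the deep-clone decision lives in a single helper, gated on whether the block's memory is shared, and that the reference count is taken either way.

```java
// Toy model of the asReferencedHeapBlock() pattern discussed above.
// All names here are illustrative stand-ins, not HBase classes.
public class BlockDemo {
    static class Block {
        final boolean sharedMem;  // backed by shared (off-heap/pooled) memory?
        final boolean onHeapCopy; // is this a private on-heap copy?
        int refCount = 1;

        Block(boolean sharedMem, boolean onHeapCopy) {
            this.sharedMem = sharedMem;
            this.onHeapCopy = onHeapCopy;
        }

        // A deep clone owns its own memory, so it starts with its own refCount.
        Block deepCloneOnHeap() { return new Block(false, true); }

        Block retain() { refCount++; return this; }
    }

    // Analogue of the suggested helper: clone only shared-memory blocks,
    // otherwise just take a reference on the existing heap block.
    static Block asReferencedHeapBlock(Block b) {
        if (b.sharedMem) {
            return b.deepCloneOnHeap();
        }
        return b.retain();
    }

    public static void main(String[] args) {
        Block shared = new Block(true, false);
        System.out.println("shared block cloned: " + (asReferencedHeapBlock(shared) != shared));
    }
}
```

The benefit the reviewer points at is that callers no longer have to remember to pair the clone with a separate cache/retain call at every call site.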
[GitHub] [hbase] YutSean opened a new pull request #3404: HBASE-26013 Get operations readRows metrics becomes zero after HBASE-25677
YutSean opened a new pull request #3404: URL: https://github.com/apache/hbase/pull/3404 https://issues.apache.org/jira/browse/HBASE-26013
[GitHub] [hbase] Apache-HBase commented on pull request #3399: HBASE-25998: Redo synchronization in SyncFuture
Apache-HBase commented on pull request #3399: URL: https://github.com/apache/hbase/pull/3399#issuecomment-863541666
[GitHub] [hbase] Apache-HBase commented on pull request #3404: HBASE-26013 Get operations readRows metrics becomes zero after HBASE-25677
Apache-HBase commented on pull request #3404: URL: https://github.com/apache/hbase/pull/3404#issuecomment-863935433
[GitHub] [hbase] vli02 commented on a change in pull request #3402: HBASE-25130 - Fix master in-memory server holding map after:
vli02 commented on a change in pull request #3402: URL: https://github.com/apache/hbase/pull/3402#discussion_r654037619 ## File path: hbase-server/src/main/java/org/apache/hadoop/hbase/master/AssignmentManager.java ## @@ -1661,6 +1661,17 @@ public void regionOffline(final HRegionInfo regionInfo) { regionOffline(regionInfo, null); } + /** + * Marks the region as offline. In addition whether removing it from + * replicas and master in-memory server holding map. + * + * @param regionInfo + * @param force Review comment: updated. thanks! ## File path: hbase-server/src/test/java/org/apache/hadoop/hbase/util/TestHBaseFsck.java ## @@ -882,6 +882,80 @@ public void testDupeStartKey() throws Exception { assertNoErrors(hbck2); assertEquals(0, hbck2.getOverlapGroups(table).size()); assertEquals(ROWKEYS.length, countRows()); + + MiniHBaseCluster cluster = TEST_UTIL.getHBaseCluster(); + long totalRegions = cluster.countServedRegions(); + + // stop a region servers and run fsck again + cluster.stopRegionServer(server); + cluster.waitForRegionServerToStop(server, 60); + + // wait for all regions to come online. + while (cluster.countServedRegions() < totalRegions) { +try { + Thread.sleep(100); +} catch (InterruptedException e) {} Review comment: Sleep can be woken up by an interruption before the time is up; that is normal. I want the loop not to break until we have finished waiting for all regions to come up. The Thread.sleep() method is defined this way: https://www.tutorialspoint.com/java/lang/thread_sleep_millis.htm
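The retry semantics being defended above can be sketched as follows. This is a minimal, hypothetical example (not the actual TestHBaseFsck code): an interrupt only wakes `Thread.sleep()` early, so swallowing `InterruptedException` keeps the loop waiting until the condition itself holds, whereas declaring `throws InterruptedException` on the test method would abort the wait on the first interrupt.

```java
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.BooleanSupplier;

public class WaitLoopDemo {
    // Style used in the patch under review: the loop condition, not the
    // interrupt, decides when the wait ends.
    static void waitUntil(BooleanSupplier condition) {
        while (!condition.getAsBoolean()) {
            try {
                Thread.sleep(10); // may return early if the thread is interrupted
            } catch (InterruptedException e) {
                // ignored on purpose: an early wakeup must not end the wait
            }
        }
    }

    public static void main(String[] args) {
        AtomicInteger onlineRegions = new AtomicInteger(0);
        // Simulate regions coming online on another thread.
        new Thread(() -> {
            for (int i = 0; i < 5; i++) {
                onlineRegions.incrementAndGet();
            }
        }).start();
        waitUntil(() -> onlineRegions.get() >= 5);
        System.out.println("regions online: " + onlineRegions.get());
    }
}
```

A common middle ground is to restore the interrupt flag (`Thread.currentThread().interrupt()`) inside the catch so the interruption is not silently lost, while still letting the loop finish.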
[GitHub] [hbase] Apache-HBase commented on pull request #2941: HBASE-21674:Port HBASE-21652 (Refactor ThriftServer making thrift2 server inherited from thrift1 server) to branch-1
Apache-HBase commented on pull request #2941: URL: https://github.com/apache/hbase/pull/2941#issuecomment-864010643 :broken_heart: **-1 overall** | Vote | Subsystem | Runtime | Comment | |::|--:|:|:| | +0 :ok: | reexec | 1m 21s | Docker mode activated. | ||| _ Prechecks _ | | +1 :green_heart: | dupname | 0m 2s | No case conflicting files found. | | +0 :ok: | jshint | 0m 0s | jshint was not available. | | +1 :green_heart: | hbaseanti | 0m 0s | Patch does not have any anti-patterns. | | +1 :green_heart: | @author | 0m 0s | The patch does not contain any @author tags. | | +1 :green_heart: | test4tests | 0m 0s | The patch appears to include 9 new or modified test files. | ||| _ branch-1 Compile Tests _ | | +0 :ok: | mvndep | 2m 25s | Maven dependency ordering for branch | | +1 :green_heart: | mvninstall | 8m 8s | branch-1 passed | | +1 :green_heart: | compile | 0m 49s | branch-1 passed with JDK Azul Systems, Inc.-1.8.0_262-b19 | | +1 :green_heart: | compile | 0m 55s | branch-1 passed with JDK Azul Systems, Inc.-1.7.0_272-b10 | | +1 :green_heart: | checkstyle | 1m 10s | branch-1 passed | | -1 :x: | shadedjars | 0m 18s | branch has 7 errors when building our shaded downstream artifacts. | | +1 :green_heart: | javadoc | 0m 56s | branch-1 passed with JDK Azul Systems, Inc.-1.8.0_262-b19 | | +1 :green_heart: | javadoc | 2m 13s | branch-1 passed with JDK Azul Systems, Inc.-1.7.0_272-b10 | | +0 :ok: | spotbugs | 1m 53s | Used deprecated FindBugs config; considering switching to SpotBugs. 
| | +1 :green_heart: | findbugs | 3m 17s | branch-1 passed | ||| _ Patch Compile Tests _ | | +0 :ok: | mvndep | 0m 18s | Maven dependency ordering for patch | | +1 :green_heart: | mvninstall | 2m 4s | the patch passed | | +1 :green_heart: | compile | 0m 48s | the patch passed with JDK Azul Systems, Inc.-1.8.0_262-b19 | | +1 :green_heart: | javac | 0m 48s | the patch passed | | +1 :green_heart: | compile | 0m 57s | the patch passed with JDK Azul Systems, Inc.-1.7.0_272-b10 | | -1 :x: | javac | 0m 35s | hbase-thrift-jdkAzulSystems,Inc.-1.7.0_272-b10 with JDK Azul Systems, Inc.-1.7.0_272-b10 generated 44 new + 63 unchanged - 41 fixed = 107 total (was 104) | | -1 :x: | checkstyle | 0m 36s | hbase-thrift: The patch generated 2 new + 82 unchanged - 77 fixed = 84 total (was 159) | | +1 :green_heart: | whitespace | 0m 1s | The patch has no whitespace issues. | | -1 :x: | xml | 0m 0s | The patch has 1 ill-formed XML file(s). | | -1 :x: | shadedjars | 0m 12s | patch has 7 errors when building our shaded downstream artifacts. | | +1 :green_heart: | hadoopcheck | 4m 50s | Patch does not cause any errors with Hadoop 2.8.5 2.9.2. | | -1 :x: | javadoc | 0m 32s | hbase-thrift-jdkAzulSystems,Inc.-1.8.0_262-b19 with JDK Azul Systems, Inc.-1.8.0_262-b19 generated 13 new + 0 unchanged - 0 fixed = 13 total (was 0) | | -1 :x: | javadoc | 3m 20s | hbase-thrift-jdkAzulSystems,Inc.-1.7.0_272-b10 with JDK Azul Systems, Inc.-1.7.0_272-b10 generated 13 new + 0 unchanged - 0 fixed = 13 total (was 0) | | +1 :green_heart: | findbugs | 3m 36s | the patch passed | ||| _ Other Tests _ | | +1 :green_heart: | unit | 2m 45s | hbase-common in the patch passed. | | -1 :x: | unit | 0m 37s | hbase-thrift in the patch failed. | | +1 :green_heart: | asflicense | 0m 24s | The patch does not generate ASF License warnings. 
| | | | 48m 49s | | | Reason | Tests | |---:|:--| | XML | Parsing Error(s): | | | dev-support/hbase_eclipse_formatter.xml | | Subsystem | Report/Notes | |--:|:-| | Docker | ClientAPI=1.41 ServerAPI=1.41 base: https://ci-hadoop.apache.org/job/HBase/job/HBase-PreCommit-GitHub-PR/job/PR-2941/15/artifact/out/Dockerfile | | GITHUB PR | https://github.com/apache/hbase/pull/2941 | | JIRA Issue | HBASE-21674 | | Optional Tests | dupname asflicense xml javac javadoc unit spotbugs findbugs shadedjars hadoopcheck hbaseanti checkstyle compile jshint | | uname | Linux e825992bfaaf 4.15.0-136-generic #140-Ubuntu SMP Thu Jan 28 05:20:47 UTC 2021 x86_64 x86_64 x86_64 GNU/Linux | | Build tool | maven | | Personality | /home/jenkins/jenkins-home/workspace/Base-PreCommit-GitHub-PR_PR-2941/out/precommit/personality/provided.sh | | git revision | branch-1 / a40f458 | | Default Java | Azul Systems, Inc.-1.7.0_272-b10 | | Multi-JDK versions | /usr/lib/jvm/zulu-8-amd64:Azul Systems, Inc.-1.8.0_262-b19 /usr/lib/jvm/zulu-7-amd64:Azul Systems, Inc.-1.7.0_272-b10 | | shadedjars |
[GitHub] [hbase] bharathv merged pull request #3399: HBASE-25998: Redo synchronization in SyncFuture
bharathv merged pull request #3399: URL: https://github.com/apache/hbase/pull/3399
[GitHub] [hbase] bharathv opened a new pull request #3400: HBASE-25998: Redo synchronization in SyncFuture
bharathv opened a new pull request #3400: URL: https://github.com/apache/hbase/pull/3400 SyncFuture currently uses a coarse-grained synchronized approach that seems to create a lot of contention. This patch:
- uses a reentrant lock instead of a synchronized monitor
- switches to condition-variable-based waiting rather than a busy wait
- removes synchronization for fields that do not need it

Signed-off-by: Michael Stack Signed-off-by: Andrew Purtell Signed-off-by: Duo Zhang Signed-off-by: Viraj Jasani (cherry picked from commit 6bafb596421974717697b28d0856453245759c15)
[GitHub] [hbase] bharathv opened a new pull request #3399: HBASE-25998: Redo synchronization in SyncFuture
bharathv opened a new pull request #3399: URL: https://github.com/apache/hbase/pull/3399 SyncFuture currently uses a coarse-grained synchronized approach that seems to create a lot of contention. This patch:
- uses a reentrant lock instead of a synchronized monitor
- switches to condition-variable-based waiting rather than a busy wait
- removes synchronization for fields that do not need it

Signed-off-by: Michael Stack Signed-off-by: Andrew Purtell Signed-off-by: Duo Zhang Signed-off-by: Viraj Jasani (cherry picked from commit 6bafb596421974717697b28d0856453245759c15)
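The approach the pull request describes can be sketched as follows. This is a hypothetical miniature, not the actual `SyncFuture`: a waiter parks on a `Condition` guarded by a `ReentrantLock` until a producer publishes the txid, instead of spinning under a `synchronized` monitor.

```java
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;
import java.util.concurrent.locks.Condition;
import java.util.concurrent.locks.ReentrantLock;

public class MiniSyncFuture {
    private final ReentrantLock doneLock = new ReentrantLock();
    private final Condition doneCondition = doneLock.newCondition();
    private long doneTxid = -1; // guarded by doneLock

    // Producer side (e.g. a sync runner): publish the result and wake waiters.
    public void done(long txid) {
        doneLock.lock();
        try {
            doneTxid = txid;
            doneCondition.signalAll();
        } finally {
            doneLock.unlock();
        }
    }

    // Consumer side (e.g. a handler): park -- no busy wait -- until done()
    // fires or the timeout elapses. The loop guards against spurious wakeups.
    public long get(long timeoutMs) throws InterruptedException, TimeoutException {
        doneLock.lock();
        try {
            long nanosLeft = TimeUnit.MILLISECONDS.toNanos(timeoutMs);
            while (doneTxid < 0) {
                if (nanosLeft <= 0) {
                    throw new TimeoutException("no sync result after " + timeoutMs + " ms");
                }
                nanosLeft = doneCondition.awaitNanos(nanosLeft);
            }
            return doneTxid;
        } finally {
            doneLock.unlock();
        }
    }

    public static void main(String[] args) throws Exception {
        MiniSyncFuture future = new MiniSyncFuture();
        new Thread(() -> future.done(42)).start();
        System.out.println("txid=" + future.get(5000));
    }
}
```

Compared with a busy wait, the waiter consumes no CPU while parked, and `signalAll()` wakes it promptly, which is consistent with the throughput gains reported on the JIRA.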
[GitHub] [hbase] bharathv merged pull request #3392: HBASE-25984: Avoid premature reuse of sync futures in FSHLog (#3371)
bharathv merged pull request #3392: URL: https://github.com/apache/hbase/pull/3392
[GitHub] [hbase] Apache-HBase commented on pull request #3393: HBASE-25984: Avoid premature reuse of sync futures in FSHLog (#3371)
Apache-HBase commented on pull request #3393: URL: https://github.com/apache/hbase/pull/3393#issuecomment-863506798
[jira] [Commented] (HBASE-26001) When turn on access control, the cell level TTL of Increment and Append operations is invalid.
[ https://issues.apache.org/jira/browse/HBASE-26001?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17365693#comment-17365693 ] Hudson commented on HBASE-26001: Results for branch master [build #326 on builds.a.o|https://ci-hadoop.apache.org/job/HBase/job/HBase%20Nightly/job/master/326/]: (x) *{color:red}-1 overall{color}* details (if available): (/) {color:green}+1 general checks{color} -- For more information [see general report|https://ci-hadoop.apache.org/job/HBase/job/HBase%20Nightly/job/master/326/General_20Nightly_20Build_20Report/] (x) {color:red}-1 jdk8 hadoop3 checks{color} -- For more information [see jdk8 (hadoop3) report|https://ci-hadoop.apache.org/job/HBase/job/HBase%20Nightly/job/master/326/JDK8_20Nightly_20Build_20Report_20_28Hadoop3_29/] (x) {color:red}-1 jdk11 hadoop3 checks{color} -- For more information [see jdk11 report|https://ci-hadoop.apache.org/job/HBase/job/HBase%20Nightly/job/master/326/JDK11_20Nightly_20Build_20Report_20_28Hadoop3_29/] (/) {color:green}+1 source release artifact{color} -- See build output for details. (/) {color:green}+1 client integration test{color} > When turn on access control, the cell level TTL of Increment and Append > operations is invalid. > -- > > Key: HBASE-26001 > URL: https://issues.apache.org/jira/browse/HBASE-26001 > Project: HBase > Issue Type: Bug > Components: Coprocessors >Reporter: Yutong Xiao >Assignee: Yutong Xiao >Priority: Major > > The AccessController postIncrementBeforeWAL() and postAppendBeforeWAL() methods > rewrite the new cell's tags from the old cell's tags. This makes the other kinds > of tags in the new cell (such as the TTL tag) invisible afterwards. Since in > Increment and Append operations the new cell has already carried forward all > tags of the old cell plus the TTL tag from the mutation, AccessController does > not need to rewrite the tags again. As a result of the rewrite, the TTL tag of > the newCell becomes invisible in the newly created cell. 
Actually, in > Increment and Append operations, the newCell has already copied all tags of > the oldCell. So the oldCell is useless here. > {code:java} > private Cell createNewCellWithTags(Mutation mutation, Cell oldCell, Cell > newCell) { > // Collect any ACLs from the old cell > List<Tag> tags = Lists.newArrayList(); > List<Tag> aclTags = Lists.newArrayList(); > ListMultimap<String, Permission> perms = ArrayListMultimap.create(); > if (oldCell != null) { > Iterator<Tag> tagIterator = PrivateCellUtil.tagsIterator(oldCell); > while (tagIterator.hasNext()) { > Tag tag = tagIterator.next(); > if (tag.getType() != PermissionStorage.ACL_TAG_TYPE) { > // Not an ACL tag, just carry it through > if (LOG.isTraceEnabled()) { > LOG.trace("Carrying forward tag from " + oldCell + ": type " + > tag.getType() > + " length " + tag.getValueLength()); > } > tags.add(tag); > } else { > aclTags.add(tag); > } > } > } > // Do we have an ACL on the operation? > byte[] aclBytes = mutation.getACL(); > if (aclBytes != null) { > // Yes, use it > tags.add(new ArrayBackedTag(PermissionStorage.ACL_TAG_TYPE, aclBytes)); > } else { > // No, use what we carried forward > if (perms != null) { > // TODO: If we collected ACLs from more than one tag we may have a > // List of size > 1, this can be collapsed into a single > // Permission > if (LOG.isTraceEnabled()) { > LOG.trace("Carrying forward ACLs from " + oldCell + ": " + perms); > } > tags.addAll(aclTags); > } > } > // If we have no tags to add, just return > if (tags.isEmpty()) { > return newCell; > } > // Here the new cell's tags will be invisible. > return PrivateCellUtil.createCell(newCell, tags); > } > {code} -- This message was sent by Atlassian Jira (v8.3.4#803005)
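The tag-handling bug described above can be illustrated with a toy model. Plain bytes stand in for HBase `Tag` objects and none of these names are the real API: the point is that rebuilding the result cell's tags from the *old* cell drops the TTL tag that only the *new* cell carries, while starting from the new cell's tags, as the issue suggests, preserves it.

```java
import java.util.ArrayList;
import java.util.List;

public class TagDemo {
    static final byte ACL_TAG_TYPE = 1; // illustrative values, not HBase's
    static final byte TTL_TAG_TYPE = 2;

    // Carry through everything that is not an ACL tag.
    static List<Byte> nonAclTags(List<Byte> tags) {
        List<Byte> kept = new ArrayList<>();
        for (byte t : tags) {
            if (t != ACL_TAG_TYPE) {
                kept.add(t);
            }
        }
        return kept;
    }

    // Buggy shape: result tags are rebuilt from the old cell only, so the
    // TTL tag the new cell gained from the mutation vanishes.
    static List<Byte> buggyRewrite(List<Byte> oldCellTags, List<Byte> newCellTags) {
        return nonAclTags(oldCellTags);
    }

    // Fixed shape per the issue: the new cell already carried everything
    // forward, so start from its tags instead.
    static List<Byte> fixedRewrite(List<Byte> oldCellTags, List<Byte> newCellTags) {
        return nonAclTags(newCellTags);
    }

    public static void main(String[] args) {
        List<Byte> oldTags = List.of(ACL_TAG_TYPE);               // old cell: ACL only
        List<Byte> newTags = List.of(ACL_TAG_TYPE, TTL_TAG_TYPE); // new cell: ACL + TTL
        System.out.println("buggy keeps TTL: " + buggyRewrite(oldTags, newTags).contains(TTL_TAG_TYPE));
        System.out.println("fixed keeps TTL: " + fixedRewrite(oldTags, newTags).contains(TTL_TAG_TYPE));
    }
}
```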
[jira] [Commented] (HBASE-25998) Revisit synchronization in SyncFuture
[ https://issues.apache.org/jira/browse/HBASE-25998?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17365692#comment-17365692 ] Hudson commented on HBASE-25998: Results for branch master [build #326 on builds.a.o|https://ci-hadoop.apache.org/job/HBase/job/HBase%20Nightly/job/master/326/]: (x) *{color:red}-1 overall{color}* details (if available): (/) {color:green}+1 general checks{color} -- For more information [see general report|https://ci-hadoop.apache.org/job/HBase/job/HBase%20Nightly/job/master/326/General_20Nightly_20Build_20Report/] (x) {color:red}-1 jdk8 hadoop3 checks{color} -- For more information [see jdk8 (hadoop3) report|https://ci-hadoop.apache.org/job/HBase/job/HBase%20Nightly/job/master/326/JDK8_20Nightly_20Build_20Report_20_28Hadoop3_29/] (x) {color:red}-1 jdk11 hadoop3 checks{color} -- For more information [see jdk11 report|https://ci-hadoop.apache.org/job/HBase/job/HBase%20Nightly/job/master/326/JDK11_20Nightly_20Build_20Report_20_28Hadoop3_29/] (/) {color:green}+1 source release artifact{color} -- See build output for details. (/) {color:green}+1 client integration test{color} > Revisit synchronization in SyncFuture > - > > Key: HBASE-25998 > URL: https://issues.apache.org/jira/browse/HBASE-25998 > Project: HBase > Issue Type: Improvement > Components: Performance, regionserver, wal >Affects Versions: 3.0.0-alpha-1, 1.7.0, 2.5.0 >Reporter: Bharath Vissapragada >Assignee: Bharath Vissapragada >Priority: Major > Attachments: monitor-overhead-1.png, monitor-overhead-2.png > > > While working on HBASE-25984, I noticed some weird frames in the flame graphs > around monitor entry exit consuming a lot of CPU cycles (see attached > images). Noticed that the synchronization there is too coarse grained and > sometimes unnecessary. I did a simple patch that switched to a reentrant lock > based synchronization with condition variable rather than a busy wait and > that showed 70-80% increased throughput in WAL PE. 
Seems too good to be > true... (more details in the comments).
[jira] [Commented] (HBASE-25976) Implement a master based ReplicationTracker
[ https://issues.apache.org/jira/browse/HBASE-25976?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17365691#comment-17365691 ] Hudson commented on HBASE-25976: Results for branch master [build #326 on builds.a.o|https://ci-hadoop.apache.org/job/HBase/job/HBase%20Nightly/job/master/326/]: (x) *{color:red}-1 overall{color}* details (if available): (/) {color:green}+1 general checks{color} -- For more information [see general report|https://ci-hadoop.apache.org/job/HBase/job/HBase%20Nightly/job/master/326/General_20Nightly_20Build_20Report/] (x) {color:red}-1 jdk8 hadoop3 checks{color} -- For more information [see jdk8 (hadoop3) report|https://ci-hadoop.apache.org/job/HBase/job/HBase%20Nightly/job/master/326/JDK8_20Nightly_20Build_20Report_20_28Hadoop3_29/] (x) {color:red}-1 jdk11 hadoop3 checks{color} -- For more information [see jdk11 report|https://ci-hadoop.apache.org/job/HBase/job/HBase%20Nightly/job/master/326/JDK11_20Nightly_20Build_20Report_20_28Hadoop3_29/] (/) {color:green}+1 source release artifact{color} -- See build output for details. (/) {color:green}+1 client integration test{color} > Implement a master based ReplicationTracker > --- > > Key: HBASE-25976 > URL: https://issues.apache.org/jira/browse/HBASE-25976 > Project: HBase > Issue Type: Sub-task > Components: Replication >Reporter: Duo Zhang >Assignee: Duo Zhang >Priority: Major > Fix For: 3.0.0-alpha-1, 2.5.0 > > > Now the only thing we care about is the live region servers and we can get > this information from master, so let's do it to remove the dependencies on > zookeeper.
[jira] [Commented] (HBASE-25984) FSHLog WAL lockup with sync future reuse [RS deadlock]
[ https://issues.apache.org/jira/browse/HBASE-25984?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17365648#comment-17365648 ] Hudson commented on HBASE-25984: Results for branch branch-2 [build #279 on builds.a.o|https://ci-hadoop.apache.org/job/HBase/job/HBase%20Nightly/job/branch-2/279/]: (x) *{color:red}-1 overall{color}* details (if available): (/) {color:green}+1 general checks{color} -- For more information [see general report|https://ci-hadoop.apache.org/job/HBase/job/HBase%20Nightly/job/branch-2/279/General_20Nightly_20Build_20Report/] (/) {color:green}+1 jdk8 hadoop2 checks{color} -- For more information [see jdk8 (hadoop2) report|https://ci-hadoop.apache.org/job/HBase/job/HBase%20Nightly/job/branch-2/279/JDK8_20Nightly_20Build_20Report_20_28Hadoop2_29/] (/) {color:green}+1 jdk8 hadoop3 checks{color} -- For more information [see jdk8 (hadoop3) report|https://ci-hadoop.apache.org/job/HBase/job/HBase%20Nightly/job/branch-2/279/JDK8_20Nightly_20Build_20Report_20_28Hadoop3_29/] (/) {color:green}+1 jdk11 hadoop3 checks{color} -- For more information [see jdk11 report|https://ci-hadoop.apache.org/job/HBase/job/HBase%20Nightly/job/branch-2/279/JDK11_20Nightly_20Build_20Report_20_28Hadoop3_29/] (/) {color:green}+1 source release artifact{color} -- See build output for details. (x) {color:red}-1 client integration test{color} -- Something went wrong with this stage, [check relevant console output|https://ci-hadoop.apache.org/job/HBase/job/HBase%20Nightly/job/branch-2/279//console]. 
> FSHLog WAL lockup with sync future reuse [RS deadlock] > -- > > Key: HBASE-25984 > URL: https://issues.apache.org/jira/browse/HBASE-25984 > Project: HBase > Issue Type: Bug > Components: regionserver, wal >Affects Versions: 3.0.0-alpha-1, 1.7.0, 2.5.0, 2.4.5 >Reporter: Bharath Vissapragada >Assignee: Bharath Vissapragada >Priority: Critical > Labels: deadlock, hang > Attachments: HBASE-25984-unit-test.patch > > > We use FSHLog as the WAL implementation (branch-1 based) and under heavy load > we noticed the WAL system gets locked up due to a subtle bug involving racy > code with sync future reuse. This bug applies to all FSHLog implementations > across branches. > Symptoms: > On heavily loaded clusters with large write load we noticed that the region > servers are hanging abruptly with filled up handler queues and stuck MVCC > indicating appends/syncs not making any progress. > {noformat} > WARN [8,queue=9,port=60020] regionserver.MultiVersionConcurrencyControl - > STUCK for : 296000 millis. > MultiVersionConcurrencyControl{readPoint=172383686, writePoint=172383690, > regionName=1ce4003ab60120057734ffe367667dca} > WARN [6,queue=2,port=60020] regionserver.MultiVersionConcurrencyControl - > STUCK for : 296000 millis. > MultiVersionConcurrencyControl{readPoint=171504376, writePoint=171504381, > regionName=7c441d7243f9f504194dae6bf2622631} > {noformat} > All the handlers are stuck waiting for the sync futures and timing out. > {noformat} > java.lang.Object.wait(Native Method) > > org.apache.hadoop.hbase.regionserver.wal.SyncFuture.get(SyncFuture.java:183) > > org.apache.hadoop.hbase.regionserver.wal.FSHLog.blockOnSync(FSHLog.java:1509) > . 
> {noformat} > Log rolling is stuck because it was unable to attain a safe point > {noformat} >java.util.concurrent.CountDownLatch.await(CountDownLatch.java:277) > org.apache.hadoop.hbase.regionserver.wal.FSHLog$SafePointZigZagLatch.waitSafePoint(FSHLog.java:1799) > > org.apache.hadoop.hbase.regionserver.wal.FSHLog.replaceWriter(FSHLog.java:900) > {noformat} > and the Ring buffer consumer thinks that there are some outstanding syncs > that need to finish.. > {noformat} > > org.apache.hadoop.hbase.regionserver.wal.FSHLog$RingBufferEventHandler.attainSafePoint(FSHLog.java:2031) > > org.apache.hadoop.hbase.regionserver.wal.FSHLog$RingBufferEventHandler.onEvent(FSHLog.java:1999) > > org.apache.hadoop.hbase.regionserver.wal.FSHLog$RingBufferEventHandler.onEvent(FSHLog.java:1857) > {noformat} > On the other hand, SyncRunner threads are idle and just waiting for work, > implying that there are no pending SyncFutures that need to be run > {noformat} >sun.misc.Unsafe.park(Native Method) > java.util.concurrent.locks.LockSupport.park(LockSupport.java:175) > > java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.await(AbstractQueuedSynchronizer.java:2039) > > java.util.concurrent.LinkedBlockingQueue.take(LinkedBlockingQueue.java:442) > > org.apache.hadoop.hbase.regionserver.wal.FSHLog$SyncRunner.run(FSHLog.java:1297) > java.lang.Thread.run(Thread.java:748) > {noformat} > Overall the WAL system is deadlocked and could make no progress
[jira] [Commented] (HBASE-25976) Implement a master based ReplicationTracker
[ https://issues.apache.org/jira/browse/HBASE-25976?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17365647#comment-17365647 ] Hudson commented on HBASE-25976: Results for branch branch-2 [build #279 on builds.a.o|https://ci-hadoop.apache.org/job/HBase/job/HBase%20Nightly/job/branch-2/279/]: (x) *{color:red}-1 overall{color}* details (if available): (/) {color:green}+1 general checks{color} -- For more information [see general report|https://ci-hadoop.apache.org/job/HBase/job/HBase%20Nightly/job/branch-2/279/General_20Nightly_20Build_20Report/] (/) {color:green}+1 jdk8 hadoop2 checks{color} -- For more information [see jdk8 (hadoop2) report|https://ci-hadoop.apache.org/job/HBase/job/HBase%20Nightly/job/branch-2/279/JDK8_20Nightly_20Build_20Report_20_28Hadoop2_29/] (/) {color:green}+1 jdk8 hadoop3 checks{color} -- For more information [see jdk8 (hadoop3) report|https://ci-hadoop.apache.org/job/HBase/job/HBase%20Nightly/job/branch-2/279/JDK8_20Nightly_20Build_20Report_20_28Hadoop3_29/] (/) {color:green}+1 jdk11 hadoop3 checks{color} -- For more information [see jdk11 report|https://ci-hadoop.apache.org/job/HBase/job/HBase%20Nightly/job/branch-2/279/JDK11_20Nightly_20Build_20Report_20_28Hadoop3_29/] (/) {color:green}+1 source release artifact{color} -- See build output for details. (x) {color:red}-1 client integration test{color} -- Something went wrong with this stage, [check relevant console output|https://ci-hadoop.apache.org/job/HBase/job/HBase%20Nightly/job/branch-2/279//console]. > Implement a master based ReplicationTracker > --- > > Key: HBASE-25976 > URL: https://issues.apache.org/jira/browse/HBASE-25976 > Project: HBase > Issue Type: Sub-task > Components: Replication >Reporter: Duo Zhang >Assignee: Duo Zhang >Priority: Major > Fix For: 3.0.0-alpha-1, 2.5.0 > > > Now the only thing we care about is the live region servers and we can get > this information from master, so let's do it to remove the dependencies on > zookeeper. 
[jira] [Comment Edited] (HBASE-20503) [AsyncFSWAL] Failed to get sync result after 300000 ms for txid=160912, WAL system stuck?
[ https://issues.apache.org/jira/browse/HBASE-20503?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17365637#comment-17365637 ] Emil Kleszcz edited comment on HBASE-20503 at 6/18/21, 6:05 PM: Hi, we experienced the same issue in HBase 2.3.4 on one of our production clusters this week. This happened a few weeks after upgrading HBase from 2.2.4, where we never observed this problem. We run on HDP 3.2.1. On average we have around 800 regions per RS and the workload was as usual these days. This problem started on one of the RSs where the meta region resides. We could observe the following in the RS log: {code:java} <2021-06-15T10:31:28.284+0200> : : java.io.IOException: stream already broken at org.apache.hadoop.hbase.io.asyncfs.FanOutOneBlockAsyncDFSOutput.flush0(FanOutOneBlockAsyncDFSOutput.java:420) at org.apache.hadoop.hbase.io.asyncfs.FanOutOneBlockAsyncDFSOutput.flush(FanOutOneBlockAsyncDFSOutput.java:509) (...) <2021-06-15T11:11:39.744+0200> : java.io.FileNotFoundException: File does not exist: /hbase/WALs/ (...) <2021-06-15T11:15:59.241+0200> : : java.io.IOException: stream already broken (...) <2021-06-15T11:39:39.986+0200> : org.apache.hadoop.hbase.exceptions.TimeoutIOException: Failed to get sync result after 30 ms for txid=54056852, WAL system stuck? (...) <2021-06-16T18:43:05.220+0200> : {code} Since then compaction started failing on many regions including meta. 2 days later we could see one RS going down... This triggered an avalanche of stuck procedures in HMaster {code:java} <2021-06-17T08:53:13.862+0200> : <2021-06-17T08:53:13.862+0200> : <2021-06-17T08:53:13.866+0200> : <2021-06-17T08:53:13.867+0200> : <2021-06-17T08:53:13.867+0200> : <2021-06-17T08:53:28.443+0200> : {code} In HA the HMasters started flipping over and we could observe more and more RITs with OPENING and CLOSING states pointing to stale RSs (old timestamps or null). Only the manual fix (forcing states for tables/regions) helped to recover the cluster. 
I hope you apply the working patch soon. was (Author: tr0k): Hi, we experienced the same issue in HBase 2.3.1 on one of our production clusters this week. This happened a few weeks after upgrading HBase from 2.2.4 where we never observed this problem. We run on the HDP 3.1.2. On average we have around 800 regions per RS and the workload was, as usual, these days. This problem started on one of the RSs where meta region residing. We could observe the following in the RS log: {code:java} <2021-06-15T10:31:28.284+0200> : : java.io.IOException: stream already broken at org.apache.hadoop.hbase.io.asyncfs.FanOutOneBlockAsyncDFSOutput.flush0(FanOutOneBlockAsyncDFSOutput.java:420) at org.apache.hadoop.hbase.io.asyncfs.FanOutOneBlockAsyncDFSOutput.flush(FanOutOneBlockAsyncDFSOutput.java:509) (...) <2021-06-15T11:11:39.744+0200> : java.io.FileNotFoundException: File does not exist: /hbase/WALs/ (...) <2021-06-15T11:15:59.241+0200> : : java.io.IOException: stream already broken (...) <2021-06-15T11:39:39.986+0200> : org.apache.hadoop.hbase.exceptions.TimeoutIOException: Failed to get sync result after 30 ms for txid=54056852, WAL system stuck? (...) <2021-06-16T18:43:05.220+0200> : {code} Since then compaction started failing on many regions including meta. 2 days later we could see one RS going down... This triggered an avalanche of stuck procedures in HMaster {code:java} <2021-06-17T08:53:13.862+0200> : <2021-06-17T08:53:13.862+0200> : <2021-06-17T08:53:13.866+0200> : <2021-06-17T08:53:13.867+0200> : <2021-06-17T08:53:13.867+0200> : <2021-06-17T08:53:28.443+0200> : {code} In HA the Hmasters started flipping over and we could observe more and more RITs with OPENING and CLOSING states pointing to stale RSs (old timestamps or null). Only the manual fix (forcing states for tables/regions) helped to recover the cluster. I hope you apply the working patch soon. > [AsyncFSWAL] Failed to get sync result after 30 ms for txid=160912, WAL > system stuck? 
> - > > Key: HBASE-20503 > URL: https://issues.apache.org/jira/browse/HBASE-20503 > Project: HBase > Issue Type: Bug > Components: wal >Reporter: Michael Stack >Priority: Major > Attachments: > 0001-HBASE-20503-AsyncFSWAL-Failed-to-get-sync-result-aft.patch, > 0001-HBASE-20503-AsyncFSWAL-Failed-to-get-sync-result-aft.patch > > > Scale test. Startup w/ 30k regions over ~250nodes. This RS is trying to > furiously open regions assigned by Master. It is importantly carrying > hbase:meta. Twenty minutes in, meta goes dead after an exception up out > AsyncFSWAL. Process had been restarted so I couldn't get a thread dump. >
[jira] [Comment Edited] (HBASE-20503) [AsyncFSWAL] Failed to get sync result after 300000 ms for txid=160912, WAL system stuck?
[ https://issues.apache.org/jira/browse/HBASE-20503?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17365637#comment-17365637 ] Emil Kleszcz edited comment on HBASE-20503 at 6/18/21, 6:04 PM: Hi, we experienced the same issue in HBase 2.3.1 on one of our production clusters this week. It happened a few weeks after upgrading HBase from 2.2.4, where we never observed this problem. We run on HDP 3.1.2. On average we have around 800 regions per RS, and the workload was as usual these days. The problem started on the RS where the meta region was residing. We could observe the following in the RS log: {code:java} <2021-06-15T10:31:28.284+0200> : : java.io.IOException: stream already broken at org.apache.hadoop.hbase.io.asyncfs.FanOutOneBlockAsyncDFSOutput.flush0(FanOutOneBlockAsyncDFSOutput.java:420) at org.apache.hadoop.hbase.io.asyncfs.FanOutOneBlockAsyncDFSOutput.flush(FanOutOneBlockAsyncDFSOutput.java:509) (...) <2021-06-15T11:11:39.744+0200> : java.io.FileNotFoundException: File does not exist: /hbase/WALs/ (...) <2021-06-15T11:15:59.241+0200> : : java.io.IOException: stream already broken (...) <2021-06-15T11:39:39.986+0200> : org.apache.hadoop.hbase.exceptions.TimeoutIOException: Failed to get sync result after 300000 ms for txid=54056852, WAL system stuck? (...) <2021-06-16T18:43:05.220+0200> : {code} Since then, compaction started failing on many regions, including meta. Two days later we saw one RS going down... This triggered an avalanche of stuck procedures in HMaster: {code:java} <2021-06-17T08:53:13.862+0200> : <2021-06-17T08:53:13.862+0200> : <2021-06-17T08:53:13.866+0200> : <2021-06-17T08:53:13.867+0200> : <2021-06-17T08:53:13.867+0200> : <2021-06-17T08:53:28.443+0200> : {code} In HA, the HMasters started flipping over, and we could observe more and more RITs with OPENING and CLOSING states pointing to stale RSs (old timestamps or null). Only a manual fix (forcing states for tables/regions) helped to recover the cluster. 
I hope you apply the working patch soon. > [AsyncFSWAL] Failed to get sync result after 300000 ms for txid=160912, WAL > system stuck? 
> - > > Key: HBASE-20503 > URL: https://issues.apache.org/jira/browse/HBASE-20503 > Project: HBase > Issue Type: Bug > Components: wal >Reporter: Michael Stack >Priority: Major > Attachments: > 0001-HBASE-20503-AsyncFSWAL-Failed-to-get-sync-result-aft.patch, > 0001-HBASE-20503-AsyncFSWAL-Failed-to-get-sync-result-aft.patch > > > Scale test. Startup w/ 30k regions over ~250nodes. This RS is trying to > furiously open regions assigned by Master. It is importantly carrying > hbase:meta. Twenty minutes in, meta goes dead after an exception up out > AsyncFSWAL. Process had been restarted so I couldn't get a thread dump. >
[jira] [Commented] (HBASE-20503) [AsyncFSWAL] Failed to get sync result after 300000 ms for txid=160912, WAL system stuck?
[ https://issues.apache.org/jira/browse/HBASE-20503?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17365637#comment-17365637 ] Emil Kleszcz commented on HBASE-20503: -- Hi, we experienced the same issue in HBase 2.3.1 on one of our production clusters this week. It happened a few weeks after upgrading HBase from 2.2.4, where we never observed this problem. We run on HDP 3.1.2. On average we have around 800 regions per RS, and the workload was as usual these days. The problem started on the RS where the meta region was residing. We could observe the following in the RS log: {code:java} <2021-06-15T10:31:28.284+0200> : : java.io.IOException: stream already broken at org.apache.hadoop.hbase.io.asyncfs.FanOutOneBlockAsyncDFSOutput.flush0(FanOutOneBlockAsyncDFSOutput.java:420) at org.apache.hadoop.hbase.io.asyncfs.FanOutOneBlockAsyncDFSOutput.flush(FanOutOneBlockAsyncDFSOutput.java:509) (...) <2021-06-15T11:11:39.744+0200> : java.io.FileNotFoundException: File does not exist: /hbase/WALs/ (...) <2021-06-15T11:15:59.241+0200> : : java.io.IOException: stream already broken (...) <2021-06-15T11:39:39.986+0200> : (...) <2021-06-16T18:43:05.220+0200> : {code} Since then, compaction started failing on many regions, including meta. Two days later we saw one RS going down... This triggered an avalanche of stuck procedures in HMaster: {code:java} <2021-06-17T08:53:13.862+0200> : <2021-06-17T08:53:13.862+0200> : <2021-06-17T08:53:13.866+0200> : <2021-06-17T08:53:13.867+0200> : <2021-06-17T08:53:13.867+0200> : <2021-06-17T08:53:28.443+0200> : {code} In HA, the HMasters started flipping over, and we could observe more and more RITs with OPENING and CLOSING states pointing to stale RSs (old timestamps or null). Only a manual fix (forcing states for tables/regions) helped to recover the cluster. I hope you apply the working patch soon. > [AsyncFSWAL] Failed to get sync result after 300000 ms for txid=160912, WAL > system stuck? 
> - > > Key: HBASE-20503 > URL: https://issues.apache.org/jira/browse/HBASE-20503 > Project: HBase > Issue Type: Bug > Components: wal >Reporter: Michael Stack >Priority: Major > Attachments: > 0001-HBASE-20503-AsyncFSWAL-Failed-to-get-sync-result-aft.patch, > 0001-HBASE-20503-AsyncFSWAL-Failed-to-get-sync-result-aft.patch > > > Scale test. Startup w/ 30k regions over ~250nodes. This RS is trying to > furiously open regions assigned by Master. It is importantly carrying > hbase:meta. Twenty minutes in, meta goes dead after an exception up out > AsyncFSWAL. Process had been restarted so I couldn't get a thread dump. > Suspicious is we archive a WAL and we get a FNFE because we got to access WAL > in old location. [~Apache9] mind taking a look? Does this FNFE rolling kill > the WAL sub-system? Thanks. > DFS complaining on file open for a few files getting blocks from remote dead > DNs: e.g. {{2018-04-25 10:05:21,506 WARN > org.apache.hadoop.hdfs.client.impl.BlockReaderFactory: I/O error constructing > remote block reader. > java.net.ConnectException: Connection refused}} > AsyncFSWAL complaining: "AbstractFSWAL: Slow sync cost: 103 ms" . 
> About ten minutes in, we get this: > {code} > 2018-04-25 10:15:16,532 WARN > org.apache.hadoop.hbase.regionserver.wal.AsyncFSWAL: sync failed > java.io.IOException: stream already broken > at > org.apache.hadoop.hbase.io.asyncfs.FanOutOneBlockAsyncDFSOutput.flush0(FanOutOneBlockAsyncDFSOutput.java:424) > at > org.apache.hadoop.hbase.io.asyncfs.FanOutOneBlockAsyncDFSOutput.flush(FanOutOneBlockAsyncDFSOutput.java:513) > > > > at > org.apache.hadoop.hbase.regionserver.wal.AsyncProtobufLogWriter.sync(AsyncProtobufLogWriter.java:134) > at > org.apache.hadoop.hbase.regionserver.wal.AsyncFSWAL.sync(AsyncFSWAL.java:364) > at > org.apache.hadoop.hbase.regionserver.wal.AsyncFSWAL.consume(AsyncFSWAL.java:547) > at > java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) > at > java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) > at java.lang.Thread.run(Thread.java:748) > 2018-04-25 10:15:16,680 INFO > org.apache.hadoop.hbase.regionserver.wal.AbstractFSWAL: Rolled WAL > /hbase/WALs/vc0205.halxg.cloudera.com,22101,1524675808073/vc0205.halxg.cloudera.com%2C22101%2C1524675808073.meta.1524676253923.meta > with entries=10819, filesize=7.57 MB; new WAL >
[jira] [Commented] (HBASE-25984) FSHLog WAL lockup with sync future reuse [RS deadlock]
[ https://issues.apache.org/jira/browse/HBASE-25984?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17365629#comment-17365629 ] Hudson commented on HBASE-25984: Results for branch branch-2.3 [build #239 on builds.a.o|https://ci-hadoop.apache.org/job/HBase/job/HBase%20Nightly/job/branch-2.3/239/]: (x) *{color:red}-1 overall{color}* details (if available): (/) {color:green}+1 general checks{color} -- For more information [see general report|https://ci-hadoop.apache.org/job/HBase/job/HBase%20Nightly/job/branch-2.3/239/General_20Nightly_20Build_20Report/] (/) {color:green}+1 jdk8 hadoop2 checks{color} -- For more information [see jdk8 (hadoop2) report|https://ci-hadoop.apache.org/job/HBase/job/HBase%20Nightly/job/branch-2.3/239/JDK8_20Nightly_20Build_20Report_20_28Hadoop2_29/] (/) {color:green}+1 jdk8 hadoop3 checks{color} -- For more information [see jdk8 (hadoop3) report|https://ci-hadoop.apache.org/job/HBase/job/HBase%20Nightly/job/branch-2.3/239/JDK8_20Nightly_20Build_20Report_20_28Hadoop3_29/] (x) {color:red}-1 jdk11 hadoop3 checks{color} -- For more information [see jdk11 report|https://ci-hadoop.apache.org/job/HBase/job/HBase%20Nightly/job/branch-2.3/239/JDK11_20Nightly_20Build_20Report_20_28Hadoop3_29/] (/) {color:green}+1 source release artifact{color} -- See build output for details. (/) {color:green}+1 client integration test{color} > FSHLog WAL lockup with sync future reuse [RS deadlock] > -- > > Key: HBASE-25984 > URL: https://issues.apache.org/jira/browse/HBASE-25984 > Project: HBase > Issue Type: Bug > Components: regionserver, wal >Affects Versions: 3.0.0-alpha-1, 1.7.0, 2.5.0, 2.4.5 >Reporter: Bharath Vissapragada >Assignee: Bharath Vissapragada >Priority: Critical > Labels: deadlock, hang > Attachments: HBASE-25984-unit-test.patch > > > We use FSHLog as the WAL implementation (branch-1 based) and under heavy load > we noticed the WAL system gets locked up due to a subtle bug involving racy > code with sync future reuse. 
This bug applies to all FSHLog implementations > across branches. > Symptoms: > On heavily loaded clusters with large write load we noticed that the region > servers are hanging abruptly with filled up handler queues and stuck MVCC > indicating appends/syncs not making any progress. > {noformat} > WARN [8,queue=9,port=60020] regionserver.MultiVersionConcurrencyControl - > STUCK for : 296000 millis. > MultiVersionConcurrencyControl{readPoint=172383686, writePoint=172383690, > regionName=1ce4003ab60120057734ffe367667dca} > WARN [6,queue=2,port=60020] regionserver.MultiVersionConcurrencyControl - > STUCK for : 296000 millis. > MultiVersionConcurrencyControl{readPoint=171504376, writePoint=171504381, > regionName=7c441d7243f9f504194dae6bf2622631} > {noformat} > All the handlers are stuck waiting for the sync futures and timing out. > {noformat} > java.lang.Object.wait(Native Method) > > org.apache.hadoop.hbase.regionserver.wal.SyncFuture.get(SyncFuture.java:183) > > org.apache.hadoop.hbase.regionserver.wal.FSHLog.blockOnSync(FSHLog.java:1509) > . > {noformat} > Log rolling is stuck because it was unable to attain a safe point > {noformat} >java.util.concurrent.CountDownLatch.await(CountDownLatch.java:277) > org.apache.hadoop.hbase.regionserver.wal.FSHLog$SafePointZigZagLatch.waitSafePoint(FSHLog.java:1799) > > org.apache.hadoop.hbase.regionserver.wal.FSHLog.replaceWriter(FSHLog.java:900) > {noformat} > and the Ring buffer consumer thinks that there are some outstanding syncs > that need to finish.. 
> {noformat} > > org.apache.hadoop.hbase.regionserver.wal.FSHLog$RingBufferEventHandler.attainSafePoint(FSHLog.java:2031) > > org.apache.hadoop.hbase.regionserver.wal.FSHLog$RingBufferEventHandler.onEvent(FSHLog.java:1999) > > org.apache.hadoop.hbase.regionserver.wal.FSHLog$RingBufferEventHandler.onEvent(FSHLog.java:1857) > {noformat} > On the other hand, SyncRunner threads are idle and just waiting for work, > implying that there are no pending SyncFutures that need to be run > {noformat} >sun.misc.Unsafe.park(Native Method) > java.util.concurrent.locks.LockSupport.park(LockSupport.java:175) > > java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.await(AbstractQueuedSynchronizer.java:2039) > > java.util.concurrent.LinkedBlockingQueue.take(LinkedBlockingQueue.java:442) > > org.apache.hadoop.hbase.regionserver.wal.FSHLog$SyncRunner.run(FSHLog.java:1297) > java.lang.Thread.run(Thread.java:748) > {noformat} > Overall the WAL system is deadlocked and could make no progress until it was > aborted. I got to the bottom of this issue and have a patch that can fix it > (more details in the comments due to word limit in the
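The lockup described above — handlers blocked on sync futures that the ring buffer consumer still counts as outstanding while the SyncRunner threads sit idle — comes down to a pooled future object being reused while it still carries state from the previous round. The sketch below is not HBase's actual SyncFuture; it is a minimal illustration of the reset-before-reuse invariant that the racy code violated.

```java
// Minimal sketch (NOT HBase's real SyncFuture) of why a pooled, reusable
// future must be reset before being handed out again: without the reset, a
// new waiter can observe a stale "done" state left over from the previous
// round, while the producer side still believes a sync is outstanding.
class ReusableSyncFuture {
    private static final long NOT_DONE = -1L;
    private long doneTxid = NOT_DONE;

    // Must be called before every reuse; skipping this call is the bug.
    synchronized ReusableSyncFuture reset() {
        doneTxid = NOT_DONE;
        return this;
    }

    // Marks the sync complete for the given txid and wakes all waiters.
    synchronized void done(long txid) {
        doneTxid = txid;
        notifyAll();
    }

    synchronized boolean isDone() {
        return doneTxid != NOT_DONE;
    }
}

class SyncFutureReuseDemo {
    public static void main(String[] args) {
        ReusableSyncFuture f = new ReusableSyncFuture();
        f.done(1L);
        // Buggy reuse: the future still looks done from the previous round,
        // so a handler reusing it would return immediately with a stale txid.
        System.out.println("stale done state before reset: " + f.isDone());
        // Correct reuse: reset clears the stale state first.
        System.out.println("done after reset: " + f.reset().isDone());
    }
}
```

The real FSHLog code additionally coordinates the reset with the ring buffer hand-off; this sketch only shows the stale-state hazard itself.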
[jira] [Commented] (HBASE-11408) "multiple SLF4J bindings" warning messages when running HBase shell
[ https://issues.apache.org/jira/browse/HBASE-11408?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17365607#comment-17365607 ] Stefan Miklosovic commented on HBASE-11408: --- It helps when you set this in hbase-env.sh: {code:java} # Tell HBase whether it should include Hadoop's lib when starting up; # the default value is false, which means Hadoop's lib is included. export HBASE_DISABLE_HADOOP_CLASSPATH_LOOKUP="true" {code} This prevents Hadoop's classpath from being added to HBase's classpath. I am not sure what having Hadoop's classpath is good for ... everything seems to work as before. > "multiple SLF4J bindings" warning messages when running HBase shell > --- > > Key: HBASE-11408 > URL: https://issues.apache.org/jira/browse/HBASE-11408 > Project: HBase > Issue Type: Bug >Affects Versions: 0.98.2, 0.98.3 >Reporter: Duo Xu >Priority: Minor > > When running hbase shell, we saw warnings like this: > SLF4J: Class path contains multiple SLF4J bindings. > SLF4J: Found binding in > [jar:file:/C:/apps/dist/hbase-0.98.0.2.1.3.0-1928-hadoop2/lib/slf4j-log4j12-1.6.4.jar!/org/slf4j/impl/StaticLoggerBinder.class] > SLF4J: Found binding in > [jar:file:/C:/apps/dist/hadoop-2.4.0.2.1.3.0-1928/share/hadoop/common/lib/slf4j-log4j12-1.7.5.jar!/org/slf4j/impl/StaticLoggerBinder.class] > SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an > explanation. -- This message was sent by Atlassian Jira (v8.3.4#803005)
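The warning itself is just SLF4J finding more than one `slf4j-log4j12` binding jar on the classpath, which is why keeping Hadoop's lib directory off HBase's classpath makes it disappear. A small sketch of that detection — the jar paths below are invented for the example, and a Unix-style ':' separator is assumed:

```java
import java.util.Arrays;

// Counts SLF4J log4j12 binding jars on a classpath string. More than one
// binding is what triggers the "multiple SLF4J bindings" warning in the
// shell. Paths here are illustrative, not from a real installation, and a
// Unix-style ':' path separator is assumed.
class Slf4jBindingCounter {
    static long countBindings(String classpath) {
        return Arrays.stream(classpath.split(":"))
                     .filter(entry -> entry.contains("slf4j-log4j12"))
                     .count();
    }

    public static void main(String[] args) {
        String cp = "/opt/hbase/lib/slf4j-log4j12-1.6.4.jar:"
                  + "/opt/hadoop/lib/slf4j-log4j12-1.7.5.jar:"
                  + "/opt/hbase/lib/hbase-client.jar";
        System.out.println(countBindings(cp));  // prints 2
    }
}
```

With `HBASE_DISABLE_HADOOP_CLASSPATH_LOOKUP="true"` the Hadoop entry never makes it onto the classpath, so only one binding remains and SLF4J stays quiet.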
[jira] [Comment Edited] (HBASE-11408) "multiple SLF4J bindings" warning messages when running HBase shell
[ https://issues.apache.org/jira/browse/HBASE-11408?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17365607#comment-17365607 ] Stefan Miklosovic edited comment on HBASE-11408 at 6/18/21, 4:50 PM: - It helps when you set this in hbase-env.sh (holds for 2.2.6): {code:java} # Tell HBase whether it should include Hadoop's lib when starting up; # the default value is false, which means Hadoop's lib is included. export HBASE_DISABLE_HADOOP_CLASSPATH_LOOKUP="true" {code} This prevents Hadoop's classpath from being added to HBase's classpath. I am not sure what having Hadoop's classpath is good for ... everything seems to work as before. > "multiple SLF4J bindings" warning messages when running HBase shell > --- > > Key: HBASE-11408 > URL: https://issues.apache.org/jira/browse/HBASE-11408 > Project: HBase > Issue Type: Bug >Affects Versions: 0.98.2, 0.98.3 >Reporter: Duo Xu >Priority: Minor > > When running hbase shell, we saw warnings like this: > SLF4J: Class path contains multiple SLF4J bindings. > SLF4J: Found binding in > [jar:file:/C:/apps/dist/hbase-0.98.0.2.1.3.0-1928-hadoop2/lib/slf4j-log4j12-1.6.4.jar!/org/slf4j/impl/StaticLoggerBinder.class] > SLF4J: Found binding in > [jar:file:/C:/apps/dist/hadoop-2.4.0.2.1.3.0-1928/share/hadoop/common/lib/slf4j-log4j12-1.7.5.jar!/org/slf4j/impl/StaticLoggerBinder.class] > SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an > explanation. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Resolved] (HBASE-26005) Update ref guide about the EOL for 2.2.x
[ https://issues.apache.org/jira/browse/HBASE-26005?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Duo Zhang resolved HBASE-26005. --- Hadoop Flags: Reviewed Resolution: Fixed Merged to master. Thanks [~GeorryHuang] for contributing. > Update ref guide about the EOL for 2.2.x > > > Key: HBASE-26005 > URL: https://issues.apache.org/jira/browse/HBASE-26005 > Project: HBase > Issue Type: Sub-task > Components: documentation >Reporter: Duo Zhang >Assignee: Zhuoyue Huang >Priority: Major > Fix For: 3.0.0-alpha-1 > > > For example, remove the release manager for 2.2.x, and also update the > compatibility matrix with hadoop, etc. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (HBASE-26005) Update ref guide about the EOL for 2.2.x
[ https://issues.apache.org/jira/browse/HBASE-26005?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Duo Zhang updated HBASE-26005: -- Fix Version/s: 3.0.0-alpha-1 > Update ref guide about the EOL for 2.2.x > > > Key: HBASE-26005 > URL: https://issues.apache.org/jira/browse/HBASE-26005 > Project: HBase > Issue Type: Sub-task > Components: documentation >Reporter: Duo Zhang >Assignee: Zhuoyue Huang >Priority: Major > Fix For: 3.0.0-alpha-1 > > > For example, remove the release manager for 2.2.x, and also update the > compatibility matrix with hadoop, etc. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Created] (HBASE-26015) Should implement getRegionServers(boolean) method in AsyncAdmin
Duo Zhang created HBASE-26015: - Summary: Should implement getRegionServers(boolean) method in AsyncAdmin Key: HBASE-26015 URL: https://issues.apache.org/jira/browse/HBASE-26015 Project: HBase Issue Type: Task Components: Admin, Client Reporter: Duo Zhang We have this method in Admin but not in AsyncAdmin; we should align these two interfaces. -- This message was sent by Atlassian Jira (v8.3.4#803005)
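A hypothetical sketch of what the alignment could look like. The names mirror the Admin API, but none of this is the actual HBase interface; in particular, the boolean flag is assumed here to mean "exclude decommissioned region servers", and the no-arg variant simply delegates with the default behavior:

```java
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.CompletableFuture;

// Hypothetical sketch only (NOT the real HBase AsyncAdmin): one way to align
// AsyncAdmin with Admin is a boolean overload plus a default no-arg method
// that delegates to it, so both interfaces expose the same pair of calls.
interface AsyncAdminSketch {
    CompletableFuture<List<String>> getRegionServers(boolean excludeDecommissionedRS);

    // Default behavior: include every region server, decommissioned or not.
    default CompletableFuture<List<String>> getRegionServers() {
        return getRegionServers(false);
    }
}

class AsyncAdminSketchDemo {
    public static void main(String[] args) {
        // Toy implementation backed by a fixed server list (names invented).
        AsyncAdminSketch admin = exclude ->
            CompletableFuture.completedFuture(Arrays.asList("rs1:16020", "rs2:16020"));
        System.out.println(admin.getRegionServers().join());  // prints [rs1:16020, rs2:16020]
    }
}
```

A default method keeps the change source-compatible for existing implementers: only the boolean variant is abstract, matching the single new method the ticket asks for.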
[jira] [Updated] (HBASE-26014) ServiceLoader usages should not be tied to Thread Context Classloader
[ https://issues.apache.org/jira/browse/HBASE-26014?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andrei Lopukhov updated HBASE-26014: Summary: ServiceLoader usages should not be tied to Thread Context Classloader (was: ServiceLoader usages are tied to Thread Context Classloader) > ServiceLoader usages should not be tied to Thread Context Classloader > - > > Key: HBASE-26014 > URL: https://issues.apache.org/jira/browse/HBASE-26014 > Project: HBase > Issue Type: Improvement >Reporter: Andrei Lopukhov >Priority: Major > > Classes which use the ServiceLoader facility do not specify which ClassLoader to use. > For hbase-client 2.4.1 they are (at least): > * SaslClientAuthenticationProviders > * MetricRegistries > When hbase libraries are loaded dynamically and the Thread Context Classloader is > not set, SaslClientAuthenticationProviders instantiation fails because it > can't find the default providers. > Some proposals for a classloader selection strategy (usage dependent, I guess): > * Use the classloader specified in the Configuration instance. > * Use the classloader which loaded the specific hbase class. > * Combine them: use the classloader from Configuration if present and fall back > to the classloader which loaded the specific hbase class. > Real-world requirement example: currently we are migrating from hbase 1 to hbase > 2. For better compatibility and smooth migration we try to build an abstraction > around the hbase client libraries and isolate them with custom classloaders. To > work around problems with the context classloader, we must either wrap all calls to > use the proper context classloader or explicitly trigger initialization of the > affected classes (SaslClientAuthenticationProviders), which are marked private, > under the proper context classloader. It would be better if the hbase client > took care of this itself. -- This message was sent by Atlassian Jira (v8.3.4#803005)
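The fallback strategy proposed in the ticket can be sketched as follows. This is illustrative code, not HBase's actual implementation: it prefers the thread context classloader when one is set and otherwise falls back to the loader that loaded the given class, then passes that loader explicitly to ServiceLoader rather than relying on the one-argument `ServiceLoader.load`, which is tied to the context classloader.

```java
import java.util.ServiceLoader;

// Illustrative sketch of the classloader-selection fallback (not HBase's
// actual code): prefer the thread context classloader when it is set,
// otherwise fall back to the loader that loaded the given class, so service
// providers can still be found when the libraries are loaded dynamically
// and no context classloader was installed.
class ClassLoaderPicker {
    static ClassLoader pick(Class<?> fallbackOwner) {
        ClassLoader cl = Thread.currentThread().getContextClassLoader();
        return cl != null ? cl : fallbackOwner.getClassLoader();
    }

    // Load providers with an explicit loader instead of the one-argument
    // ServiceLoader.load, which implicitly uses the context classloader.
    static <T> ServiceLoader<T> load(Class<T> service, Class<?> owner) {
        return ServiceLoader.load(service, pick(owner));
    }
}
```

The two-argument `ServiceLoader.load(Class, ClassLoader)` overload exists exactly for this situation; the sketch only adds the fallback decision in front of it.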
[jira] [Updated] (HBASE-26014) ServiceLoader usages are tied to Thread Context Classloader
[ https://issues.apache.org/jira/browse/HBASE-26014?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andrei Lopukhov updated HBASE-26014: Description: Classes which use the ServiceLoader facility do not specify which ClassLoader to use. For hbase-client 2.4.1 they are (at least): * SaslClientAuthenticationProviders * MetricRegistries When hbase libraries are loaded dynamically and the Thread Context Classloader is not set, SaslClientAuthenticationProviders instantiation fails because it can't find the default providers. Some proposals for a classloader selection strategy (usage dependent, I guess): * Use the classloader specified in the Configuration instance. * Use the classloader which loaded the specific hbase class. * Combine them: use the classloader from Configuration if present and fall back to the classloader which loaded the specific hbase class. Real-world requirement example: currently we are migrating from hbase 1 to hbase 2. For better compatibility and smooth migration we try to build an abstraction around the hbase client libraries and isolate them with custom classloaders. To work around problems with the context classloader, we must either wrap all calls to use the proper context classloader or explicitly trigger initialization of the affected classes (SaslClientAuthenticationProviders), which are marked private, under the proper context classloader. It would be better if the hbase client took care of this itself. was: Classes which use the ServiceLoader facility do not specify which ClassLoader to use. For hbase-client 2.4.1 they are (at least): * SaslClientAuthenticationProviders * MetricRegistries When hbase libraries are loaded dynamically and the Thread Context Classloader is not set, SaslClientAuthenticationProviders instantiation fails because it can't find the default providers. Some proposals for a classloader selection strategy (usage dependent, I guess): * Use the classloader specified in the Configuration instance. 
* Use the classloader which loaded the specific hbase class. * Combine them: use the classloader from Configuration if present and fall back to the classloader which loaded the specific hbase class. Real-world example: currently we are migrating from hbase 1 to hbase 2. For better compatibility and smooth migration we try to build an abstraction around the hbase client libraries and isolate them with custom classloaders. > ServiceLoader usages are tied to Thread Context Classloader > --- > > Key: HBASE-26014 > URL: https://issues.apache.org/jira/browse/HBASE-26014 > Project: HBase > Issue Type: Improvement >Reporter: Andrei Lopukhov >Priority: Major > > Classes which use the ServiceLoader facility do not specify which ClassLoader to use. > For hbase-client 2.4.1 they are (at least): > * SaslClientAuthenticationProviders > * MetricRegistries > When hbase libraries are loaded dynamically and the Thread Context Classloader is > not set, SaslClientAuthenticationProviders instantiation fails because it > can't find the default providers. > Some proposals for a classloader selection strategy (usage dependent, I guess): > * Use the classloader specified in the Configuration instance. > * Use the classloader which loaded the specific hbase class. > * Combine them: use the classloader from Configuration if present and fall back > to the classloader which loaded the specific hbase class. > Real-world requirement example: currently we are migrating from hbase 1 to hbase > 2. For better compatibility and smooth migration we try to build an abstraction > around the hbase client libraries and isolate them with custom classloaders. To > work around problems with the context classloader, we must either wrap all calls to > use the proper context classloader or explicitly trigger initialization of the > affected classes (SaslClientAuthenticationProviders), which are marked private, > under the proper context classloader. It would be better if the hbase client > took care of this itself. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Created] (HBASE-26014) ServiceLoader usages are tied to Thread Context Classloader
Andrei Lopukhov created HBASE-26014: --- Summary: ServiceLoader usages are tied to Thread Context Classloader Key: HBASE-26014 URL: https://issues.apache.org/jira/browse/HBASE-26014 Project: HBase Issue Type: Improvement Reporter: Andrei Lopukhov Classes that use the ServiceLoader facility do not specify which ClassLoader to use. For hbase-client 2.4.1 they include (at least): * SaslClientAuthenticationProviders * MetricRegistries When the hbase libraries are loaded dynamically and the Thread Context Classloader is not set, SaslClientAuthenticationProviders instantiation fails because it cannot find the default providers. Some proposals for a classloader selection strategy (likely usage-dependent): * Use the classloader specified in the Configuration instance. * Use the classloader that loaded the specific hbase class. * Combine them: use the classloader from Configuration if present, and fall back to the classloader that loaded the specific hbase class. Real-world example: we are currently migrating from hbase 1 to hbase 2. For better compatibility and a smooth migration we are building an abstraction around the hbase client libraries and isolating them with custom classloaders.
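The combined fallback proposed in the description (prefer an explicitly configured classloader, otherwise use the classloader that loaded the hbase class, never the thread context classloader) can be sketched as follows. This is a minimal sketch; `ClassLoaderSelection` and `pickLoader` are hypothetical names for illustration, not HBase API:

```java
import java.util.ServiceLoader;

public class ClassLoaderSelection {
    // Prefer a classloader supplied explicitly (e.g. via a Configuration-like
    // object); fall back to the loader that defined this class. The thread
    // context classloader is deliberately never consulted, so the lookup works
    // even when hbase jars are loaded by a custom, isolated classloader.
    static ClassLoader pickLoader(ClassLoader configured) {
        return configured != null ? configured : ClassLoaderSelection.class.getClassLoader();
    }

    // ServiceLoader.load(Class, ClassLoader) lets callers pin the loader
    // instead of relying on ServiceLoader.load(Class), which uses the TCCL.
    static <T> ServiceLoader<T> load(Class<T> service, ClassLoader configured) {
        return ServiceLoader.load(service, pickLoader(configured));
    }

    public static void main(String[] args) {
        // With no configured loader, we fall back to the defining class's loader.
        System.out.println(pickLoader(null) == ClassLoaderSelection.class.getClassLoader());
    }
}
```

With this shape, a dynamically loaded hbase-client would resolve providers such as SaslClientAuthenticationProviders against its own jar's classloader even when the caller's thread context classloader is unset.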
[jira] [Updated] (HBASE-26013) Get operations readRows metrics becomes zero after HBASE-25677
[ https://issues.apache.org/jira/browse/HBASE-26013?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yutong Xiao updated HBASE-26013: Description: After HBASE-25677, the server+table counters for each scan were moved from #nextRaw to the rsServices scan. As a result, Get operations no longer count the rows they read, so the readRows metric becomes zero. A counter should be added in metricsUpdateForGet. (was: After HBASE-25677, the server+table counters for each scan were moved from #nextRaw to the rsServices-level scan. As a result, Get operations no longer count the rows they read, so the readRows metric becomes zero. A counter should be added in metricsUpdateForGet.) > Get operations readRows metrics becomes zero after HBASE-25677 > -- > > Key: HBASE-26013 > URL: https://issues.apache.org/jira/browse/HBASE-26013 > Project: HBase > Issue Type: Bug >Reporter: Yutong Xiao >Assignee: Yutong Xiao >Priority: Major > > After HBASE-25677, the server+table counters for each scan were moved from > #nextRaw to the rsServices scan. As a result, Get operations no longer count > the rows they read, so the readRows metric becomes zero. A counter should be > added in metricsUpdateForGet.
[jira] [Updated] (HBASE-26013) Get operations readRows metrics becomes zero after HBASE-25677
[ https://issues.apache.org/jira/browse/HBASE-26013?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yutong Xiao updated HBASE-26013: Description: After HBASE-25677, the server+table counters for each scan were moved from #nextRaw to the rsServices-level scan. As a result, Get operations no longer count the rows they read, so the readRows metric becomes zero. A counter should be added in metricsUpdateForGet. (was: ) > Get operations readRows metrics becomes zero after HBASE-25677 > -- > > Key: HBASE-26013 > URL: https://issues.apache.org/jira/browse/HBASE-26013 > Project: HBase > Issue Type: Bug >Reporter: Yutong Xiao >Assignee: Yutong Xiao >Priority: Major > > After HBASE-25677, the server+table counters for each scan were moved from > #nextRaw to the rsServices-level scan. As a result, Get operations no longer > count the rows they read, so the readRows metric becomes zero. A counter should > be added in metricsUpdateForGet.
[jira] [Updated] (HBASE-26013) Get operations readRows metrics becomes zero after HBASE-25677
[ https://issues.apache.org/jira/browse/HBASE-26013?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yutong Xiao updated HBASE-26013: Description: was: After HBASE-25677, the method in HRegion.java {code:java} void metricsUpdateForGet(List<Cell> results, long before) { if (this.metricsRegion != null) { this.metricsRegion.updateGet(EnvironmentEdgeManager.currentTime() - before); } } {code} does not update the regionserver-level metrics. > Get operations readRows metrics becomes zero after HBASE-25677 > -- > > Key: HBASE-26013 > URL: https://issues.apache.org/jira/browse/HBASE-26013 > Project: HBase > Issue Type: Bug >Reporter: Yutong Xiao >Assignee: Yutong Xiao >Priority: Major >
[jira] [Updated] (HBASE-26013) Get operations readRows metrics becomes zero after HBASE-25677
[ https://issues.apache.org/jira/browse/HBASE-26013?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yutong Xiao updated HBASE-26013: Description: After HBASE-25677, the method in HRegion.java {code:java} void metricsUpdateForGet(List<Cell> results, long before) { if (this.metricsRegion != null) { this.metricsRegion.updateGet(EnvironmentEdgeManager.currentTime() - before); } } {code} does not update the regionserver-level metrics. > Get operations readRows metrics becomes zero after HBASE-25677 > -- > > Key: HBASE-26013 > URL: https://issues.apache.org/jira/browse/HBASE-26013 > Project: HBase > Issue Type: Bug >Reporter: Yutong Xiao >Assignee: Yutong Xiao >Priority: Major > > After HBASE-25677, the method in HRegion.java > {code:java} > void metricsUpdateForGet(List<Cell> results, long before) { > if (this.metricsRegion != null) { > this.metricsRegion.updateGet(EnvironmentEdgeManager.currentTime() - > before); > } > } > {code} > does not update the regionserver-level metrics.
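The fix suggested in the description, counting read rows for Gets as well as scans, could look roughly like the following. This is a standalone sketch, not actual HBase internals: `readRowsCounter` and the simplified signature stand in for the regionserver-level metrics plumbing, and the region-level latency update is elided.

```java
import java.util.List;
import java.util.concurrent.atomic.LongAdder;

public class GetMetricsSketch {
    // Stand-in for the regionserver-level readRows metric (assumption, not HBase code).
    static final LongAdder readRowsCounter = new LongAdder();

    // Simplified metricsUpdateForGet: besides the region-level latency update
    // (this.metricsRegion.updateGet(...), omitted here), it also counts the row
    // read by the Get, which is the piece the report says went missing after
    // HBASE-25677 moved the scan-side counters.
    static void metricsUpdateForGet(List<?> results, long before) {
        if (!results.isEmpty()) {
            readRowsCounter.increment(); // a Get that returned data read one row
        }
    }

    public static void main(String[] args) {
        long t = System.currentTimeMillis();
        metricsUpdateForGet(List.of("cell1", "cell2"), t); // non-empty result counts
        metricsUpdateForGet(List.of(), t);                 // empty result does not
        System.out.println(readRowsCounter.sum());         // prints 1
    }
}
```

A `LongAdder` is used here because, like HBase's metric counters, it tolerates concurrent updates from many handler threads without contention on a single lock.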
[jira] [Created] (HBASE-26013) Get operations readRows metrics becomes zero after HBASE-25677
Yutong Xiao created HBASE-26013: --- Summary: Get operations readRows metrics becomes zero after HBASE-25677 Key: HBASE-26013 URL: https://issues.apache.org/jira/browse/HBASE-26013 Project: HBase Issue Type: Bug Reporter: Yutong Xiao Assignee: Yutong Xiao