[jira] [Commented] (HBASE-26005) Update ref guide about the EOL for 2.2.x

2021-06-18 Thread Hudson (Jira)


[ 
https://issues.apache.org/jira/browse/HBASE-26005?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17365848#comment-17365848
 ] 

Hudson commented on HBASE-26005:


Results for branch master
[build #327 on 
builds.a.o|https://ci-hadoop.apache.org/job/HBase/job/HBase%20Nightly/job/master/327/]:
 (x) *{color:red}-1 overall{color}*

details (if available):

(/) {color:green}+1 general checks{color}
-- For more information [see general 
report|https://ci-hadoop.apache.org/job/HBase/job/HBase%20Nightly/job/master/327/General_20Nightly_20Build_20Report/]






(x) {color:red}-1 jdk8 hadoop3 checks{color}
-- For more information [see jdk8 (hadoop3) 
report|https://ci-hadoop.apache.org/job/HBase/job/HBase%20Nightly/job/master/326/JDK8_20Nightly_20Build_20Report_20_28Hadoop3_29/]


(x) {color:red}-1 jdk11 hadoop3 checks{color}
-- For more information [see jdk11 
report|https://ci-hadoop.apache.org/job/HBase/job/HBase%20Nightly/job/master/326/JDK11_20Nightly_20Build_20Report_20_28Hadoop3_29/]


(/) {color:green}+1 source release artifact{color}
-- See build output for details.


(/) {color:green}+1 client integration test{color}


> Update ref guide about the EOL for 2.2.x
> 
>
> Key: HBASE-26005
> URL: https://issues.apache.org/jira/browse/HBASE-26005
> Project: HBase
>  Issue Type: Sub-task
>  Components: documentation
>Reporter: Duo Zhang
>Assignee: Zhuoyue Huang
>Priority: Major
> Fix For: 3.0.0-alpha-1
>
>
> For example, remove the release manager for 2.2.x, and also update the 
> compatibility matrix with hadoop, etc.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (HBASE-26016) HFilePrettyPrinter tool can not print the last LEAF_INDEX block or BLOOM_CHUNK.

2021-06-18 Thread dingwei2019 (Jira)


 [ 
https://issues.apache.org/jira/browse/HBASE-26016?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

dingwei2019 updated HBASE-26016:

Attachment: HBASE-26016-prettyPrintTool-1.patch

> HFilePrettyPrinter tool can not print the last LEAF_INDEX block or 
> BLOOM_CHUNK.
> ---
>
> Key: HBASE-26016
> URL: https://issues.apache.org/jira/browse/HBASE-26016
> Project: HBase
>  Issue Type: Bug
>  Components: regionserver
>Affects Versions: 2.1.0, 2.3.2, 2.4.4
>Reporter: dingwei2019
>Assignee: dingwei2019
>Priority: Minor
> Attachments: HBASE-26016-prettyPrintTool-1.patch
>
>
> When I use the pretty printer tool to print the block headers, I cannot get 
> the last LEAF_INDEX block or BLOOM_CHUNK. The last output of the tool is below:
>  



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (HBASE-26016) HFilePrettyPrinter tool can not print the last LEAF_INDEX block or BLOOM_CHUNK.

2021-06-18 Thread dingwei2019 (Jira)


 [ 
https://issues.apache.org/jira/browse/HBASE-26016?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

dingwei2019 updated HBASE-26016:

Description: 
When I use the pretty printer tool to print the block headers, I cannot get the 
last LEAF_INDEX block or BLOOM_CHUNK. The last output of the tool is below:

 

  was:
When I use the pretty printer tool to print the block headers, I cannot get the 
last LEAF_INDEX block or BLOOM_CHUNK. The last output of the tool is below:

 

 _[blockType=DATA, fileOffset=246617457, headerSize=33, 
onDiskSizeWithoutHeader=66485, uncompressedSizeWithoutHeader=66465, 
prevBlockOffset=246550939, isUseHBaseChecksum=true, checksumType=CRC32C, 
bytesPerChecksum=16384, onDiskDataSizeWithHeader=66498, 
getOnDiskSizeWithHeader=66518, totalChecksumBytes=20, isUnpacked=true, 
buf=[SingleByteBuff[pos=0, lim=66518, cap= 66551]], 
dataBeginsWith=\x00\x00\x00,\x00\x00\x03\xE8\x00\x1A000816, 
fileContext=[usesHBaseChecksum=true, checksumType=CRC32C, 
bytesPerChecksum=16384, blocksize=65536, encoding=NONE, includesMvcc=true, 
includesTags=false, compressAlgo=NONE, compressTags=false, 
cryptoContext=[cipher=NONE keyHash=NONE], 
name=0d519e7318414362a56e4f41bf63ccd4, 
cellComparator=org.apache.hadoop.hbase.CellComparatorImpl@38b972d7], 
nextBlockOnDiskSize=66518]_
_[blockType=DATA, fileOffset=246683975, headerSize=33, 
onDiskSizeWithoutHeader=66485, uncompressedSizeWithoutHeader=66465, 
prevBlockOffset=246617457, isUseHBaseChecksum=true, checksumType=CRC32C, 
bytesPerChecksum=16384, onDiskDataSizeWithHeader=66498, 
getOnDiskSizeWithHeader=66518, totalChecksumBytes=20, isUnpacked=true, 
buf=[SingleByteBuff[pos=0, lim=66518, cap= 66551]], 
dataBeginsWith=\x00\x00\x00,\x00\x00\x03\xE8\x00\x1A000816, 
fileContext=[usesHBaseChecksum=true, checksumType=CRC32C, 
bytesPerChecksum=16384, blocksize=65536, encoding=NONE, includesMvcc=true, 
includesTags=false, compressAlgo=NONE, compressTags=false, 
cryptoContext=[cipher=NONE keyHash=NONE], 
name=0d519e7318414362a56e4f41bf63ccd4, 
cellComparator=org.apache.hadoop.hbase.CellComparatorImpl@38b972d7], 
nextBlockOnDiskSize=51744]_
_[blockType=DATA, fileOffset=246750493, headerSize=33, 
onDiskSizeWithoutHeader=51711, uncompressedSizeWithoutHeader=51695, 
prevBlockOffset=246683975, isUseHBaseChecksum=true, checksumType=CRC32C, 
bytesPerChecksum=16384, onDiskDataSizeWithHeader=51728, 
getOnDiskSizeWithHeader=51744, totalChecksumBytes=16, isUnpacked=true, 
buf=[SingleByteBuff[pos=0, lim=51744, cap= 51777]], 
dataBeginsWith=\x00\x00\x00,\x00\x00\x03\xE8\x00\x1A000816, 
fileContext=[usesHBaseChecksum=true, checksumType=CRC32C, 
bytesPerChecksum=16384, blocksize=65536, encoding=NONE, includesMvcc=true, 
includesTags=false, compressAlgo=NONE, compressTags=false, 
cryptoContext=[cipher=NONE keyHash=NONE], 
name=0d519e7318414362a56e4f41bf63ccd4, 
cellComparator=org.apache.hadoop.hbase.CellComparatorImpl@38b972d7], 
nextBlockOnDiskSize=20585]_


> HFilePrettyPrinter tool can not print the last LEAF_INDEX block or 
> BLOOM_CHUNK.
> ---
>
> Key: HBASE-26016
> URL: https://issues.apache.org/jira/browse/HBASE-26016
> Project: HBase
>  Issue Type: Bug
>  Components: regionserver
>Affects Versions: 2.1.0, 2.3.2, 2.4.4
>Reporter: dingwei2019
>Assignee: dingwei2019
>Priority: Minor
>
> When I use the pretty printer tool to print the block headers, I cannot get 
> the last LEAF_INDEX block or BLOOM_CHUNK. The last output of the tool is below:
>  



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Comment Edited] (HBASE-26016) HFilePrettyPrinter tool can not print the last LEAF_INDEX block or BLOOM_CHUNK.

2021-06-18 Thread dingwei2019 (Jira)


[ 
https://issues.apache.org/jira/browse/HBASE-26016?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17365842#comment-17365842
 ] 

dingwei2019 edited comment on HBASE-26016 at 6/19/21, 3:51 AM:
---

The problem is that the max offset is equal to the start of the last data block, 
but the last LEAF_INDEX and BLOOM_CHUNK blocks sit just after the last data block.

{code:java}
long offset = trailer.getFirstDataBlockOffset(),
    max = trailer.getLastDataBlockOffset();
HFileBlock block;
while (offset <= max) {
  block = reader.readBlock(offset, -1, /* cacheBlock */ false, /* pread */ false,
      /* isCompaction */ false, /* updateCacheMetrics */ false, null, null);
  offset += block.getOnDiskSizeWithHeader();
  out.println(block);
}
{code}

 

So it is better to adjust max to the beginning of the load-on-open block section; see below:

{code:java}
long offset = trailer.getFirstDataBlockOffset(),
    max = trailer.getLoadOnOpenDataOffset();  // changed: was trailer.getLastDataBlockOffset()
HFileBlock block;
while (offset < max) {                        // changed: was offset <= max
  block = reader.readBlock(offset, -1, /* cacheBlock */ false, /* pread */ false,
      /* isCompaction */ false, /* updateCacheMetrics */ false, null, null);
  offset += block.getOnDiskSizeWithHeader();
  out.println(block);
}
{code}
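
For reference, here is the corrected loop wrapped into a self-contained sketch. This is an illustration only: the HFile.createReader overload, CacheConfig.DISABLED, and the PrintAllBlockHeaders wrapper are assumptions that may differ between branches, while the bound and the readBlock call follow the snippet above.

{code:java}
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.hbase.io.hfile.CacheConfig;
import org.apache.hadoop.hbase.io.hfile.FixedFileTrailer;
import org.apache.hadoop.hbase.io.hfile.HFile;
import org.apache.hadoop.hbase.io.hfile.HFileBlock;

public final class PrintAllBlockHeaders {
  // Prints every block header up to the load-on-open section, so the trailing
  // LEAF_INDEX and BLOOM_CHUNK blocks after the last data block are included.
  public static void printBlockHeaders(Configuration conf, Path hfilePath) throws Exception {
    FileSystem fs = hfilePath.getFileSystem(conf);
    try (HFile.Reader reader =
        HFile.createReader(fs, hfilePath, CacheConfig.DISABLED, true, conf)) {
      FixedFileTrailer trailer = reader.getTrailer();
      long offset = trailer.getFirstDataBlockOffset();
      long max = trailer.getLoadOnOpenDataOffset(); // exclusive upper bound
      while (offset < max) {
        HFileBlock block = reader.readBlock(offset, -1, /* cacheBlock */ false,
            /* pread */ false, /* isCompaction */ false, /* updateCacheMetrics */ false,
            null, null);
        System.out.println(block);
        offset += block.getOnDiskSizeWithHeader();
      }
    }
  }
}
{code}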


was (Author: dingwei2019):
The problem is that the max offset is equal to the start of the last data block, 
but the last LEAF_INDEX and BLOOM_CHUNK blocks sit just after the last data block.

{code:java}
long offset = trailer.getFirstDataBlockOffset(),
    max = trailer.getLastDataBlockOffset();
HFileBlock block;
while (offset <= max) {
  block = reader.readBlock(offset, -1, /* cacheBlock */ false, /* pread */ false,
      /* isCompaction */ false, /* updateCacheMetrics */ false, null, null);
  offset += block.getOnDiskSizeWithHeader();
  out.println(block);
}
{code}

> HFilePrettyPrinter tool can not print the last LEAF_INDEX block or 
> BLOOM_CHUNK.
> ---
>
> Key: HBASE-26016
> URL: https://issues.apache.org/jira/browse/HBASE-26016
> Project: HBase
>  Issue Type: Bug
>  Components: regionserver
>Affects Versions: 2.1.0, 2.3.2, 2.4.4
>Reporter: dingwei2019
>Assignee: dingwei2019
>Priority: Minor
>
> When I use the pretty printer tool to print the block headers, I cannot get 
> the last LEAF_INDEX block or BLOOM_CHUNK. The last output of the tool is below:
>  
>  _[blockType=DATA, fileOffset=246617457, headerSize=33, 
> onDiskSizeWithoutHeader=66485, uncompressedSizeWithoutHeader=66465, 
> prevBlockOffset=246550939, isUseHBaseChecksum=true, checksumType=CRC32C, 
> bytesPerChecksum=16384, onDiskDataSizeWithHeader=66498, 
> getOnDiskSizeWithHeader=66518, totalChecksumBytes=20, isUnpacked=true, 
> buf=[SingleByteBuff[pos=0, lim=66518, cap= 66551]], 
> dataBeginsWith=\x00\x00\x00,\x00\x00\x03\xE8\x00\x1A000816, 
> fileContext=[usesHBaseChecksum=true, checksumType=CRC32C, 
> bytesPerChecksum=16384, blocksize=65536, encoding=NONE, includesMvcc=true, 
> includesTags=false, compressAlgo=NONE, compressTags=false, 
> cryptoContext=[cipher=NONE keyHash=NONE], 
> name=0d519e7318414362a56e4f41bf63ccd4, 
> cellComparator=org.apache.hadoop.hbase.CellComparatorImpl@38b972d7], 
> nextBlockOnDiskSize=66518]_
> _[blockType=DATA, fileOffset=246683975, headerSize=33, 
> onDiskSizeWithoutHeader=66485, uncompressedSizeWithoutHeader=66465, 
> prevBlockOffset=246617457, isUseHBaseChecksum=true, checksumType=CRC32C, 
> bytesPerChecksum=16384, onDiskDataSizeWithHeader=66498, 
> getOnDiskSizeWithHeader=66518, totalChecksumBytes=20, isUnpacked=true, 
> buf=[SingleByteBuff[pos=0, lim=66518, cap= 66551]], 
> dataBeginsWith=\x00\x00\x00,\x00\x00\x03\xE8\x00\x1A000816, 
> fileContext=[usesHBaseChecksum=true, checksumType=CRC32C, 
> bytesPerChecksum=16384, blocksize=65536, encoding=NONE, includesMvcc=true, 
> includesTags=false, compressAlgo=NONE, compressTags=false, 
> cryptoContext=[cipher=NONE keyHash=NONE], 
> name=0d519e7318414362a56e4f41bf63ccd4, 
> cellComparator=org.apache.hadoop.hbase.CellComparatorImpl@38b972d7], 
> nextBlockOnDiskSize=51744]_
> _[blockType=DATA, fileOffset=246750493, headerSize=33, 
> onDiskSizeWithoutHeader=51711, uncompressedSizeWithoutHeader=51695, 
> prevBlockOffset=246683975, isUseHBaseChecksum=true, checksumType=CRC32C, 
> bytesPerChecksum=16384, onDiskDataSizeWithHeader=51728, 
> 

[jira] [Commented] (HBASE-26016) HFilePrettyPrinter tool can not print the last LEAF_INDEX block or BLOOM_CHUNK.

2021-06-18 Thread dingwei2019 (Jira)


[ 
https://issues.apache.org/jira/browse/HBASE-26016?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17365842#comment-17365842
 ] 

dingwei2019 commented on HBASE-26016:
-

The problem is that the max offset is equal to the start of the last data block, 
but the last LEAF_INDEX and BLOOM_CHUNK blocks sit just after the last data block.

{code:java}
long offset = trailer.getFirstDataBlockOffset(),
    max = trailer.getLastDataBlockOffset();
HFileBlock block;
while (offset <= max) {
  block = reader.readBlock(offset, -1, /* cacheBlock */ false, /* pread */ false,
      /* isCompaction */ false, /* updateCacheMetrics */ false, null, null);
  offset += block.getOnDiskSizeWithHeader();
  out.println(block);
}
{code}

> HFilePrettyPrinter tool can not print the last LEAF_INDEX block or 
> BLOOM_CHUNK.
> ---
>
> Key: HBASE-26016
> URL: https://issues.apache.org/jira/browse/HBASE-26016
> Project: HBase
>  Issue Type: Bug
>  Components: regionserver
>Affects Versions: 2.1.0, 2.3.2, 2.4.4
>Reporter: dingwei2019
>Assignee: dingwei2019
>Priority: Minor
>
> When I use the pretty printer tool to print the block headers, I cannot get 
> the last LEAF_INDEX block or BLOOM_CHUNK. The last output of the tool is below:
>  
>  _[blockType=DATA, fileOffset=246617457, headerSize=33, 
> onDiskSizeWithoutHeader=66485, uncompressedSizeWithoutHeader=66465, 
> prevBlockOffset=246550939, isUseHBaseChecksum=true, checksumType=CRC32C, 
> bytesPerChecksum=16384, onDiskDataSizeWithHeader=66498, 
> getOnDiskSizeWithHeader=66518, totalChecksumBytes=20, isUnpacked=true, 
> buf=[SingleByteBuff[pos=0, lim=66518, cap= 66551]], 
> dataBeginsWith=\x00\x00\x00,\x00\x00\x03\xE8\x00\x1A000816, 
> fileContext=[usesHBaseChecksum=true, checksumType=CRC32C, 
> bytesPerChecksum=16384, blocksize=65536, encoding=NONE, includesMvcc=true, 
> includesTags=false, compressAlgo=NONE, compressTags=false, 
> cryptoContext=[cipher=NONE keyHash=NONE], 
> name=0d519e7318414362a56e4f41bf63ccd4, 
> cellComparator=org.apache.hadoop.hbase.CellComparatorImpl@38b972d7], 
> nextBlockOnDiskSize=66518]_
> _[blockType=DATA, fileOffset=246683975, headerSize=33, 
> onDiskSizeWithoutHeader=66485, uncompressedSizeWithoutHeader=66465, 
> prevBlockOffset=246617457, isUseHBaseChecksum=true, checksumType=CRC32C, 
> bytesPerChecksum=16384, onDiskDataSizeWithHeader=66498, 
> getOnDiskSizeWithHeader=66518, totalChecksumBytes=20, isUnpacked=true, 
> buf=[SingleByteBuff[pos=0, lim=66518, cap= 66551]], 
> dataBeginsWith=\x00\x00\x00,\x00\x00\x03\xE8\x00\x1A000816, 
> fileContext=[usesHBaseChecksum=true, checksumType=CRC32C, 
> bytesPerChecksum=16384, blocksize=65536, encoding=NONE, includesMvcc=true, 
> includesTags=false, compressAlgo=NONE, compressTags=false, 
> cryptoContext=[cipher=NONE keyHash=NONE], 
> name=0d519e7318414362a56e4f41bf63ccd4, 
> cellComparator=org.apache.hadoop.hbase.CellComparatorImpl@38b972d7], 
> nextBlockOnDiskSize=51744]_
> _[blockType=DATA, fileOffset=246750493, headerSize=33, 
> onDiskSizeWithoutHeader=51711, uncompressedSizeWithoutHeader=51695, 
> prevBlockOffset=246683975, isUseHBaseChecksum=true, checksumType=CRC32C, 
> bytesPerChecksum=16384, onDiskDataSizeWithHeader=51728, 
> getOnDiskSizeWithHeader=51744, totalChecksumBytes=16, isUnpacked=true, 
> buf=[SingleByteBuff[pos=0, lim=51744, cap= 51777]], 
> dataBeginsWith=\x00\x00\x00,\x00\x00\x03\xE8\x00\x1A000816, 
> fileContext=[usesHBaseChecksum=true, checksumType=CRC32C, 
> bytesPerChecksum=16384, blocksize=65536, encoding=NONE, includesMvcc=true, 
> includesTags=false, compressAlgo=NONE, compressTags=false, 
> cryptoContext=[cipher=NONE keyHash=NONE], 
> name=0d519e7318414362a56e4f41bf63ccd4, 
> cellComparator=org.apache.hadoop.hbase.CellComparatorImpl@38b972d7], 
> nextBlockOnDiskSize=20585]_



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (HBASE-26016) HFilePrettyPrinter tool can not print the last LEAF_INDEX block or BLOOM_CHUNK.

2021-06-18 Thread dingwei2019 (Jira)


 [ 
https://issues.apache.org/jira/browse/HBASE-26016?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

dingwei2019 updated HBASE-26016:

Description: 
When I use the pretty printer tool to print the block headers, I cannot get the 
last LEAF_INDEX block or BLOOM_CHUNK. The last output of the tool is below:

 

 _[blockType=DATA, fileOffset=246617457, headerSize=33, 
onDiskSizeWithoutHeader=66485, uncompressedSizeWithoutHeader=66465, 
prevBlockOffset=246550939, isUseHBaseChecksum=true, checksumType=CRC32C, 
bytesPerChecksum=16384, onDiskDataSizeWithHeader=66498, 
getOnDiskSizeWithHeader=66518, totalChecksumBytes=20, isUnpacked=true, 
buf=[SingleByteBuff[pos=0, lim=66518, cap= 66551]], 
dataBeginsWith=\x00\x00\x00,\x00\x00\x03\xE8\x00\x1A000816, 
fileContext=[usesHBaseChecksum=true, checksumType=CRC32C, 
bytesPerChecksum=16384, blocksize=65536, encoding=NONE, includesMvcc=true, 
includesTags=false, compressAlgo=NONE, compressTags=false, 
cryptoContext=[cipher=NONE keyHash=NONE], 
name=0d519e7318414362a56e4f41bf63ccd4, 
cellComparator=org.apache.hadoop.hbase.CellComparatorImpl@38b972d7], 
nextBlockOnDiskSize=66518]_
_[blockType=DATA, fileOffset=246683975, headerSize=33, 
onDiskSizeWithoutHeader=66485, uncompressedSizeWithoutHeader=66465, 
prevBlockOffset=246617457, isUseHBaseChecksum=true, checksumType=CRC32C, 
bytesPerChecksum=16384, onDiskDataSizeWithHeader=66498, 
getOnDiskSizeWithHeader=66518, totalChecksumBytes=20, isUnpacked=true, 
buf=[SingleByteBuff[pos=0, lim=66518, cap= 66551]], 
dataBeginsWith=\x00\x00\x00,\x00\x00\x03\xE8\x00\x1A000816, 
fileContext=[usesHBaseChecksum=true, checksumType=CRC32C, 
bytesPerChecksum=16384, blocksize=65536, encoding=NONE, includesMvcc=true, 
includesTags=false, compressAlgo=NONE, compressTags=false, 
cryptoContext=[cipher=NONE keyHash=NONE], 
name=0d519e7318414362a56e4f41bf63ccd4, 
cellComparator=org.apache.hadoop.hbase.CellComparatorImpl@38b972d7], 
nextBlockOnDiskSize=51744]_
_[blockType=DATA, fileOffset=246750493, headerSize=33, 
onDiskSizeWithoutHeader=51711, uncompressedSizeWithoutHeader=51695, 
prevBlockOffset=246683975, isUseHBaseChecksum=true, checksumType=CRC32C, 
bytesPerChecksum=16384, onDiskDataSizeWithHeader=51728, 
getOnDiskSizeWithHeader=51744, totalChecksumBytes=16, isUnpacked=true, 
buf=[SingleByteBuff[pos=0, lim=51744, cap= 51777]], 
dataBeginsWith=\x00\x00\x00,\x00\x00\x03\xE8\x00\x1A000816, 
fileContext=[usesHBaseChecksum=true, checksumType=CRC32C, 
bytesPerChecksum=16384, blocksize=65536, encoding=NONE, includesMvcc=true, 
includesTags=false, compressAlgo=NONE, compressTags=false, 
cryptoContext=[cipher=NONE keyHash=NONE], 
name=0d519e7318414362a56e4f41bf63ccd4, 
cellComparator=org.apache.hadoop.hbase.CellComparatorImpl@38b972d7], 
nextBlockOnDiskSize=20585]_

  was:
When I use the pretty printer tool to print the block headers, I cannot get the 
last LEAF_INDEX block or BLOOM_CHUNK. The last output of the tool is below:

 

 


> HFilePrettyPrinter tool can not print the last LEAF_INDEX block or 
> BLOOM_CHUNK.
> ---
>
> Key: HBASE-26016
> URL: https://issues.apache.org/jira/browse/HBASE-26016
> Project: HBase
>  Issue Type: Bug
>  Components: regionserver
>Affects Versions: 2.1.0, 2.3.2, 2.4.4
>Reporter: dingwei2019
>Assignee: dingwei2019
>Priority: Minor
>
> When I use the pretty printer tool to print the block headers, I cannot get 
> the last LEAF_INDEX block or BLOOM_CHUNK. The last output of the tool is below:
>  
>  _[blockType=DATA, fileOffset=246617457, headerSize=33, 
> onDiskSizeWithoutHeader=66485, uncompressedSizeWithoutHeader=66465, 
> prevBlockOffset=246550939, isUseHBaseChecksum=true, checksumType=CRC32C, 
> bytesPerChecksum=16384, onDiskDataSizeWithHeader=66498, 
> getOnDiskSizeWithHeader=66518, totalChecksumBytes=20, isUnpacked=true, 
> buf=[SingleByteBuff[pos=0, lim=66518, cap= 66551]], 
> dataBeginsWith=\x00\x00\x00,\x00\x00\x03\xE8\x00\x1A000816, 
> fileContext=[usesHBaseChecksum=true, checksumType=CRC32C, 
> bytesPerChecksum=16384, blocksize=65536, encoding=NONE, includesMvcc=true, 
> includesTags=false, compressAlgo=NONE, compressTags=false, 
> cryptoContext=[cipher=NONE keyHash=NONE], 
> name=0d519e7318414362a56e4f41bf63ccd4, 
> cellComparator=org.apache.hadoop.hbase.CellComparatorImpl@38b972d7], 
> nextBlockOnDiskSize=66518]_
> _[blockType=DATA, fileOffset=246683975, headerSize=33, 
> onDiskSizeWithoutHeader=66485, uncompressedSizeWithoutHeader=66465, 
> prevBlockOffset=246617457, isUseHBaseChecksum=true, checksumType=CRC32C, 
> bytesPerChecksum=16384, onDiskDataSizeWithHeader=66498, 
> getOnDiskSizeWithHeader=66518, totalChecksumBytes=20, isUnpacked=true, 
> buf=[SingleByteBuff[pos=0, lim=66518, cap= 66551]], 
> 

[jira] [Updated] (HBASE-26016) HFilePrettyPrinter tool can not print the last LEAF_INDEX block or BLOOM_CHUNK.

2021-06-18 Thread dingwei2019 (Jira)


 [ 
https://issues.apache.org/jira/browse/HBASE-26016?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

dingwei2019 updated HBASE-26016:

Description: 
When I use the pretty printer tool to print the block headers, I cannot get the 
last LEAF_INDEX block or BLOOM_CHUNK. The last output of the tool is below:

 

 

  was:
When I use the pretty printer tool to print the block headers, I cannot get the 
last LEAF_INDEX block or BLOOM_CHUNK. The last output of the tool is below:

!errPrint.PNG!


> HFilePrettyPrinter tool can not print the last LEAF_INDEX block or 
> BLOOM_CHUNK.
> ---
>
> Key: HBASE-26016
> URL: https://issues.apache.org/jira/browse/HBASE-26016
> Project: HBase
>  Issue Type: Bug
>  Components: regionserver
>Affects Versions: 2.1.0, 2.3.2, 2.4.4
>Reporter: dingwei2019
>Assignee: dingwei2019
>Priority: Minor
>
> When I use the pretty printer tool to print the block headers, I cannot get 
> the last LEAF_INDEX block or BLOOM_CHUNK. The last output of the tool is below:
>  
>  



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (HBASE-26016) HFilePrettyPrinter tool can not print the last LEAF_INDEX block or BLOOM_CHUNK.

2021-06-18 Thread dingwei2019 (Jira)


 [ 
https://issues.apache.org/jira/browse/HBASE-26016?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

dingwei2019 updated HBASE-26016:

Description: 
When I use the pretty printer tool to print the block headers, I cannot get the 
last LEAF_INDEX block or BLOOM_CHUNK. The last output of the tool is below:

!errPrint.PNG!

  was:
When I use the pretty printer tool to print the block headers, I cannot get the 
last LEAF_INDEX block or BLOOM_CHUNK. The last output of the tool is below:

!errPrint.PNG!

 


> HFilePrettyPrinter tool can not print the last LEAF_INDEX block or 
> BLOOM_CHUNK.
> ---
>
> Key: HBASE-26016
> URL: https://issues.apache.org/jira/browse/HBASE-26016
> Project: HBase
>  Issue Type: Bug
>  Components: regionserver
>Affects Versions: 2.1.0, 2.3.2, 2.4.4
>Reporter: dingwei2019
>Assignee: dingwei2019
>Priority: Minor
>
> When I use the pretty printer tool to print the block headers, I cannot get 
> the last LEAF_INDEX block or BLOOM_CHUNK. The last output of the tool is below:
> !errPrint.PNG!



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (HBASE-26016) HFilePrettyPrinter tool can not print the last LEAF_INDEX block or BLOOM_CHUNK.

2021-06-18 Thread dingwei2019 (Jira)


 [ 
https://issues.apache.org/jira/browse/HBASE-26016?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

dingwei2019 updated HBASE-26016:

Description: 
When I use the pretty printer tool to print the block headers, I cannot get the 
last LEAF_INDEX block or BLOOM_CHUNK. The last output of the tool is below:

!errPrint.PNG!

 

  was:
When I use the pretty printer tool to print the block headers, I cannot get the 
last LEAF_INDEX block or BLOOM_CHUNK. See the attachment named errorPrint.jpg.

 


> HFilePrettyPrinter tool can not print the last LEAF_INDEX block or 
> BLOOM_CHUNK.
> ---
>
> Key: HBASE-26016
> URL: https://issues.apache.org/jira/browse/HBASE-26016
> Project: HBase
>  Issue Type: Bug
>  Components: regionserver
>Affects Versions: 2.1.0, 2.3.2, 2.4.4
>Reporter: dingwei2019
>Assignee: dingwei2019
>Priority: Minor
>
> When I use the pretty printer tool to print the block headers, I cannot get 
> the last LEAF_INDEX block or BLOOM_CHUNK. The last output of the tool is below:
> !errPrint.PNG!
>  



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (HBASE-26016) HFilePrettyPrinter tool can not print the last LEAF_INDEX block or BLOOM_CHUNK.

2021-06-18 Thread dingwei2019 (Jira)
dingwei2019 created HBASE-26016:
---

 Summary: HFilePrettyPrinter tool can not print the last LEAF_INDEX 
block or BLOOM_CHUNK.
 Key: HBASE-26016
 URL: https://issues.apache.org/jira/browse/HBASE-26016
 Project: HBase
  Issue Type: Bug
  Components: regionserver
Affects Versions: 2.4.4, 2.3.2, 2.1.0
Reporter: dingwei2019
Assignee: dingwei2019


When I use the pretty printer tool to print the block headers, I cannot get the 
last LEAF_INDEX block or BLOOM_CHUNK. See the attachment named errorPrint.jpg.

 



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (HBASE-25984) FSHLog WAL lockup with sync future reuse [RS deadlock]

2021-06-18 Thread Hudson (Jira)


[ 
https://issues.apache.org/jira/browse/HBASE-25984?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17365837#comment-17365837
 ] 

Hudson commented on HBASE-25984:


Results for branch branch-2.4
[build #144 on 
builds.a.o|https://ci-hadoop.apache.org/job/HBase/job/HBase%20Nightly/job/branch-2.4/144/]:
 (/) *{color:green}+1 overall{color}*

details (if available):

(/) {color:green}+1 general checks{color}
-- For more information [see general 
report|https://ci-hadoop.apache.org/job/HBase/job/HBase%20Nightly/job/branch-2.4/144/General_20Nightly_20Build_20Report/]




(/) {color:green}+1 jdk8 hadoop2 checks{color}
-- For more information [see jdk8 (hadoop2) 
report|https://ci-hadoop.apache.org/job/HBase/job/HBase%20Nightly/job/branch-2.4/144/JDK8_20Nightly_20Build_20Report_20_28Hadoop2_29/]


(/) {color:green}+1 jdk8 hadoop3 checks{color}
-- For more information [see jdk8 (hadoop3) 
report|https://ci-hadoop.apache.org/job/HBase/job/HBase%20Nightly/job/branch-2.4/144/JDK8_20Nightly_20Build_20Report_20_28Hadoop3_29/]


(/) {color:green}+1 jdk11 hadoop3 checks{color}
-- For more information [see jdk11 
report|https://ci-hadoop.apache.org/job/HBase/job/HBase%20Nightly/job/branch-2.4/144/JDK11_20Nightly_20Build_20Report_20_28Hadoop3_29/]


(/) {color:green}+1 source release artifact{color}
-- See build output for details.


(/) {color:green}+1 client integration test{color}


> FSHLog WAL lockup with sync future reuse [RS deadlock]
> --
>
> Key: HBASE-25984
> URL: https://issues.apache.org/jira/browse/HBASE-25984
> Project: HBase
>  Issue Type: Bug
>  Components: regionserver, wal
>Affects Versions: 3.0.0-alpha-1, 1.7.0, 2.5.0, 2.4.5
>Reporter: Bharath Vissapragada
>Assignee: Bharath Vissapragada
>Priority: Critical
>  Labels: deadlock, hang
> Fix For: 3.0.0-alpha-1, 2.5.0, 2.3.6, 1.7.1, 2.4.5
>
> Attachments: HBASE-25984-unit-test.patch
>
>
> We use FSHLog as the WAL implementation (branch-1 based) and under heavy load 
> we noticed the WAL system gets locked up due to a subtle bug involving racy 
> code with sync future reuse. This bug applies to all FSHLog implementations 
> across branches.
> Symptoms:
> On heavily loaded clusters with large write load we noticed that the region 
> servers are hanging abruptly with filled up handler queues and stuck MVCC 
> indicating appends/syncs not making any progress.
> {noformat}
>  WARN  [8,queue=9,port=60020] regionserver.MultiVersionConcurrencyControl - 
> STUCK for : 296000 millis. 
> MultiVersionConcurrencyControl{readPoint=172383686, writePoint=172383690, 
> regionName=1ce4003ab60120057734ffe367667dca}
>  WARN  [6,queue=2,port=60020] regionserver.MultiVersionConcurrencyControl - 
> STUCK for : 296000 millis. 
> MultiVersionConcurrencyControl{readPoint=171504376, writePoint=171504381, 
> regionName=7c441d7243f9f504194dae6bf2622631}
> {noformat}
> All the handlers are stuck waiting for the sync futures and timing out.
> {noformat}
>  java.lang.Object.wait(Native Method)
> 
> org.apache.hadoop.hbase.regionserver.wal.SyncFuture.get(SyncFuture.java:183)
> 
> org.apache.hadoop.hbase.regionserver.wal.FSHLog.blockOnSync(FSHLog.java:1509)
> .
> {noformat}
> Log rolling is stuck because it was unable to attain a safe point
> {noformat}
>java.util.concurrent.CountDownLatch.await(CountDownLatch.java:277) 
> org.apache.hadoop.hbase.regionserver.wal.FSHLog$SafePointZigZagLatch.waitSafePoint(FSHLog.java:1799)
>  
> org.apache.hadoop.hbase.regionserver.wal.FSHLog.replaceWriter(FSHLog.java:900)
> {noformat}
> and the Ring buffer consumer thinks that there are some outstanding syncs 
> that need to finish..
> {noformat}
>   
> org.apache.hadoop.hbase.regionserver.wal.FSHLog$RingBufferEventHandler.attainSafePoint(FSHLog.java:2031)
> 
> org.apache.hadoop.hbase.regionserver.wal.FSHLog$RingBufferEventHandler.onEvent(FSHLog.java:1999)
> 
> org.apache.hadoop.hbase.regionserver.wal.FSHLog$RingBufferEventHandler.onEvent(FSHLog.java:1857)
> {noformat}
> On the other hand, SyncRunner threads are idle and just waiting for work 
> implying that there are no pending SyncFutures that need to be run
> {noformat}
>sun.misc.Unsafe.park(Native Method)
> java.util.concurrent.locks.LockSupport.park(LockSupport.java:175)
> 
> java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.await(AbstractQueuedSynchronizer.java:2039)
> 
> java.util.concurrent.LinkedBlockingQueue.take(LinkedBlockingQueue.java:442)
> 
> org.apache.hadoop.hbase.regionserver.wal.FSHLog$SyncRunner.run(FSHLog.java:1297)
> java.lang.Thread.run(Thread.java:748)
> {noformat}
> Overall, the WAL system is deadlocked and could make no progress until it was 
> aborted. I got to the bottom of this issue and have a patch 

[jira] [Assigned] (HBASE-26015) Should implement getRegionServers(boolean) method in AsyncAdmin

2021-06-18 Thread Zhuoyue Huang (Jira)


 [ 
https://issues.apache.org/jira/browse/HBASE-26015?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zhuoyue Huang reassigned HBASE-26015:
-

Assignee: Zhuoyue Huang

> Should implement getRegionServers(boolean) method in AsyncAdmin
> ---
>
> Key: HBASE-26015
> URL: https://issues.apache.org/jira/browse/HBASE-26015
> Project: HBase
>  Issue Type: Task
>  Components: Admin, Client
>Reporter: Duo Zhang
>Assignee: Zhuoyue Huang
>Priority: Major
>
> We have this method in Admin but not in AsyncAdmin; we should align these two 
> interfaces.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (HBASE-25998) Revisit synchronization in SyncFuture

2021-06-18 Thread Bharath Vissapragada (Jira)


 [ 
https://issues.apache.org/jira/browse/HBASE-25998?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bharath Vissapragada updated HBASE-25998:
-
Fix Version/s: 2.4.5
   2.3.6
   2.5.0
   3.0.0-alpha-1
   Resolution: Fixed
   Status: Resolved  (was: Patch Available)

> Revisit synchronization in SyncFuture
> -
>
> Key: HBASE-25998
> URL: https://issues.apache.org/jira/browse/HBASE-25998
> Project: HBase
>  Issue Type: Improvement
>  Components: Performance, regionserver, wal
>Affects Versions: 3.0.0-alpha-1, 1.7.0, 2.5.0
>Reporter: Bharath Vissapragada
>Assignee: Bharath Vissapragada
>Priority: Major
> Fix For: 3.0.0-alpha-1, 2.5.0, 2.3.6, 2.4.5
>
> Attachments: monitor-overhead-1.png, monitor-overhead-2.png
>
>
> While working on HBASE-25984, I noticed some weird frames in the flame graphs 
> around monitor entry/exit consuming a lot of CPU cycles (see attached 
> images). I noticed that the synchronization there is too coarse-grained and 
> sometimes unnecessary. A simple patch that switched to reentrant-lock-based 
> synchronization with a condition variable, rather than a busy wait, showed a 
> 70-80% throughput increase in WAL PE. Seems too good to be true... (more 
> details in the comments).
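
A minimal sketch of the pattern described above, for illustration only (the DoneLatch class and its methods are made up for this example and are not the actual SyncFuture code): waiters park on a condition variable guarded by a ReentrantLock instead of busy-waiting on a synchronized monitor.

{code:java}
import java.util.concurrent.locks.Condition;
import java.util.concurrent.locks.ReentrantLock;

// Illustrative sketch only: replace coarse synchronized busy-waiting with a
// ReentrantLock plus a Condition, so the waiter sleeps until it is signalled.
class DoneLatch {
  private final ReentrantLock lock = new ReentrantLock();
  private final Condition doneCond = lock.newCondition();
  private boolean done;

  void markDone() {
    lock.lock();
    try {
      done = true;
      doneCond.signalAll(); // wake every thread blocked in awaitDone()
    } finally {
      lock.unlock();
    }
  }

  void awaitDone() throws InterruptedException {
    lock.lock();
    try {
      while (!done) {       // loop guards against spurious wakeups
        doneCond.await();
      }
    } finally {
      lock.unlock();
    }
  }
}
{code}

The difference from a synchronized busy-wait is that awaitDone() blocks on the condition until markDone() signals it, instead of repeatedly re-acquiring the monitor to re-check state.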



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (HBASE-25984) FSHLog WAL lockup with sync future reuse [RS deadlock]

2021-06-18 Thread Bharath Vissapragada (Jira)


 [ 
https://issues.apache.org/jira/browse/HBASE-25984?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bharath Vissapragada updated HBASE-25984:
-
Resolution: Fixed
Status: Resolved  (was: Patch Available)

> FSHLog WAL lockup with sync future reuse [RS deadlock]
> --
>
> Key: HBASE-25984
> URL: https://issues.apache.org/jira/browse/HBASE-25984
> Project: HBase
>  Issue Type: Bug
>  Components: regionserver, wal
>Affects Versions: 3.0.0-alpha-1, 1.7.0, 2.5.0, 2.4.5
>Reporter: Bharath Vissapragada
>Assignee: Bharath Vissapragada
>Priority: Critical
>  Labels: deadlock, hang
> Fix For: 3.0.0-alpha-1, 2.5.0, 2.3.6, 1.7.1, 2.4.5
>
> Attachments: HBASE-25984-unit-test.patch
>
>
> We use FSHLog as the WAL implementation (branch-1 based) and under heavy load 
> we noticed the WAL system gets locked up due to a subtle bug involving racy 
> code with sync future reuse. This bug applies to all FSHLog implementations 
> across branches.
> Symptoms:
> On heavily loaded clusters with large write load we noticed that the region 
> servers are hanging abruptly with filled up handler queues and stuck MVCC 
> indicating appends/syncs not making any progress.
> {noformat}
>  WARN  [8,queue=9,port=60020] regionserver.MultiVersionConcurrencyControl - 
> STUCK for : 296000 millis. 
> MultiVersionConcurrencyControl{readPoint=172383686, writePoint=172383690, 
> regionName=1ce4003ab60120057734ffe367667dca}
>  WARN  [6,queue=2,port=60020] regionserver.MultiVersionConcurrencyControl - 
> STUCK for : 296000 millis. 
> MultiVersionConcurrencyControl{readPoint=171504376, writePoint=171504381, 
> regionName=7c441d7243f9f504194dae6bf2622631}
> {noformat}
> All the handlers are stuck waiting for the sync futures and timing out.
> {noformat}
>  java.lang.Object.wait(Native Method)
> 
> org.apache.hadoop.hbase.regionserver.wal.SyncFuture.get(SyncFuture.java:183)
> 
> org.apache.hadoop.hbase.regionserver.wal.FSHLog.blockOnSync(FSHLog.java:1509)
> .
> {noformat}
> Log rolling is stuck because it was unable to attain a safe point
> {noformat}
>java.util.concurrent.CountDownLatch.await(CountDownLatch.java:277) 
> org.apache.hadoop.hbase.regionserver.wal.FSHLog$SafePointZigZagLatch.waitSafePoint(FSHLog.java:1799)
>  
> org.apache.hadoop.hbase.regionserver.wal.FSHLog.replaceWriter(FSHLog.java:900)
> {noformat}
> and the Ring buffer consumer thinks that there are some outstanding syncs 
> that need to finish..
> {noformat}
>   
> org.apache.hadoop.hbase.regionserver.wal.FSHLog$RingBufferEventHandler.attainSafePoint(FSHLog.java:2031)
> 
> org.apache.hadoop.hbase.regionserver.wal.FSHLog$RingBufferEventHandler.onEvent(FSHLog.java:1999)
> 
> org.apache.hadoop.hbase.regionserver.wal.FSHLog$RingBufferEventHandler.onEvent(FSHLog.java:1857)
> {noformat}
> On the other hand, SyncRunner threads are idle and just waiting for work 
> implying that there are no pending SyncFutures that need to be run
> {noformat}
>sun.misc.Unsafe.park(Native Method)
> java.util.concurrent.locks.LockSupport.park(LockSupport.java:175)
> 
> java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.await(AbstractQueuedSynchronizer.java:2039)
> 
> java.util.concurrent.LinkedBlockingQueue.take(LinkedBlockingQueue.java:442)
> 
> org.apache.hadoop.hbase.regionserver.wal.FSHLog$SyncRunner.run(FSHLog.java:1297)
> java.lang.Thread.run(Thread.java:748)
> {noformat}
> Overall, the WAL system is deadlocked and could make no progress until it was 
> aborted. I got to the bottom of this issue and have a patch that can fix it 
> (more details in the comments due to the word limit in the description).



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (HBASE-25984) FSHLog WAL lockup with sync future reuse [RS deadlock]

2021-06-18 Thread Bharath Vissapragada (Jira)


 [ 
https://issues.apache.org/jira/browse/HBASE-25984?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bharath Vissapragada updated HBASE-25984:
-
Fix Version/s: 2.4.5
   1.7.1
   2.3.6
   2.5.0
   3.0.0-alpha-1

> FSHLog WAL lockup with sync future reuse [RS deadlock]
> --
>
> Key: HBASE-25984
> URL: https://issues.apache.org/jira/browse/HBASE-25984
> Project: HBase
>  Issue Type: Bug
>  Components: regionserver, wal
>Affects Versions: 3.0.0-alpha-1, 1.7.0, 2.5.0, 2.4.5
>Reporter: Bharath Vissapragada
>Assignee: Bharath Vissapragada
>Priority: Critical
>  Labels: deadlock, hang
> Fix For: 3.0.0-alpha-1, 2.5.0, 2.3.6, 1.7.1, 2.4.5
>
> Attachments: HBASE-25984-unit-test.patch
>
>
> We use FSHLog as the WAL implementation (branch-1 based) and under heavy load 
> we noticed the WAL system gets locked up due to a subtle bug involving racy 
> code with sync future reuse. This bug applies to all FSHLog implementations 
> across branches.
> Symptoms:
> On heavily loaded clusters with large write load we noticed that the region 
> servers are hanging abruptly with filled up handler queues and stuck MVCC 
> indicating appends/syncs not making any progress.
> {noformat}
>  WARN  [8,queue=9,port=60020] regionserver.MultiVersionConcurrencyControl - 
> STUCK for : 296000 millis. 
> MultiVersionConcurrencyControl{readPoint=172383686, writePoint=172383690, 
> regionName=1ce4003ab60120057734ffe367667dca}
>  WARN  [6,queue=2,port=60020] regionserver.MultiVersionConcurrencyControl - 
> STUCK for : 296000 millis. 
> MultiVersionConcurrencyControl{readPoint=171504376, writePoint=171504381, 
> regionName=7c441d7243f9f504194dae6bf2622631}
> {noformat}
> All the handlers are stuck waiting for the sync futures and timing out.
> {noformat}
>  java.lang.Object.wait(Native Method)
> 
> org.apache.hadoop.hbase.regionserver.wal.SyncFuture.get(SyncFuture.java:183)
> 
> org.apache.hadoop.hbase.regionserver.wal.FSHLog.blockOnSync(FSHLog.java:1509)
> .
> {noformat}
> Log rolling is stuck because it was unable to attain a safe point
> {noformat}
>java.util.concurrent.CountDownLatch.await(CountDownLatch.java:277) 
> org.apache.hadoop.hbase.regionserver.wal.FSHLog$SafePointZigZagLatch.waitSafePoint(FSHLog.java:1799)
>  
> org.apache.hadoop.hbase.regionserver.wal.FSHLog.replaceWriter(FSHLog.java:900)
> {noformat}
> and the Ring buffer consumer thinks that there are some outstanding syncs 
> that need to finish..
> {noformat}
>   
> org.apache.hadoop.hbase.regionserver.wal.FSHLog$RingBufferEventHandler.attainSafePoint(FSHLog.java:2031)
> 
> org.apache.hadoop.hbase.regionserver.wal.FSHLog$RingBufferEventHandler.onEvent(FSHLog.java:1999)
> 
> org.apache.hadoop.hbase.regionserver.wal.FSHLog$RingBufferEventHandler.onEvent(FSHLog.java:1857)
> {noformat}
> On the other hand, SyncRunner threads are idle and just waiting for work 
> implying that there are no pending SyncFutures that need to be run
> {noformat}
>sun.misc.Unsafe.park(Native Method)
> java.util.concurrent.locks.LockSupport.park(LockSupport.java:175)
> 
> java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.await(AbstractQueuedSynchronizer.java:2039)
> 
> java.util.concurrent.LinkedBlockingQueue.take(LinkedBlockingQueue.java:442)
> 
> org.apache.hadoop.hbase.regionserver.wal.FSHLog$SyncRunner.run(FSHLog.java:1297)
> java.lang.Thread.run(Thread.java:748)
> {noformat}
> Overall, the WAL system is deadlocked and could make no progress until it was 
> aborted. I got to the bottom of this issue and have a patch that can fix it 
> (more details in the comments due to the word limit in the description).



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[GitHub] [hbase] bharathv merged pull request #3398: HBASE-25984: Avoid premature reuse of sync futures in FSHLog (#3371)

2021-06-18 Thread GitBox


bharathv merged pull request #3398:
URL: https://github.com/apache/hbase/pull/3398


   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [hbase] Apache-HBase commented on pull request #3398: HBASE-25984: Avoid premature reuse of sync futures in FSHLog (#3371)

2021-06-18 Thread GitBox


Apache-HBase commented on pull request #3398:
URL: https://github.com/apache/hbase/pull/3398#issuecomment-864327048


   :confetti_ball: **+1 overall**
   
   
   
   
   
   
   | Vote | Subsystem | Runtime | Comment |
   |::|--:|:|:|
   | +0 :ok: |  reexec  |   0m 32s |  Docker mode activated.  |
   ||| _ Prechecks _ |
   | +1 :green_heart: |  dupname  |   0m  0s |  No case conflicting files 
found.  |
   | +1 :green_heart: |  hbaseanti  |   0m  0s |  Patch does not have any 
anti-patterns.  |
   | +1 :green_heart: |  @author  |   0m  0s |  The patch does not contain any 
@author tags.  |
   | +1 :green_heart: |  test4tests  |   0m  0s |  The patch appears to include 
2 new or modified test files.  |
   ||| _ branch-1 Compile Tests _ |
   | +1 :green_heart: |  mvninstall  |   9m 50s |  branch-1 passed  |
   | +1 :green_heart: |  compile  |   0m 40s |  branch-1 passed with JDK Azul 
Systems, Inc.-1.8.0_262-b19  |
   | +1 :green_heart: |  compile  |   0m 43s |  branch-1 passed with JDK Azul 
Systems, Inc.-1.7.0_272-b10  |
   | +1 :green_heart: |  checkstyle  |   1m 43s |  branch-1 passed  |
   | +1 :green_heart: |  shadedjars  |   3m  5s |  branch has no errors when 
building our shaded downstream artifacts.  |
   | +1 :green_heart: |  javadoc  |   0m 48s |  branch-1 passed with JDK Azul 
Systems, Inc.-1.8.0_262-b19  |
   | +1 :green_heart: |  javadoc  |   0m 41s |  branch-1 passed with JDK Azul 
Systems, Inc.-1.7.0_272-b10  |
   | +0 :ok: |  spotbugs  |   3m  1s |  Used deprecated FindBugs config; 
considering switching to SpotBugs.  |
   | +1 :green_heart: |  findbugs  |   2m 58s |  branch-1 passed  |
   ||| _ Patch Compile Tests _ |
   | +1 :green_heart: |  mvninstall  |   1m 53s |  the patch passed  |
   | +1 :green_heart: |  compile  |   0m 41s |  the patch passed with JDK Azul 
Systems, Inc.-1.8.0_262-b19  |
   | +1 :green_heart: |  javac  |   0m 41s |  the patch passed  |
   | +1 :green_heart: |  compile  |   0m 44s |  the patch passed with JDK Azul 
Systems, Inc.-1.7.0_272-b10  |
   | +1 :green_heart: |  javac  |   0m 44s |  the patch passed  |
   | +1 :green_heart: |  checkstyle  |   1m 32s |  the patch passed  |
   | +1 :green_heart: |  whitespace  |   0m  0s |  The patch has no whitespace 
issues.  |
   | +1 :green_heart: |  shadedjars  |   2m 49s |  patch has no errors when 
building our shaded downstream artifacts.  |
   | +1 :green_heart: |  hadoopcheck  |   4m 29s |  Patch does not cause any 
errors with Hadoop 2.8.5 2.9.2.  |
   | +1 :green_heart: |  javadoc  |   0m 31s |  the patch passed with JDK Azul 
Systems, Inc.-1.8.0_262-b19  |
   | +1 :green_heart: |  javadoc  |   0m 41s |  the patch passed with JDK Azul 
Systems, Inc.-1.7.0_272-b10  |
   | +1 :green_heart: |  findbugs  |   2m 52s |  the patch passed  |
   ||| _ Other Tests _ |
   | +1 :green_heart: |  unit  | 141m 52s |  hbase-server in the patch passed.  
|
   | +1 :green_heart: |  asflicense  |   0m 39s |  The patch does not generate 
ASF License warnings.  |
   |  |   | 183m  5s |   |
   
   
   | Subsystem | Report/Notes |
   |--:|:-|
   | Docker | ClientAPI=1.41 ServerAPI=1.41 base: 
https://ci-hadoop.apache.org/job/HBase/job/HBase-PreCommit-GitHub-PR/job/PR-3398/3/artifact/out/Dockerfile
 |
   | GITHUB PR | https://github.com/apache/hbase/pull/3398 |
   | JIRA Issue | HBASE-25984 |
   | Optional Tests | dupname asflicense javac javadoc unit spotbugs findbugs 
shadedjars hadoopcheck hbaseanti checkstyle compile |
   | uname | Linux 68f3d4528118 4.15.0-136-generic #140-Ubuntu SMP Thu Jan 28 
05:20:47 UTC 2021 x86_64 x86_64 x86_64 GNU/Linux |
   | Build tool | maven |
   | Personality | 
/home/jenkins/jenkins-agent/workspace/Base-PreCommit-GitHub-PR_PR-3398/out/precommit/personality/provided.sh
 |
   | git revision | branch-1 / a40f458 |
   | Default Java | Azul Systems, Inc.-1.7.0_272-b10 |
   | Multi-JDK versions | /usr/lib/jvm/zulu-8-amd64:Azul Systems, 
Inc.-1.8.0_262-b19 /usr/lib/jvm/zulu-7-amd64:Azul Systems, Inc.-1.7.0_272-b10 |
   |  Test Results | 
https://ci-hadoop.apache.org/job/HBase/job/HBase-PreCommit-GitHub-PR/job/PR-3398/3/testReport/
 |
   | Max. process+thread count | 4305 (vs. ulimit of 1) |
   | modules | C: hbase-server U: hbase-server |
   | Console output | 
https://ci-hadoop.apache.org/job/HBase/job/HBase-PreCommit-GitHub-PR/job/PR-3398/3/console
 |
   | versions | git=1.9.1 maven=3.0.5 findbugs=3.0.1 |
   | Powered by | Apache Yetus 0.12.0 https://yetus.apache.org |
   
   
   This message was automatically generated.
   
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [hbase] Apache-HBase commented on pull request #3398: HBASE-25984: Avoid premature reuse of sync futures in FSHLog (#3371)

2021-06-18 Thread GitBox


Apache-HBase commented on pull request #3398:
URL: https://github.com/apache/hbase/pull/3398#issuecomment-863526314






-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [hbase] Apache9 commented on a change in pull request #3397: HBASE-26012 Improve logging and dequeue logic in DelayQueue

2021-06-18 Thread GitBox


Apache9 commented on a change in pull request #3397:
URL: https://github.com/apache/hbase/pull/3397#discussion_r654542588



##
File path: 
hbase-procedure/src/main/java/org/apache/hadoop/hbase/procedure2/util/DelayedUtil.java
##
@@ -79,7 +84,13 @@ public String toString() {
*/
   public static <E extends Delayed> E takeWithoutInterrupt(final DelayQueue<E> queue) {
 try {
-  return queue.take();
+  E element = queue.poll(10, TimeUnit.SECONDS);
+  if (element == null && queue.size() > 0) {
+LOG.error("DelayQueue is not empty when timed waiting elapsed. If this 
is repeated for"

Review comment:
   This may be too aggressive? Why choose 10 seconds as the timeout here?
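
As background for that question, here is a small standalone illustration with the plain JDK DelayQueue (not HBase code; the Task class is invented for the example): a timed poll() returns null whenever nothing has expired inside the window, even though the queue is non-empty, so the proposed log line can fire even when elements are simply not yet due.

{code:java}
import java.util.concurrent.DelayQueue;
import java.util.concurrent.Delayed;
import java.util.concurrent.TimeUnit;

// Illustration only: a timed poll() on a DelayQueue can return null while the
// queue still holds elements, because none of them has become due yet.
public class DelayPollDemo {
  static final class Task implements Delayed {
    final long dueNanos;
    Task(long delayMs) {
      this.dueNanos = System.nanoTime() + TimeUnit.MILLISECONDS.toNanos(delayMs);
    }
    @Override
    public long getDelay(TimeUnit unit) {
      return unit.convert(dueNanos - System.nanoTime(), TimeUnit.NANOSECONDS);
    }
    @Override
    public int compareTo(Delayed other) {
      return Long.compare(getDelay(TimeUnit.NANOSECONDS), other.getDelay(TimeUnit.NANOSECONDS));
    }
  }

  public static void main(String[] args) throws InterruptedException {
    DelayQueue<Task> queue = new DelayQueue<>();
    queue.add(new Task(30_000));                       // due 30 seconds from now
    Task t = queue.poll(1, TimeUnit.SECONDS);          // times out after one second
    System.out.println(t == null && !queue.isEmpty()); // prints "true"
  }
}
{code}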




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [hbase] bharathv merged pull request #3400: HBASE-25998: Redo synchronization in SyncFuture

2021-06-18 Thread GitBox


bharathv merged pull request #3400:
URL: https://github.com/apache/hbase/pull/3400


   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [hbase] Apache-HBase commented on pull request #3397: HBASE-26012 Improve logging and dequeue logic in DelayQueue

2021-06-18 Thread GitBox


Apache-HBase commented on pull request #3397:
URL: https://github.com/apache/hbase/pull/3397#issuecomment-863442939






-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [hbase] tomscut removed a comment on pull request #3325: HBASE-25934 Add username for RegionScannerHolder

2021-06-18 Thread GitBox


tomscut removed a comment on pull request #3325:
URL: https://github.com/apache/hbase/pull/3325#issuecomment-856481326


   Hi @saintstack, could you please take a look and merge the code? Thank you.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [hbase] bharathv opened a new pull request #3401: HBASE-25998: Redo synchronization in SyncFuture

2021-06-18 Thread GitBox


bharathv opened a new pull request #3401:
URL: https://github.com/apache/hbase/pull/3401


   Currently uses a coarse-grained synchronized approach that seems to
   create a lot of contention. This patch:
   
   - Uses a reentrant lock instead of synchronized monitor
   - Switches to a condition variable based waiting rather than busy wait
   - Removed synchronization for unnecessary fields
   
   Signed-off-by: Michael Stack 
   Signed-off-by: Andrew Purtell 
   Signed-off-by: Duo Zhang 
   Signed-off-by: Viraj Jasani 
   (cherry picked from commit 6bafb596421974717697b28d0856453245759c15)


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [hbase] bharathv merged pull request #3394: HBASE-25984: Avoid premature reuse of sync futures in FSHLog (#3371)

2021-06-18 Thread GitBox


bharathv merged pull request #3394:
URL: https://github.com/apache/hbase/pull/3394


   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [hbase] vli02 opened a new pull request #3402: HBASE-25130 - Fix master in-memory server holding map after:

2021-06-18 Thread GitBox


vli02 opened a new pull request #3402:
URL: https://github.com/apache/hbase/pull/3402


   hbck fixes and offlines some regions.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [hbase] bharathv merged pull request #3382: HBASE-25998: Redo synchronization in SyncFuture

2021-06-18 Thread GitBox


bharathv merged pull request #3382:
URL: https://github.com/apache/hbase/pull/3382


   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [hbase] Apache-HBase commented on pull request #3401: HBASE-25998: Redo synchronization in SyncFuture

2021-06-18 Thread GitBox


Apache-HBase commented on pull request #3401:
URL: https://github.com/apache/hbase/pull/3401#issuecomment-863548708






-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [hbase] Apache-HBase commented on pull request #3385: HBASE-26001 When turn on access control, the cell level TTL of Increment and Append operations is invalid

2021-06-18 Thread GitBox


Apache-HBase commented on pull request #3385:
URL: https://github.com/apache/hbase/pull/3385#issuecomment-863451788






-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [hbase] Apache9 merged pull request #3388: HBASE-26005 Update ref guide about the EOL for 2.2.x

2021-06-18 Thread GitBox


Apache9 merged pull request #3388:
URL: https://github.com/apache/hbase/pull/3388


   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [hbase] bharathv opened a new pull request #3398: HBASE-25984: Avoid premature reuse of sync futures in FSHLog (#3371)

2021-06-18 Thread GitBox


bharathv opened a new pull request #3398:
URL: https://github.com/apache/hbase/pull/3398


   Signed-off-by: Viraj Jasani vjas...@apache.org
   (cherry picked from commit 5a19bcf)


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [hbase] Apache-HBase commented on pull request #3394: HBASE-25984: Avoid premature reuse of sync futures in FSHLog (#3371)

2021-06-18 Thread GitBox


Apache-HBase commented on pull request #3394:
URL: https://github.com/apache/hbase/pull/3394#issuecomment-863507742






-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [hbase] virajjasani opened a new pull request #3397: HBASE-26012 Improve logging and dequeue logic in DelayQueue

2021-06-18 Thread GitBox


virajjasani opened a new pull request #3397:
URL: https://github.com/apache/hbase/pull/3397


   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [hbase] Apache9 commented on pull request #3388: HBASE-26005 Update ref guide about the EOL for 2.2.x

2021-06-18 Thread GitBox


Apache9 commented on pull request #3388:
URL: https://github.com/apache/hbase/pull/3388#issuecomment-864139944


   Oh, it seems something is wrong with the Jenkins website; the table in the 
original ref guide is also empty. Let me merge.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [hbase] Reidddddd commented on pull request #3387: HBASE-26004: port HBASE-26001 (cell level tags invisible in atomic operations when access control is on)to branch-1

2021-06-18 Thread GitBox


Reidddddd commented on pull request #3387:
URL: https://github.com/apache/hbase/pull/3387#issuecomment-863986392


   Please fix the findbugs and checkstyle warnings @YutSean, thx


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [hbase] Apache-HBase commented on pull request #3215: HBASE-25698 Fixing IllegalReferenceCountException when using TinyLfuBlockCache

2021-06-18 Thread GitBox


Apache-HBase commented on pull request #3215:
URL: https://github.com/apache/hbase/pull/3215#issuecomment-863959397






-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [hbase] Apache-HBase commented on pull request #3392: HBASE-25984: Avoid premature reuse of sync futures in FSHLog (#3371)

2021-06-18 Thread GitBox


Apache-HBase commented on pull request #3392:
URL: https://github.com/apache/hbase/pull/3392#issuecomment-863506222






-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [hbase] YutSean opened a new pull request #3403: HBASE-26001 When turn on access control, the cell level TTL of Increment and Append operations is invalid

2021-06-18 Thread GitBox


YutSean opened a new pull request #3403:
URL: https://github.com/apache/hbase/pull/3403


   https://issues.apache.org/jira/browse/HBASE-26001


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [hbase] virajjasani commented on a change in pull request #3397: HBASE-26012 Improve logging and dequeue logic in DelayQueue

2021-06-18 Thread GitBox


virajjasani commented on a change in pull request #3397:
URL: https://github.com/apache/hbase/pull/3397#discussion_r654562565



##
File path: 
hbase-procedure/src/main/java/org/apache/hadoop/hbase/procedure2/util/DelayedUtil.java
##
@@ -79,7 +84,13 @@ public String toString() {
*/
   public static  E takeWithoutInterrupt(final DelayQueue 
queue) {
 try {
-  return queue.take();
+  E element = queue.poll(10, TimeUnit.SECONDS);
+  if (element == null && queue.size() > 0) {
+LOG.error("DelayQueue is not empty when timed waiting elapsed. If this 
is repeated for"

Review comment:
   Since the default value of 
`hbase.procedure.remote.dispatcher.delay.msec` is just 150, I thought 10s might 
be enough. But I am open to keeping it higher. What do you think would be a 
better value?

##
File path: 
hbase-procedure/src/main/java/org/apache/hadoop/hbase/procedure2/util/DelayedUtil.java
##
@@ -79,7 +84,13 @@ public String toString() {
*/
   public static  E takeWithoutInterrupt(final DelayQueue 
queue) {
 try {
-  return queue.take();
+  E element = queue.poll(10, TimeUnit.SECONDS);
+  if (element == null && queue.size() > 0) {
+LOG.error("DelayQueue is not empty when timed waiting elapsed. If this 
is repeated for"

Review comment:
   Since the default value of 
`hbase.procedure.remote.dispatcher.delay.msec` is just 150, I thought 10s might 
be enough. But I am open to keeping it higher. What do you think would be a 
better value? Maybe 25/30s or 60s?

##
File path: 
hbase-procedure/src/main/java/org/apache/hadoop/hbase/procedure2/util/DelayedUtil.java
##
@@ -79,7 +84,13 @@ public String toString() {
*/
   public static  E takeWithoutInterrupt(final DelayQueue 
queue) {
 try {
-  return queue.take();
+  E element = queue.poll(10, TimeUnit.SECONDS);
+  if (element == null && queue.size() > 0) {
+LOG.error("DelayQueue is not empty when timed waiting elapsed. If this 
is repeated for"

Review comment:
   Since the default value of 
`hbase.procedure.remote.dispatcher.delay.msec` is just 150, I thought 10s might 
be enough. But I am open to keeping it higher. What do you think would be a 
better value? Maybe 25/30s or 60s?
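
   As a side note for readers, here is a minimal, self-contained sketch of the 
   timed-poll-plus-logging pattern discussed here (illustrative names only, not 
   the actual patch; the real `takeWithoutInterrupt` also swallows 
   `InterruptedException` instead of propagating it):
   ```java
   import java.util.concurrent.DelayQueue;
   import java.util.concurrent.Delayed;
   import java.util.concurrent.TimeUnit;
   import org.slf4j.Logger;
   import org.slf4j.LoggerFactory;

   public final class TimedDequeueSketch {
     private static final Logger LOG = LoggerFactory.getLogger(TimedDequeueSketch.class);

     // Poll with a timeout instead of blocking forever; if the timeout elapses
     // while the queue is still non-empty, log it so a stuck dequeue is visible.
     public static <E extends Delayed> E pollWithTimeout(DelayQueue<E> queue,
         long timeoutSeconds) throws InterruptedException {
       E element = queue.poll(timeoutSeconds, TimeUnit.SECONDS);
       if (element == null && !queue.isEmpty()) {
         LOG.error("DelayQueue is not empty after waiting {} seconds; dequeue may be stuck",
           timeoutSeconds);
       }
       return element;
     }
   }
   ```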




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [hbase] Apache-HBase commented on pull request #3403: HBASE-26001 When turn on access control, the cell level TTL of Increment and Append operations is invalid

2021-06-18 Thread GitBox


Apache-HBase commented on pull request #3403:
URL: https://github.com/apache/hbase/pull/3403#issuecomment-863825776






-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [hbase] Apache-HBase commented on pull request #3400: HBASE-25998: Redo synchronization in SyncFuture

2021-06-18 Thread GitBox


Apache-HBase commented on pull request #3400:
URL: https://github.com/apache/hbase/pull/3400#issuecomment-863553502






-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [hbase] rda3mon commented on pull request #3359: HBASE-25891 remove dependence storing wal filenames for backup

2021-06-18 Thread GitBox


rda3mon commented on pull request #3359:
URL: https://github.com/apache/hbase/pull/3359#issuecomment-863812182


   @Apache9 Can you review this? 


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [hbase] bharathv merged pull request #3401: HBASE-25998: Redo synchronization in SyncFuture

2021-06-18 Thread GitBox


bharathv merged pull request #3401:
URL: https://github.com/apache/hbase/pull/3401


   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [hbase] bharathv merged pull request #3393: HBASE-25984: Avoid premature reuse of sync futures in FSHLog (#3371)

2021-06-18 Thread GitBox


bharathv merged pull request #3393:
URL: https://github.com/apache/hbase/pull/3393


   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [hbase] Apache-HBase commented on pull request #3402: HBASE-25130 - Fix master in-memory server holding map after:

2021-06-18 Thread GitBox


Apache-HBase commented on pull request #3402:
URL: https://github.com/apache/hbase/pull/3402#issuecomment-863654404






-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [hbase] Apache-HBase commented on pull request #3387: HBASE-26004: port HBASE-26001 (cell level tags invisible in atomic operations when access control is on)to branch-1

2021-06-18 Thread GitBox


Apache-HBase commented on pull request #3387:
URL: https://github.com/apache/hbase/pull/3387#issuecomment-863981047






-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [hbase] Apache-HBase commented on pull request #3360: HBASE-25975 Row Commit Sequencer

2021-06-18 Thread GitBox


Apache-HBase commented on pull request #3360:
URL: https://github.com/apache/hbase/pull/3360#issuecomment-863632737






-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [hbase] virajjasani commented on a change in pull request #3215: HBASE-25698 Fixing IllegalReferenceCountException when using TinyLfuBlockCache

2021-06-18 Thread GitBox


virajjasani commented on a change in pull request #3215:
URL: https://github.com/apache/hbase/pull/3215#discussion_r654285359



##
File path: 
hbase-server/src/main/java/org/apache/hadoop/hbase/io/hfile/TinyLfuBlockCache.java
##
@@ -171,8 +177,10 @@ public Cacheable getBlock(BlockCacheKey cacheKey,
 if ((value != null) && caching) {
   if ((value instanceof HFileBlock) && ((HFileBlock) 
value).isSharedMem()) {
 value = HFileBlock.deepCloneOnHeap((HFileBlock) value);
+cacheBlockUtil(cacheKey, value, true);

Review comment:
   > U can do the deepclone in asReferencedHeapBlock() only based on 
isSharedMem right? retain() call is anyways needed
   
   LRUBlockCache does not perform block.retain() if block is cloned:
   ```
  * 1. if cache the cloned heap block, its refCnt is an totally new one, 
it's easy to handle; 
  * 2. if cache the original heap block, we're sure that it won't be 
tracked in ByteBuffAllocator's
  * reservoir, if both RPC and LRUBlockCache release the block, then it can 
be garbage collected by
  * JVM, so need a retain here.
   ```
   
   ```
 private Cacheable asReferencedHeapBlock(Cacheable buf) {
   if (buf instanceof HFileBlock) {
 HFileBlock blk = ((HFileBlock) buf);
 if (blk.isSharedMem()) {
   return HFileBlock.deepCloneOnHeap(blk);
 }
   }
   // The block will be referenced by this LRUBlockCache, so should 
increase its refCnt here.
   return buf.retain();
 }
   ```

##
File path: 
hbase-server/src/main/java/org/apache/hadoop/hbase/io/hfile/TinyLfuBlockCache.java
##
@@ -171,8 +177,10 @@ public Cacheable getBlock(BlockCacheKey cacheKey,
 if ((value != null) && caching) {
   if ((value instanceof HFileBlock) && ((HFileBlock) 
value).isSharedMem()) {
 value = HFileBlock.deepCloneOnHeap((HFileBlock) value);
+cacheBlockUtil(cacheKey, value, true);

Review comment:
   @anoopsjohn This is simplified now.

##
File path: 
hbase-server/src/main/java/org/apache/hadoop/hbase/io/hfile/TinyLfuBlockCache.java
##
@@ -188,21 +196,58 @@ public void cacheBlock(BlockCacheKey cacheKey, Cacheable 
value, boolean inMemory
 
   @Override
   public void cacheBlock(BlockCacheKey key, Cacheable value) {
+cacheBlockUtil(key, value, false);
+  }
+
+  private void cacheBlockUtil(BlockCacheKey key, Cacheable value, boolean 
deepClonedOnHeap) {

Review comment:
   Done




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [hbase] d-c-manning commented on a change in pull request #3402: HBASE-25130 - Fix master in-memory server holding map after:

2021-06-18 Thread GitBox


d-c-manning commented on a change in pull request #3402:
URL: https://github.com/apache/hbase/pull/3402#discussion_r654022634



##
File path: 
hbase-server/src/main/java/org/apache/hadoop/hbase/master/AssignmentManager.java
##
@@ -1661,6 +1661,17 @@ public void regionOffline(final HRegionInfo regionInfo) {
 regionOffline(regionInfo, null);
   }
 
+  /**
+   * Marks the region as offline. In addition whether removing it from
+   * replicas and master in-memory server holding map.
+   * 
+   * @param regionInfo
+   * @param force

Review comment:
   let's add some descriptive text about why `force` would be used. 
Specifically that no known use case should have to use it except hbck, which 
desires to force a region offline and not have it ever be reopened on another 
server.

##
File path: 
hbase-server/src/test/java/org/apache/hadoop/hbase/util/TestHBaseFsck.java
##
@@ -882,6 +882,80 @@ public void testDupeStartKey() throws Exception {
   assertNoErrors(hbck2);
   assertEquals(0, hbck2.getOverlapGroups(table).size());
   assertEquals(ROWKEYS.length, countRows());
+
+  MiniHBaseCluster cluster = TEST_UTIL.getHBaseCluster();
+  long totalRegions = cluster.countServedRegions();
+
+  // stop a region servers and run fsck again
+  cluster.stopRegionServer(server);
+  cluster.waitForRegionServerToStop(server, 60);
+
+  // wait for all regions to come online.
+  while (cluster.countServedRegions() < totalRegions) {
+try {
+  Thread.sleep(100);
+} catch (InterruptedException e) {}

Review comment:
   does this `InterruptedException` need to be caught? Can't the test 
method throw it?




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [hbase] Reidddddd merged pull request #3385: HBASE-26001 When turn on access control, the cell level TTL of Increment and Append operations is invalid

2021-06-18 Thread GitBox


Reidddddd merged pull request #3385:
URL: https://github.com/apache/hbase/pull/3385


   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [hbase] anoopsjohn commented on a change in pull request #3215: HBASE-25698 Fixing IllegalReferenceCountException when using TinyLfuBlockCache

2021-06-18 Thread GitBox


anoopsjohn commented on a change in pull request #3215:
URL: https://github.com/apache/hbase/pull/3215#discussion_r654194628



##
File path: 
hbase-server/src/main/java/org/apache/hadoop/hbase/io/hfile/TinyLfuBlockCache.java
##
@@ -171,8 +177,10 @@ public Cacheable getBlock(BlockCacheKey cacheKey,
 if ((value != null) && caching) {
   if ((value instanceof HFileBlock) && ((HFileBlock) 
value).isSharedMem()) {
 value = HFileBlock.deepCloneOnHeap((HFileBlock) value);
+cacheBlockUtil(cacheKey, value, true);

Review comment:
   Pls refer the code in LRUBlockCache.  U can do the deepclone in 
asReferencedHeapBlock() only based on isSharedMem right?  retain() call is 
anyways needed




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [hbase] YutSean opened a new pull request #3404: HBASE-26013 Get operations readRows metrics becomes zero after HBASE-25677

2021-06-18 Thread GitBox


YutSean opened a new pull request #3404:
URL: https://github.com/apache/hbase/pull/3404


   https://issues.apache.org/jira/browse/HBASE-26013


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [hbase] Apache-HBase commented on pull request #3399: HBASE-25998: Redo synchronization in SyncFuture

2021-06-18 Thread GitBox


Apache-HBase commented on pull request #3399:
URL: https://github.com/apache/hbase/pull/3399#issuecomment-863541666






-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [hbase] Apache-HBase commented on pull request #3404: HBASE-26013 Get operations readRows metrics becomes zero after HBASE-25677

2021-06-18 Thread GitBox


Apache-HBase commented on pull request #3404:
URL: https://github.com/apache/hbase/pull/3404#issuecomment-863935433






-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [hbase] vli02 commented on a change in pull request #3402: HBASE-25130 - Fix master in-memory server holding map after:

2021-06-18 Thread GitBox


vli02 commented on a change in pull request #3402:
URL: https://github.com/apache/hbase/pull/3402#discussion_r654037619



##
File path: 
hbase-server/src/main/java/org/apache/hadoop/hbase/master/AssignmentManager.java
##
@@ -1661,6 +1661,17 @@ public void regionOffline(final HRegionInfo regionInfo) {
 regionOffline(regionInfo, null);
   }
 
+  /**
+   * Marks the region as offline. In addition whether removing it from
+   * replicas and master in-memory server holding map.
+   * 
+   * @param regionInfo
+   * @param force

Review comment:
   updated. thanks!

##
File path: 
hbase-server/src/test/java/org/apache/hadoop/hbase/util/TestHBaseFsck.java
##
@@ -882,6 +882,80 @@ public void testDupeStartKey() throws Exception {
   assertNoErrors(hbck2);
   assertEquals(0, hbck2.getOverlapGroups(table).size());
   assertEquals(ROWKEYS.length, countRows());
+
+  MiniHBaseCluster cluster = TEST_UTIL.getHBaseCluster();
+  long totalRegions = cluster.countServedRegions();
+
+  // stop a region servers and run fsck again
+  cluster.stopRegionServer(server);
+  cluster.waitForRegionServerToStop(server, 60);
+
+  // wait for all regions to come online.
+  while (cluster.countServedRegions() < totalRegions) {
+try {
+  Thread.sleep(100);
+} catch (InterruptedException e) {}

Review comment:
   Sleep can be waken up by interruption before time is up, that should be 
normal, I want it not to break the loop until we finished waiting for all 
regions to come up.
   Thread.sleep() method is defined this way: 
https://www.tutorialspoint.com/java/lang/thread_sleep_millis.htm
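
   As an aside, a minimal sketch of an interruption-tolerant wait loop along 
   the lines discussed here (the helper name and signature are assumptions, not 
   the test code; restoring the interrupt status at the end is one common 
   refinement):
   ```java
   import java.util.function.LongSupplier;

   public final class WaitUntilSketch {
     // Keep waiting until the observed count reaches the expected value. An
     // interrupt wakes the sleep early but does not end the wait; the interrupt
     // status is restored afterwards so callers can still observe it.
     public static void waitUntilAtLeast(LongSupplier current, long expected) {
       boolean interrupted = false;
       try {
         while (current.getAsLong() < expected) {
           try {
             Thread.sleep(100);
           } catch (InterruptedException e) {
             interrupted = true; // remember the interrupt and keep waiting
           }
         }
       } finally {
         if (interrupted) {
           Thread.currentThread().interrupt();
         }
       }
     }
   }
   ```
   In the test above this could be used as 
   `waitUntilAtLeast(cluster::countServedRegions, totalRegions)`, assuming 
   `countServedRegions()` returns a long.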




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [hbase] Apache-HBase commented on pull request #2941: HBASE-21674:Port HBASE-21652 (Refactor ThriftServer making thrift2 server inherited from thrift1 server) to branch-1

2021-06-18 Thread GitBox


Apache-HBase commented on pull request #2941:
URL: https://github.com/apache/hbase/pull/2941#issuecomment-864010643


   :broken_heart: **-1 overall**
   
   
   
   
   
   
   | Vote | Subsystem | Runtime | Comment |
   |::|--:|:|:|
   | +0 :ok: |  reexec  |   1m 21s |  Docker mode activated.  |
   ||| _ Prechecks _ |
   | +1 :green_heart: |  dupname  |   0m  2s |  No case conflicting files 
found.  |
   | +0 :ok: |  jshint  |   0m  0s |  jshint was not available.  |
   | +1 :green_heart: |  hbaseanti  |   0m  0s |  Patch does not have any 
anti-patterns.  |
   | +1 :green_heart: |  @author  |   0m  0s |  The patch does not contain any 
@author tags.  |
   | +1 :green_heart: |  test4tests  |   0m  0s |  The patch appears to include 
9 new or modified test files.  |
   ||| _ branch-1 Compile Tests _ |
   | +0 :ok: |  mvndep  |   2m 25s |  Maven dependency ordering for branch  |
   | +1 :green_heart: |  mvninstall  |   8m  8s |  branch-1 passed  |
   | +1 :green_heart: |  compile  |   0m 49s |  branch-1 passed with JDK Azul 
Systems, Inc.-1.8.0_262-b19  |
   | +1 :green_heart: |  compile  |   0m 55s |  branch-1 passed with JDK Azul 
Systems, Inc.-1.7.0_272-b10  |
   | +1 :green_heart: |  checkstyle  |   1m 10s |  branch-1 passed  |
   | -1 :x: |  shadedjars  |   0m 18s |  branch has 7 errors when building our 
shaded downstream artifacts.  |
   | +1 :green_heart: |  javadoc  |   0m 56s |  branch-1 passed with JDK Azul 
Systems, Inc.-1.8.0_262-b19  |
   | +1 :green_heart: |  javadoc  |   2m 13s |  branch-1 passed with JDK Azul 
Systems, Inc.-1.7.0_272-b10  |
   | +0 :ok: |  spotbugs  |   1m 53s |  Used deprecated FindBugs config; 
considering switching to SpotBugs.  |
   | +1 :green_heart: |  findbugs  |   3m 17s |  branch-1 passed  |
   ||| _ Patch Compile Tests _ |
   | +0 :ok: |  mvndep  |   0m 18s |  Maven dependency ordering for patch  |
   | +1 :green_heart: |  mvninstall  |   2m  4s |  the patch passed  |
   | +1 :green_heart: |  compile  |   0m 48s |  the patch passed with JDK Azul 
Systems, Inc.-1.8.0_262-b19  |
   | +1 :green_heart: |  javac  |   0m 48s |  the patch passed  |
   | +1 :green_heart: |  compile  |   0m 57s |  the patch passed with JDK Azul 
Systems, Inc.-1.7.0_272-b10  |
   | -1 :x: |  javac  |   0m 35s |  
hbase-thrift-jdkAzulSystems,Inc.-1.7.0_272-b10 with JDK Azul Systems, 
Inc.-1.7.0_272-b10 generated 44 new + 63 unchanged - 41 fixed = 107 total (was 
104)  |
   | -1 :x: |  checkstyle  |   0m 36s |  hbase-thrift: The patch generated 2 
new + 82 unchanged - 77 fixed = 84 total (was 159)  |
   | +1 :green_heart: |  whitespace  |   0m  1s |  The patch has no whitespace 
issues.  |
   | -1 :x: |  xml  |   0m  0s |  The patch has 1 ill-formed XML file(s).  |
   | -1 :x: |  shadedjars  |   0m 12s |  patch has 7 errors when building our 
shaded downstream artifacts.  |
   | +1 :green_heart: |  hadoopcheck  |   4m 50s |  Patch does not cause any 
errors with Hadoop 2.8.5 2.9.2.  |
   | -1 :x: |  javadoc  |   0m 32s |  
hbase-thrift-jdkAzulSystems,Inc.-1.8.0_262-b19 with JDK Azul Systems, 
Inc.-1.8.0_262-b19 generated 13 new + 0 unchanged - 0 fixed = 13 total (was 0)  
|
   | -1 :x: |  javadoc  |   3m 20s |  
hbase-thrift-jdkAzulSystems,Inc.-1.7.0_272-b10 with JDK Azul Systems, 
Inc.-1.7.0_272-b10 generated 13 new + 0 unchanged - 0 fixed = 13 total (was 0)  
|
   | +1 :green_heart: |  findbugs  |   3m 36s |  the patch passed  |
   ||| _ Other Tests _ |
   | +1 :green_heart: |  unit  |   2m 45s |  hbase-common in the patch passed.  
|
   | -1 :x: |  unit  |   0m 37s |  hbase-thrift in the patch failed.  |
   | +1 :green_heart: |  asflicense  |   0m 24s |  The patch does not generate 
ASF License warnings.  |
   |  |   |  48m 49s |   |
   
   
   | Reason | Tests |
   |---:|:--|
   | XML | Parsing Error(s): |
   |   | dev-support/hbase_eclipse_formatter.xml |
   
   
   | Subsystem | Report/Notes |
   |--:|:-|
   | Docker | ClientAPI=1.41 ServerAPI=1.41 base: 
https://ci-hadoop.apache.org/job/HBase/job/HBase-PreCommit-GitHub-PR/job/PR-2941/15/artifact/out/Dockerfile
 |
   | GITHUB PR | https://github.com/apache/hbase/pull/2941 |
   | JIRA Issue | HBASE-21674 |
   | Optional Tests | dupname asflicense xml javac javadoc unit spotbugs 
findbugs shadedjars hadoopcheck hbaseanti checkstyle compile jshint |
   | uname | Linux e825992bfaaf 4.15.0-136-generic #140-Ubuntu SMP Thu Jan 28 
05:20:47 UTC 2021 x86_64 x86_64 x86_64 GNU/Linux |
   | Build tool | maven |
   | Personality | 
/home/jenkins/jenkins-home/workspace/Base-PreCommit-GitHub-PR_PR-2941/out/precommit/personality/provided.sh
 |
   | git revision | branch-1 / a40f458 |
   | Default Java | Azul Systems, Inc.-1.7.0_272-b10 |
   | Multi-JDK versions | /usr/lib/jvm/zulu-8-amd64:Azul Systems, 
Inc.-1.8.0_262-b19 /usr/lib/jvm/zulu-7-amd64:Azul Systems, Inc.-1.7.0_272-b10 |
   | shadedjars | 

[GitHub] [hbase] bharathv merged pull request #3399: HBASE-25998: Redo synchronization in SyncFuture

2021-06-18 Thread GitBox


bharathv merged pull request #3399:
URL: https://github.com/apache/hbase/pull/3399


   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [hbase] bharathv opened a new pull request #3400: HBASE-25998: Redo synchronization in SyncFuture

2021-06-18 Thread GitBox


bharathv opened a new pull request #3400:
URL: https://github.com/apache/hbase/pull/3400


   Currently uses coarse grained synchronized approach that seems to
   create a lot of contention. This patch
   
   - Uses a reentrant lock instead of synchronized monitor
   - Switches to a condition variable based waiting rather than busy wait
   - Removed synchronization for unnecessary fields
   
   Signed-off-by: Michael Stack 
   Signed-off-by: Andrew Purtell 
   Signed-off-by: Duo Zhang 
   Signed-off-by: Viraj Jasani 
   (cherry picked from commit 6bafb596421974717697b28d0856453245759c15)
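
   For readers following along, a minimal sketch of the lock-plus-condition 
   pattern described above (illustrative only, not the actual SyncFuture code): 
   waiters block on a condition variable until the result is published, instead 
   of spinning or synchronizing on the object monitor.
   ```java
   import java.util.concurrent.TimeUnit;
   import java.util.concurrent.TimeoutException;
   import java.util.concurrent.locks.Condition;
   import java.util.concurrent.locks.ReentrantLock;

   public final class SimpleSyncFuture {
     private final ReentrantLock lock = new ReentrantLock();
     private final Condition doneCondition = lock.newCondition();
     private long doneTxid = -1; // -1 means "not completed yet"

     // Called by the sync runner once the WAL sync for this future has finished.
     public void done(long txid) {
       lock.lock();
       try {
         this.doneTxid = txid;
         doneCondition.signalAll(); // wake every waiting handler thread
       } finally {
         lock.unlock();
       }
     }

     // Called by handler threads; blocks on the condition rather than busy-waiting.
     public long get(long timeoutMs) throws InterruptedException, TimeoutException {
       lock.lock();
       try {
         long remainingNanos = TimeUnit.MILLISECONDS.toNanos(timeoutMs);
         while (doneTxid < 0) {
           if (remainingNanos <= 0) {
             throw new TimeoutException("not completed within " + timeoutMs + " ms");
           }
           remainingNanos = doneCondition.awaitNanos(remainingNanos); // releases lock
         }
         return doneTxid;
       } finally {
         lock.unlock();
       }
     }
   }
   ```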


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [hbase] bharathv opened a new pull request #3399: HBASE-25998: Redo synchronization in SyncFuture

2021-06-18 Thread GitBox


bharathv opened a new pull request #3399:
URL: https://github.com/apache/hbase/pull/3399


   Currently uses coarse grained synchronized approach that seems to
   create a lot of contention. This patch
   
   - Uses a reentrant lock instead of synchronized monitor
   - Switches to a condition variable based waiting rather than busy wait
   - Removed synchronization for unnecessary fields
   
   Signed-off-by: Michael Stack 
   Signed-off-by: Andrew Purtell 
   Signed-off-by: Duo Zhang 
   Signed-off-by: Viraj Jasani 
   (cherry picked from commit 6bafb596421974717697b28d0856453245759c15)


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [hbase] bharathv merged pull request #3392: HBASE-25984: Avoid premature reuse of sync futures in FSHLog (#3371)

2021-06-18 Thread GitBox


bharathv merged pull request #3392:
URL: https://github.com/apache/hbase/pull/3392


   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [hbase] Apache-HBase commented on pull request #3393: HBASE-25984: Avoid premature reuse of sync futures in FSHLog (#3371)

2021-06-18 Thread GitBox


Apache-HBase commented on pull request #3393:
URL: https://github.com/apache/hbase/pull/3393#issuecomment-863506798






-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[jira] [Commented] (HBASE-26001) When turn on access control, the cell level TTL of Increment and Append operations is invalid.

2021-06-18 Thread Hudson (Jira)


[ 
https://issues.apache.org/jira/browse/HBASE-26001?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17365693#comment-17365693
 ] 

Hudson commented on HBASE-26001:


Results for branch master
[build #326 on 
builds.a.o|https://ci-hadoop.apache.org/job/HBase/job/HBase%20Nightly/job/master/326/]:
 (x) *{color:red}-1 overall{color}*

details (if available):

(/) {color:green}+1 general checks{color}
-- For more information [see general 
report|https://ci-hadoop.apache.org/job/HBase/job/HBase%20Nightly/job/master/326/General_20Nightly_20Build_20Report/]






(x) {color:red}-1 jdk8 hadoop3 checks{color}
-- For more information [see jdk8 (hadoop3) 
report|https://ci-hadoop.apache.org/job/HBase/job/HBase%20Nightly/job/master/326/JDK8_20Nightly_20Build_20Report_20_28Hadoop3_29/]


(x) {color:red}-1 jdk11 hadoop3 checks{color}
-- For more information [see jdk11 
report|https://ci-hadoop.apache.org/job/HBase/job/HBase%20Nightly/job/master/326/JDK11_20Nightly_20Build_20Report_20_28Hadoop3_29/]


(/) {color:green}+1 source release artifact{color}
-- See build output for details.


(/) {color:green}+1 client integration test{color}


> When turn on access control, the cell level TTL of Increment and Append 
> operations is invalid.
> --
>
> Key: HBASE-26001
> URL: https://issues.apache.org/jira/browse/HBASE-26001
> Project: HBase
>  Issue Type: Bug
>  Components: Coprocessors
>Reporter: Yutong Xiao
>Assignee: Yutong Xiao
>Priority: Major
>
> The AccessController postIncrementBeforeWAL() and postAppendBeforeWAL() methods 
> rewrite the new cell's tags from the old cell's tags. This makes the other 
> kinds of tags in the new cell (such as the TTL tag) invisible afterwards. In 
> Increment and Append operations the new cell has already carried forward all 
> tags of the old cell, plus the TTL tag from the mutation, so AccessController 
> does not need to rewrite the tags again; otherwise the TTL tag of newCell is 
> lost in the newly created cell. Since newCell has already copied all tags of 
> the oldCell, the oldCell is not needed here.
> {code:java}
> private Cell createNewCellWithTags(Mutation mutation, Cell oldCell, Cell 
> newCell) {
> // Collect any ACLs from the old cell
> List tags = Lists.newArrayList();
> List aclTags = Lists.newArrayList();
> ListMultimap perms = ArrayListMultimap.create();
> if (oldCell != null) {
>   Iterator tagIterator = PrivateCellUtil.tagsIterator(oldCell);
>   while (tagIterator.hasNext()) {
> Tag tag = tagIterator.next();
> if (tag.getType() != PermissionStorage.ACL_TAG_TYPE) {
>   // Not an ACL tag, just carry it through
>   if (LOG.isTraceEnabled()) {
> LOG.trace("Carrying forward tag from " + oldCell + ": type " + 
> tag.getType()
> + " length " + tag.getValueLength());
>   }
>   tags.add(tag);
> } else {
>   aclTags.add(tag);
> }
>   }
> }
> // Do we have an ACL on the operation?
> byte[] aclBytes = mutation.getACL();
> if (aclBytes != null) {
>   // Yes, use it
>   tags.add(new ArrayBackedTag(PermissionStorage.ACL_TAG_TYPE, aclBytes));
> } else {
>   // No, use what we carried forward
>   if (perms != null) {
> // TODO: If we collected ACLs from more than one tag we may have a
> // List of size > 1, this can be collapsed into a single
> // Permission
> if (LOG.isTraceEnabled()) {
>   LOG.trace("Carrying forward ACLs from " + oldCell + ": " + perms);
> }
> tags.addAll(aclTags);
>   }
> }
> // If we have no tags to add, just return
> if (tags.isEmpty()) {
>   return newCell;
> }
> // Here the new cell's original tags (such as the TTL tag) become invisible.
> return PrivateCellUtil.createCell(newCell, tags);
>   }
> {code}
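
A minimal sketch of the direction suggested by the description above (an 
illustration, not the committed fix): keep the tags already present on newCell, 
which include the carried-forward old tags and any TTL tag from the mutation, 
and only override the ACL tag when the mutation supplies one. The helper name 
below is hypothetical; the calls mirror those in the snippet above.
{code:java}
// Illustrative only: helper name and surrounding wiring are assumptions.
private Cell addAclTagIfPresent(Mutation mutation, Cell newCell) {
  byte[] aclBytes = mutation.getACL();
  if (aclBytes == null) {
    // Nothing to override; newCell keeps all of its tags, including any TTL tag.
    return newCell;
  }
  List<Tag> tags = Lists.newArrayList();
  Iterator<Tag> tagIterator = PrivateCellUtil.tagsIterator(newCell);
  while (tagIterator.hasNext()) {
    Tag tag = tagIterator.next();
    if (tag.getType() != PermissionStorage.ACL_TAG_TYPE) {
      tags.add(tag); // carry through non-ACL tags (for example the TTL tag)
    }
  }
  tags.add(new ArrayBackedTag(PermissionStorage.ACL_TAG_TYPE, aclBytes));
  return PrivateCellUtil.createCell(newCell, tags);
}
{code}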



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (HBASE-25998) Revisit synchronization in SyncFuture

2021-06-18 Thread Hudson (Jira)


[ 
https://issues.apache.org/jira/browse/HBASE-25998?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17365692#comment-17365692
 ] 

Hudson commented on HBASE-25998:


Results for branch master
[build #326 on 
builds.a.o|https://ci-hadoop.apache.org/job/HBase/job/HBase%20Nightly/job/master/326/]:
 (x) *{color:red}-1 overall{color}*

details (if available):

(/) {color:green}+1 general checks{color}
-- For more information [see general 
report|https://ci-hadoop.apache.org/job/HBase/job/HBase%20Nightly/job/master/326/General_20Nightly_20Build_20Report/]






(x) {color:red}-1 jdk8 hadoop3 checks{color}
-- For more information [see jdk8 (hadoop3) 
report|https://ci-hadoop.apache.org/job/HBase/job/HBase%20Nightly/job/master/326/JDK8_20Nightly_20Build_20Report_20_28Hadoop3_29/]


(x) {color:red}-1 jdk11 hadoop3 checks{color}
-- For more information [see jdk11 
report|https://ci-hadoop.apache.org/job/HBase/job/HBase%20Nightly/job/master/326/JDK11_20Nightly_20Build_20Report_20_28Hadoop3_29/]


(/) {color:green}+1 source release artifact{color}
-- See build output for details.


(/) {color:green}+1 client integration test{color}


> Revisit synchronization in SyncFuture
> -
>
> Key: HBASE-25998
> URL: https://issues.apache.org/jira/browse/HBASE-25998
> Project: HBase
>  Issue Type: Improvement
>  Components: Performance, regionserver, wal
>Affects Versions: 3.0.0-alpha-1, 1.7.0, 2.5.0
>Reporter: Bharath Vissapragada
>Assignee: Bharath Vissapragada
>Priority: Major
> Attachments: monitor-overhead-1.png, monitor-overhead-2.png
>
>
> While working on HBASE-25984, I noticed some weird frames in the flame graphs 
> around monitor entry/exit consuming a lot of CPU cycles (see attached 
> images). The synchronization there is too coarse grained and sometimes 
> unnecessary. I did a simple patch that switched to reentrant-lock-based 
> synchronization with a condition variable rather than a busy wait, and it 
> showed 70-80% higher throughput in WAL PE. Seems too good to be true 
> (more details in the comments).



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (HBASE-25976) Implement a master based ReplicationTracker

2021-06-18 Thread Hudson (Jira)


[ 
https://issues.apache.org/jira/browse/HBASE-25976?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17365691#comment-17365691
 ] 

Hudson commented on HBASE-25976:


Results for branch master
[build #326 on 
builds.a.o|https://ci-hadoop.apache.org/job/HBase/job/HBase%20Nightly/job/master/326/]:
 (x) *{color:red}-1 overall{color}*

details (if available):

(/) {color:green}+1 general checks{color}
-- For more information [see general 
report|https://ci-hadoop.apache.org/job/HBase/job/HBase%20Nightly/job/master/326/General_20Nightly_20Build_20Report/]






(x) {color:red}-1 jdk8 hadoop3 checks{color}
-- For more information [see jdk8 (hadoop3) 
report|https://ci-hadoop.apache.org/job/HBase/job/HBase%20Nightly/job/master/326/JDK8_20Nightly_20Build_20Report_20_28Hadoop3_29/]


(x) {color:red}-1 jdk11 hadoop3 checks{color}
-- For more information [see jdk11 
report|https://ci-hadoop.apache.org/job/HBase/job/HBase%20Nightly/job/master/326/JDK11_20Nightly_20Build_20Report_20_28Hadoop3_29/]


(/) {color:green}+1 source release artifact{color}
-- See build output for details.


(/) {color:green}+1 client integration test{color}


> Implement a master based ReplicationTracker
> ---
>
> Key: HBASE-25976
> URL: https://issues.apache.org/jira/browse/HBASE-25976
> Project: HBase
>  Issue Type: Sub-task
>  Components: Replication
>Reporter: Duo Zhang
>Assignee: Duo Zhang
>Priority: Major
> Fix For: 3.0.0-alpha-1, 2.5.0
>
>
> Now the only thing we care about is the live region servers and we can get 
> this information from master, so let's do it to remove the dependencies on 
> zookeeper.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (HBASE-25984) FSHLog WAL lockup with sync future reuse [RS deadlock]

2021-06-18 Thread Hudson (Jira)


[ 
https://issues.apache.org/jira/browse/HBASE-25984?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17365648#comment-17365648
 ] 

Hudson commented on HBASE-25984:


Results for branch branch-2
[build #279 on 
builds.a.o|https://ci-hadoop.apache.org/job/HBase/job/HBase%20Nightly/job/branch-2/279/]:
 (x) *{color:red}-1 overall{color}*

details (if available):

(/) {color:green}+1 general checks{color}
-- For more information [see general 
report|https://ci-hadoop.apache.org/job/HBase/job/HBase%20Nightly/job/branch-2/279/General_20Nightly_20Build_20Report/]




(/) {color:green}+1 jdk8 hadoop2 checks{color}
-- For more information [see jdk8 (hadoop2) 
report|https://ci-hadoop.apache.org/job/HBase/job/HBase%20Nightly/job/branch-2/279/JDK8_20Nightly_20Build_20Report_20_28Hadoop2_29/]


(/) {color:green}+1 jdk8 hadoop3 checks{color}
-- For more information [see jdk8 (hadoop3) 
report|https://ci-hadoop.apache.org/job/HBase/job/HBase%20Nightly/job/branch-2/279/JDK8_20Nightly_20Build_20Report_20_28Hadoop3_29/]


(/) {color:green}+1 jdk11 hadoop3 checks{color}
-- For more information [see jdk11 
report|https://ci-hadoop.apache.org/job/HBase/job/HBase%20Nightly/job/branch-2/279/JDK11_20Nightly_20Build_20Report_20_28Hadoop3_29/]


(/) {color:green}+1 source release artifact{color}
-- See build output for details.


(x) {color:red}-1 client integration test{color}
-- Something went wrong with this stage, [check relevant console 
output|https://ci-hadoop.apache.org/job/HBase/job/HBase%20Nightly/job/branch-2/279//console].


> FSHLog WAL lockup with sync future reuse [RS deadlock]
> --
>
> Key: HBASE-25984
> URL: https://issues.apache.org/jira/browse/HBASE-25984
> Project: HBase
>  Issue Type: Bug
>  Components: regionserver, wal
>Affects Versions: 3.0.0-alpha-1, 1.7.0, 2.5.0, 2.4.5
>Reporter: Bharath Vissapragada
>Assignee: Bharath Vissapragada
>Priority: Critical
>  Labels: deadlock, hang
> Attachments: HBASE-25984-unit-test.patch
>
>
> We use FSHLog as the WAL implementation (branch-1 based) and under heavy load 
> we noticed the WAL system gets locked up due to a subtle bug involving racy 
> code with sync future reuse. This bug applies to all FSHLog implementations 
> across branches.
> Symptoms:
> On heavily loaded clusters with large write load we noticed that the region 
> servers are hanging abruptly with filled up handler queues and stuck MVCC 
> indicating appends/syncs not making any progress.
> {noformat}
>  WARN  [8,queue=9,port=60020] regionserver.MultiVersionConcurrencyControl - 
> STUCK for : 296000 millis. 
> MultiVersionConcurrencyControl{readPoint=172383686, writePoint=172383690, 
> regionName=1ce4003ab60120057734ffe367667dca}
>  WARN  [6,queue=2,port=60020] regionserver.MultiVersionConcurrencyControl - 
> STUCK for : 296000 millis. 
> MultiVersionConcurrencyControl{readPoint=171504376, writePoint=171504381, 
> regionName=7c441d7243f9f504194dae6bf2622631}
> {noformat}
> All the handlers are stuck waiting for the sync futures and timing out.
> {noformat}
>  java.lang.Object.wait(Native Method)
> 
> org.apache.hadoop.hbase.regionserver.wal.SyncFuture.get(SyncFuture.java:183)
> 
> org.apache.hadoop.hbase.regionserver.wal.FSHLog.blockOnSync(FSHLog.java:1509)
> .
> {noformat}
> Log rolling is stuck because it was unable to attain a safe point
> {noformat}
>java.util.concurrent.CountDownLatch.await(CountDownLatch.java:277) 
> org.apache.hadoop.hbase.regionserver.wal.FSHLog$SafePointZigZagLatch.waitSafePoint(FSHLog.java:1799)
>  
> org.apache.hadoop.hbase.regionserver.wal.FSHLog.replaceWriter(FSHLog.java:900)
> {noformat}
> and the Ring buffer consumer thinks that there are some outstanding syncs 
> that need to finish..
> {noformat}
>   
> org.apache.hadoop.hbase.regionserver.wal.FSHLog$RingBufferEventHandler.attainSafePoint(FSHLog.java:2031)
> 
> org.apache.hadoop.hbase.regionserver.wal.FSHLog$RingBufferEventHandler.onEvent(FSHLog.java:1999)
> 
> org.apache.hadoop.hbase.regionserver.wal.FSHLog$RingBufferEventHandler.onEvent(FSHLog.java:1857)
> {noformat}
> On the other hand, SyncRunner threads are idle and just waiting for work 
> implying that there are no pending SyncFutures that need to be run
> {noformat}
>sun.misc.Unsafe.park(Native Method)
> java.util.concurrent.locks.LockSupport.park(LockSupport.java:175)
> 
> java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.await(AbstractQueuedSynchronizer.java:2039)
> 
> java.util.concurrent.LinkedBlockingQueue.take(LinkedBlockingQueue.java:442)
> 
> org.apache.hadoop.hbase.regionserver.wal.FSHLog$SyncRunner.run(FSHLog.java:1297)
> java.lang.Thread.run(Thread.java:748)
> {noformat}
> Overall the WAL system is deadlocked and could make no progress 

[jira] [Commented] (HBASE-25976) Implement a master based ReplicationTracker

2021-06-18 Thread Hudson (Jira)


[ 
https://issues.apache.org/jira/browse/HBASE-25976?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17365647#comment-17365647
 ] 

Hudson commented on HBASE-25976:


Results for branch branch-2
[build #279 on 
builds.a.o|https://ci-hadoop.apache.org/job/HBase/job/HBase%20Nightly/job/branch-2/279/]:
 (x) *{color:red}-1 overall{color}*

details (if available):

(/) {color:green}+1 general checks{color}
-- For more information [see general 
report|https://ci-hadoop.apache.org/job/HBase/job/HBase%20Nightly/job/branch-2/279/General_20Nightly_20Build_20Report/]




(/) {color:green}+1 jdk8 hadoop2 checks{color}
-- For more information [see jdk8 (hadoop2) 
report|https://ci-hadoop.apache.org/job/HBase/job/HBase%20Nightly/job/branch-2/279/JDK8_20Nightly_20Build_20Report_20_28Hadoop2_29/]


(/) {color:green}+1 jdk8 hadoop3 checks{color}
-- For more information [see jdk8 (hadoop3) 
report|https://ci-hadoop.apache.org/job/HBase/job/HBase%20Nightly/job/branch-2/279/JDK8_20Nightly_20Build_20Report_20_28Hadoop3_29/]


(/) {color:green}+1 jdk11 hadoop3 checks{color}
-- For more information [see jdk11 
report|https://ci-hadoop.apache.org/job/HBase/job/HBase%20Nightly/job/branch-2/279/JDK11_20Nightly_20Build_20Report_20_28Hadoop3_29/]


(/) {color:green}+1 source release artifact{color}
-- See build output for details.


(x) {color:red}-1 client integration test{color}
-- Something went wrong with this stage, [check relevant console 
output|https://ci-hadoop.apache.org/job/HBase/job/HBase%20Nightly/job/branch-2/279//console].


> Implement a master based ReplicationTracker
> ---
>
> Key: HBASE-25976
> URL: https://issues.apache.org/jira/browse/HBASE-25976
> Project: HBase
>  Issue Type: Sub-task
>  Components: Replication
>Reporter: Duo Zhang
>Assignee: Duo Zhang
>Priority: Major
> Fix For: 3.0.0-alpha-1, 2.5.0
>
>
> Now the only thing we care about is the live region servers and we can get 
> this information from master, so let's do it to remove the dependencies on 
> zookeeper.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Comment Edited] (HBASE-20503) [AsyncFSWAL] Failed to get sync result after 300000 ms for txid=160912, WAL system stuck?

2021-06-18 Thread Emil Kleszcz (Jira)


[ 
https://issues.apache.org/jira/browse/HBASE-20503?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17365637#comment-17365637
 ] 

Emil Kleszcz edited comment on HBASE-20503 at 6/18/21, 6:05 PM:


Hi, we experienced the same issue in HBase 2.3.4 on one of our production 
clusters this week. This happened a few weeks after upgrading HBase from 2.2.4, 
where we never observed this problem.
 We run on HDP 3.2.1. On average we have around 800 regions per RS, and the 
workload was as usual for these days.

This problem started on the RS where the meta region was residing. We could 
observe the following in the RS log:
{code:java}
<2021-06-15T10:31:28.284+0200>  :   : 
java.io.IOException: stream already broken
at 
org.apache.hadoop.hbase.io.asyncfs.FanOutOneBlockAsyncDFSOutput.flush0(FanOutOneBlockAsyncDFSOutput.java:420)
at 
org.apache.hadoop.hbase.io.asyncfs.FanOutOneBlockAsyncDFSOutput.flush(FanOutOneBlockAsyncDFSOutput.java:509)
(...)
<2021-06-15T11:11:39.744+0200>  : 
java.io.FileNotFoundException: File does not exist: /hbase/WALs/
(...)
<2021-06-15T11:15:59.241+0200>  
:   : 
java.io.IOException: stream already broken
(...)
<2021-06-15T11:39:39.986+0200>  : 
org.apache.hadoop.hbase.exceptions.TimeoutIOException: Failed to get sync 
result after 30 ms for txid=54056852, WAL system stuck?
(...)
<2021-06-16T18:43:05.220+0200>  : 

{code}
Since then compaction started failing on many regions including meta. 2 days 
later we could see one RS going down...
 This triggered an avalanche of stuck procedures in HMaster
{code:java}
<2021-06-17T08:53:13.862+0200>  : 

<2021-06-17T08:53:13.862+0200>  : 

<2021-06-17T08:53:13.866+0200>  : 

<2021-06-17T08:53:13.867+0200>  : 

<2021-06-17T08:53:13.867+0200>  : 
 
<2021-06-17T08:53:28.443+0200>  : {code}
In HA the Hmasters started flipping over and we could observe more and more 
RITs with OPENING and CLOSING states pointing to stale RSs (old timestamps or 
null). Only the manual fix (forcing states for tables/regions) helped to 
recover the cluster.

I hope you apply the working patch soon.


was (Author: tr0k):
Hi, we experienced the same issue in HBase 2.3.1 on one of our production 
clusters this week. This happened a few weeks after upgrading HBase from 2.2.4 
where we never observed this problem.
 We run on the HDP 3.1.2. On average we have around 800 regions per RS and the 
workload was, as usual, these days.

This problem started on one of the RSs where meta region residing. We could 
observe the following in the RS log:
{code:java}
<2021-06-15T10:31:28.284+0200>  :   : 
java.io.IOException: stream already broken
at 
org.apache.hadoop.hbase.io.asyncfs.FanOutOneBlockAsyncDFSOutput.flush0(FanOutOneBlockAsyncDFSOutput.java:420)
at 
org.apache.hadoop.hbase.io.asyncfs.FanOutOneBlockAsyncDFSOutput.flush(FanOutOneBlockAsyncDFSOutput.java:509)
(...)
<2021-06-15T11:11:39.744+0200>  : 
java.io.FileNotFoundException: File does not exist: /hbase/WALs/
(...)
<2021-06-15T11:15:59.241+0200>  
:   : 
java.io.IOException: stream already broken
(...)
<2021-06-15T11:39:39.986+0200>  : 
org.apache.hadoop.hbase.exceptions.TimeoutIOException: Failed to get sync 
result after 30 ms for txid=54056852, WAL system stuck?
(...)
<2021-06-16T18:43:05.220+0200>  : 

{code}
Since then compaction started failing on many regions including meta. 2 days 
later we could see one RS going down...
 This triggered an avalanche of stuck procedures in HMaster
{code:java}
<2021-06-17T08:53:13.862+0200>  : 

<2021-06-17T08:53:13.862+0200>  : 

<2021-06-17T08:53:13.866+0200>  : 

<2021-06-17T08:53:13.867+0200>  : 

<2021-06-17T08:53:13.867+0200>  : 
 
<2021-06-17T08:53:28.443+0200>  : {code}
In HA the Hmasters started flipping over and we could observe more and more 
RITs with OPENING and CLOSING states pointing to stale RSs (old timestamps or 
null). Only the manual fix (forcing states for tables/regions) helped to 
recover the cluster.

I hope you apply the working patch soon.

> [AsyncFSWAL] Failed to get sync result after 30 ms for txid=160912, WAL 
> system stuck?
> -
>
> Key: HBASE-20503
> URL: https://issues.apache.org/jira/browse/HBASE-20503
> Project: HBase
>  Issue Type: Bug
>  Components: wal
>Reporter: Michael Stack
>Priority: Major
> Attachments: 
> 0001-HBASE-20503-AsyncFSWAL-Failed-to-get-sync-result-aft.patch, 
> 0001-HBASE-20503-AsyncFSWAL-Failed-to-get-sync-result-aft.patch
>
>
> Scale test. Startup w/ 30k regions over ~250nodes. This RS is trying to 
> furiously open regions assigned by Master. It is importantly carrying 
> hbase:meta. Twenty minutes in, meta goes dead after an exception up out 
> AsyncFSWAL. Process had been restarted so I couldn't get a  thread dump. 
> 

[jira] [Comment Edited] (HBASE-20503) [AsyncFSWAL] Failed to get sync result after 300000 ms for txid=160912, WAL system stuck?

2021-06-18 Thread Emil Kleszcz (Jira)


[ 
https://issues.apache.org/jira/browse/HBASE-20503?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17365637#comment-17365637
 ] 

Emil Kleszcz edited comment on HBASE-20503 at 6/18/21, 6:04 PM:


Hi, we experienced the same issue in HBase 2.3.1 on one of our production 
clusters this week. This happened a few weeks after upgrading HBase from 2.2.4 
where we never observed this problem.
 We run on the HDP 3.1.2. On average we have around 800 regions per RS and the 
workload was, as usual, these days.

This problem started on one of the RSs where meta region residing. We could 
observe the following in the RS log:
{code:java}
<2021-06-15T10:31:28.284+0200>  :   : 
java.io.IOException: stream already broken
at 
org.apache.hadoop.hbase.io.asyncfs.FanOutOneBlockAsyncDFSOutput.flush0(FanOutOneBlockAsyncDFSOutput.java:420)
at 
org.apache.hadoop.hbase.io.asyncfs.FanOutOneBlockAsyncDFSOutput.flush(FanOutOneBlockAsyncDFSOutput.java:509)
(...)
<2021-06-15T11:11:39.744+0200>  : 
java.io.FileNotFoundException: File does not exist: /hbase/WALs/
(...)
<2021-06-15T11:15:59.241+0200>  
:   : 
java.io.IOException: stream already broken
(...)
<2021-06-15T11:39:39.986+0200>  : 
org.apache.hadoop.hbase.exceptions.TimeoutIOException: Failed to get sync 
result after 30 ms for txid=54056852, WAL system stuck?
(...)
<2021-06-16T18:43:05.220+0200>  : 

{code}
Since then compaction started failing on many regions including meta. 2 days 
later we could see one RS going down...
 This triggered an avalanche of stuck procedures in HMaster
{code:java}
<2021-06-17T08:53:13.862+0200>  : 

<2021-06-17T08:53:13.862+0200>  : 

<2021-06-17T08:53:13.866+0200>  : 

<2021-06-17T08:53:13.867+0200>  : 

<2021-06-17T08:53:13.867+0200>  : 
 
<2021-06-17T08:53:28.443+0200>  : {code}
In HA the Hmasters started flipping over and we could observe more and more 
RITs with OPENING and CLOSING states pointing to stale RSs (old timestamps or 
null). Only the manual fix (forcing states for tables/regions) helped to 
recover the cluster.

I hope you apply the working patch soon.


was (Author: tr0k):
Hi, we experienced the same issue in HBase 2.3.1 on one of our production 
clusters this week. This happened a few weeks after upgrading HBase from 2.2.4 
where we never observed this problem.
 We run on the HDP 3.1.2. On average we have around 800 regions per RS and the 
workload was, as usual, these days.

This problem started on one of the RSs where meta region residing. We could 
observe the following in the RS log:
{code:java}
<2021-06-15T10:31:28.284+0200>  :   : 
java.io.IOException: stream already broken
at 
org.apache.hadoop.hbase.io.asyncfs.FanOutOneBlockAsyncDFSOutput.flush0(FanOutOneBlockAsyncDFSOutput.java:420)
at 
org.apache.hadoop.hbase.io.asyncfs.FanOutOneBlockAsyncDFSOutput.flush(FanOutOneBlockAsyncDFSOutput.java:509)
(...)
<2021-06-15T11:11:39.744+0200>  : 
java.io.FileNotFoundException: File does not exist: /hbase/WALs/
(...)
<2021-06-15T11:15:59.241+0200>  
:   : 
java.io.IOException: stream already broken
(...)
<2021-06-15T11:39:39.986+0200>  : 
org.apache.hadoop.hbase.exceptions.TimeoutIOException: Failed to get sync 
result after 30 ms for txid=54056852, WAL system stuck?(...)
<2021-06-16T18:43:05.220+0200>  : 

{code}
Since then compaction started failing on many regions including meta. 2 days 
later we could see one RS going down...
 This triggered an avalanche of stuck procedures in HMaster
{code:java}
<2021-06-17T08:53:13.862+0200>  : 

<2021-06-17T08:53:13.862+0200>  : 

<2021-06-17T08:53:13.866+0200>  : 

<2021-06-17T08:53:13.867+0200>  : 

<2021-06-17T08:53:13.867+0200>  : 
 
<2021-06-17T08:53:28.443+0200>  : {code}
In HA the Hmasters started flipping over and we could observe more and more 
RITs with OPENING and CLOSING states pointing to stale RSs (old timestamps or 
null). Only the manual fix (forcing states for tables/regions) helped to 
recover the cluster.

I hope you apply the working patch soon.

> [AsyncFSWAL] Failed to get sync result after 30 ms for txid=160912, WAL 
> system stuck?
> -
>
> Key: HBASE-20503
> URL: https://issues.apache.org/jira/browse/HBASE-20503
> Project: HBase
>  Issue Type: Bug
>  Components: wal
>Reporter: Michael Stack
>Priority: Major
> Attachments: 
> 0001-HBASE-20503-AsyncFSWAL-Failed-to-get-sync-result-aft.patch, 
> 0001-HBASE-20503-AsyncFSWAL-Failed-to-get-sync-result-aft.patch
>
>
> Scale test. Startup w/ 30k regions over ~250nodes. This RS is trying to 
> furiously open regions assigned by Master. It is importantly carrying 
> hbase:meta. Twenty minutes in, meta goes dead after an exception up out 
> AsyncFSWAL. Process had been restarted so I couldn't get a  thread dump. 
> 

[jira] [Comment Edited] (HBASE-20503) [AsyncFSWAL] Failed to get sync result after 300000 ms for txid=160912, WAL system stuck?

2021-06-18 Thread Emil Kleszcz (Jira)


[ 
https://issues.apache.org/jira/browse/HBASE-20503?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17365637#comment-17365637
 ] 

Emil Kleszcz edited comment on HBASE-20503 at 6/18/21, 6:02 PM:


Hi, we experienced the same issue in HBase 2.3.1 on one of our production 
clusters this week. This happened a few weeks after upgrading HBase from 2.2.4 
where we never observed this problem.
 We run on the HDP 3.1.2. On average we have around 800 regions per RS and the 
workload was, as usual, these days.

This problem started on one of the RSs where meta region residing. We could 
observe the following in the RS log:
{code:java}
<2021-06-15T10:31:28.284+0200>  :   : 
java.io.IOException: stream already broken
at 
org.apache.hadoop.hbase.io.asyncfs.FanOutOneBlockAsyncDFSOutput.flush0(FanOutOneBlockAsyncDFSOutput.java:420)
at 
org.apache.hadoop.hbase.io.asyncfs.FanOutOneBlockAsyncDFSOutput.flush(FanOutOneBlockAsyncDFSOutput.java:509)
(...)
<2021-06-15T11:11:39.744+0200>  : 
java.io.FileNotFoundException: File does not exist: /hbase/WALs/
(...)
<2021-06-15T11:15:59.241+0200>  
:   : 
java.io.IOException: stream already broken
(...)
<2021-06-15T11:39:39.986+0200>  : 
org.apache.hadoop.hbase.exceptions.TimeoutIOException: Failed to get sync 
result after 30 ms for txid=54056852, WAL system stuck?(...)
<2021-06-16T18:43:05.220+0200>  : 

{code}
Since then compaction started failing on many regions including meta. 2 days 
later we could see one RS going down...
 This triggered an avalanche of stuck procedures in HMaster
{code:java}
<2021-06-17T08:53:13.862+0200>  : 

<2021-06-17T08:53:13.862+0200>  : 

<2021-06-17T08:53:13.866+0200>  : 

<2021-06-17T08:53:13.867+0200>  : 

<2021-06-17T08:53:13.867+0200>  : 
 
<2021-06-17T08:53:28.443+0200>  : {code}
In HA the Hmasters started flipping over and we could observe more and more 
RITs with OPENING and CLOSING states pointing to stale RSs (old timestamps or 
null). Only the manual fix (forcing states for tables/regions) helped to 
recover the cluster.

I hope you apply the working patch soon.


was (Author: tr0k):
Hi, we experienced the same issue in HBase 2.3.1 on one of our production 
clusters this week. This happened a few weeks after upgrading HBase from 2.2.4 
where we never observed this problem.
 We run on the HDP 3.1.2. On average we have around 800 regions per RS and the 
workload was, as usual, these days.

This problem started on one of the RSs where meta region residing. We could 
observe the following in the RS log:
{code:java}
<2021-06-15T10:31:28.284+0200>  :   : 
java.io.IOException: stream already broken
at 
org.apache.hadoop.hbase.io.asyncfs.FanOutOneBlockAsyncDFSOutput.flush0(FanOutOneBlockAsyncDFSOutput.java:420)
at 
org.apache.hadoop.hbase.io.asyncfs.FanOutOneBlockAsyncDFSOutput.flush(FanOutOneBlockAsyncDFSOutput.java:509)
(...)
<2021-06-15T11:11:39.744+0200>  : 
java.io.FileNotFoundException: File does not exist: /hbase/WALs/
(...)
<2021-06-15T11:15:59.241+0200>  
:   : 
java.io.IOException: stream already broken
(...)
<2021-06-15T11:39:39.986+0200>  : 
(...)
<2021-06-16T18:43:05.220+0200>  : 

{code}
Since then compaction started failing on many regions including meta. 2 days 
later we could see one RS going down...
 This triggered an avalanche of stuck procedures in HMaster
{code:java}
<2021-06-17T08:53:13.862+0200>  : 

<2021-06-17T08:53:13.862+0200>  : 

<2021-06-17T08:53:13.866+0200>  : 

<2021-06-17T08:53:13.867+0200>  : 

<2021-06-17T08:53:13.867+0200>  : 
 
<2021-06-17T08:53:28.443+0200>  : {code}
In HA the Hmasters started flipping over and we could observe more and more 
RITs with OPENING and CLOSING states pointing to stale RSs (old timestamps or 
null). Only the manual fix (forcing states for tables/regions) helped to 
recover the cluster.

I hope you apply the working patch soon.

> [AsyncFSWAL] Failed to get sync result after 30 ms for txid=160912, WAL 
> system stuck?
> -
>
> Key: HBASE-20503
> URL: https://issues.apache.org/jira/browse/HBASE-20503
> Project: HBase
>  Issue Type: Bug
>  Components: wal
>Reporter: Michael Stack
>Priority: Major
> Attachments: 
> 0001-HBASE-20503-AsyncFSWAL-Failed-to-get-sync-result-aft.patch, 
> 0001-HBASE-20503-AsyncFSWAL-Failed-to-get-sync-result-aft.patch
>
>
> Scale test. Startup w/ 30k regions over ~250nodes. This RS is trying to 
> furiously open regions assigned by Master. It is importantly carrying 
> hbase:meta. Twenty minutes in, meta goes dead after an exception up out 
> AsyncFSWAL. Process had been restarted so I couldn't get a  thread dump. 
> Suspicious is we archive a WAL and we get a FNFE because we got to access WAL 
> in old location. [~Apache9] mind taking a look? Does this 

[jira] [Comment Edited] (HBASE-20503) [AsyncFSWAL] Failed to get sync result after 300000 ms for txid=160912, WAL system stuck?

2021-06-18 Thread Emil Kleszcz (Jira)


[ 
https://issues.apache.org/jira/browse/HBASE-20503?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17365637#comment-17365637
 ] 

Emil Kleszcz edited comment on HBASE-20503 at 6/18/21, 5:38 PM:


Hi, we experienced the same issue in HBase 2.3.1 on one of our production 
clusters this week. This happened a few weeks after upgrading HBase from 2.2.4 
where we never observed this problem.
 We run on the HDP 3.1.2. On average we have around 800 regions per RS and the 
workload was, as usual, these days.

This problem started on one of the RSs where meta region residing. We could 
observe the following in the RS log:
{code:java}
<2021-06-15T10:31:28.284+0200>  :   : 
java.io.IOException: stream already broken
at 
org.apache.hadoop.hbase.io.asyncfs.FanOutOneBlockAsyncDFSOutput.flush0(FanOutOneBlockAsyncDFSOutput.java:420)
at 
org.apache.hadoop.hbase.io.asyncfs.FanOutOneBlockAsyncDFSOutput.flush(FanOutOneBlockAsyncDFSOutput.java:509)
(...)
<2021-06-15T11:11:39.744+0200>  : 
java.io.FileNotFoundException: File does not exist: /hbase/WALs/
(...)
<2021-06-15T11:15:59.241+0200>  
:   : 
java.io.IOException: stream already broken
(...)
<2021-06-15T11:39:39.986+0200>  : 
(...)
<2021-06-16T18:43:05.220+0200>  : 

{code}
Since then compaction started failing on many regions including meta. 2 days 
later we could see one RS going down...
 This triggered an avalanche of stuck procedures in HMaster
{code:java}
<2021-06-17T08:53:13.862+0200>  : 

<2021-06-17T08:53:13.862+0200>  : 

<2021-06-17T08:53:13.866+0200>  : 

<2021-06-17T08:53:13.867+0200>  : 

<2021-06-17T08:53:13.867+0200>  : 
 
<2021-06-17T08:53:28.443+0200>  : {code}
In HA, the HMasters started flipping over and we could observe more and more 
RITs in OPENING and CLOSING states pointing to stale RSs (old timestamps or 
null). Only a manual fix (forcing states for tables/regions) helped to 
recover the cluster.

I hope you apply the working patch soon.


was (Author: tr0k):
Hi, we experienced the same issue in HBase 2.3.1 on one of our production 
clusters this week. This happened a few weeks after upgrading from HBase 2.2.4, 
where we had never observed this problem.
We run on HDP 3.1.2. On average we have around 800 regions per RS, and the 
workload during those days was typical.

This problem started on one of the RSs where the meta region was residing. We 
could observe the following in the RS log:
{code:java}
<2021-06-15T10:31:28.284+0200>  :   : 
java.io.IOException: stream already broken
at 
org.apache.hadoop.hbase.io.asyncfs.FanOutOneBlockAsyncDFSOutput.flush0(FanOutOneBlockAsyncDFSOutput.java:420)
at 
org.apache.hadoop.hbase.io.asyncfs.FanOutOneBlockAsyncDFSOutput.flush(FanOutOneBlockAsyncDFSOutput.java:509)
(...)
<2021-06-15T11:11:39.744+0200>  : 
java.io.FileNotFoundException: File does not exist: /hbase/WALs/
(...)
<2021-06-15T11:15:59.241+0200>  
:   : 
java.io.IOException: stream already broken
(...)
<2021-06-15T11:39:39.986+0200>  : 
(...)
<2021-06-16T18:43:05.220+0200>  : 

{code}
Since then, compaction started failing on many regions, including meta. Two days 
later we could see one RS going down...
This triggered an avalanche of stuck procedures in HMaster

{code:java}
<2021-06-17T08:53:13.862+0200>  : 

<2021-06-17T08:53:13.862+0200>  : 

<2021-06-17T08:53:13.866+0200>  : 

<2021-06-17T08:53:13.867+0200>  : 

<2021-06-17T08:53:13.867+0200>  : 
 
<2021-06-17T08:53:28.443+0200>  : {code}
In HA they started flipping over and we could observe more and more RITs 
in OPENING and CLOSING states pointing to stale RSs (old timestamps or null). 
Only a manual fix (forcing states for tables/regions) helped to recover the 
cluster.

I hope you apply the working patch soon.

> [AsyncFSWAL] Failed to get sync result after 300000 ms for txid=160912, WAL 
> system stuck?
> -
>
> Key: HBASE-20503
> URL: https://issues.apache.org/jira/browse/HBASE-20503
> Project: HBase
>  Issue Type: Bug
>  Components: wal
>Reporter: Michael Stack
>Priority: Major
> Attachments: 
> 0001-HBASE-20503-AsyncFSWAL-Failed-to-get-sync-result-aft.patch, 
> 0001-HBASE-20503-AsyncFSWAL-Failed-to-get-sync-result-aft.patch
>
>
> Scale test. Startup w/ 30k regions over ~250nodes. This RS is trying to 
> furiously open regions assigned by Master. It is importantly carrying 
> hbase:meta. Twenty minutes in, meta goes dead after an exception up out of 
> AsyncFSWAL. The process had been restarted so I couldn't get a thread dump. 
> Suspicious is that we archive a WAL and then get an FNFE because we go to access 
> the WAL in its old location. [~Apache9] mind taking a look? Does this FNFE 
> rolling kill the WAL sub-system? Thanks.
> DFS complaining on file open for a few files getting blocks from remote dead 
> DNs: 

[jira] [Commented] (HBASE-20503) [AsyncFSWAL] Failed to get sync result after 300000 ms for txid=160912, WAL system stuck?

2021-06-18 Thread Emil Kleszcz (Jira)


[ 
https://issues.apache.org/jira/browse/HBASE-20503?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17365637#comment-17365637
 ] 

Emil Kleszcz commented on HBASE-20503:
--

Hi, we experienced the same issue in HBase 2.3.1 on one of our production 
clusters this week. This happened a few weeks after upgrading from HBase 2.2.4, 
where we had never observed this problem.
We run on HDP 3.1.2. On average we have around 800 regions per RS, and the 
workload during those days was typical.

This problem started on one of the RSs where the meta region was residing. We 
could observe the following in the RS log:
{code:java}
<2021-06-15T10:31:28.284+0200>  :   : 
java.io.IOException: stream already broken
at 
org.apache.hadoop.hbase.io.asyncfs.FanOutOneBlockAsyncDFSOutput.flush0(FanOutOneBlockAsyncDFSOutput.java:420)
at 
org.apache.hadoop.hbase.io.asyncfs.FanOutOneBlockAsyncDFSOutput.flush(FanOutOneBlockAsyncDFSOutput.java:509)
(...)
<2021-06-15T11:11:39.744+0200>  : 
java.io.FileNotFoundException: File does not exist: /hbase/WALs/
(...)
<2021-06-15T11:15:59.241+0200>  
:   : 
java.io.IOException: stream already broken
(...)
<2021-06-15T11:39:39.986+0200>  : 
(...)
<2021-06-16T18:43:05.220+0200>  : 

{code}
Since then, compaction started failing on many regions, including meta. Two days 
later we could see one RS going down...
This triggered an avalanche of stuck procedures in HMaster

{code:java}
<2021-06-17T08:53:13.862+0200>  : 

<2021-06-17T08:53:13.862+0200>  : 

<2021-06-17T08:53:13.866+0200>  : 

<2021-06-17T08:53:13.867+0200>  : 

<2021-06-17T08:53:13.867+0200>  : 
 
<2021-06-17T08:53:28.443+0200>  : {code}
In HA they started flipping over and we could observe more and more RITs 
in OPENING and CLOSING states pointing to stale RSs (old timestamps or null). 
Only a manual fix (forcing states for tables/regions) helped to recover the 
cluster.

I hope you apply the working patch soon.

> [AsyncFSWAL] Failed to get sync result after 300000 ms for txid=160912, WAL 
> system stuck?
> -
>
> Key: HBASE-20503
> URL: https://issues.apache.org/jira/browse/HBASE-20503
> Project: HBase
>  Issue Type: Bug
>  Components: wal
>Reporter: Michael Stack
>Priority: Major
> Attachments: 
> 0001-HBASE-20503-AsyncFSWAL-Failed-to-get-sync-result-aft.patch, 
> 0001-HBASE-20503-AsyncFSWAL-Failed-to-get-sync-result-aft.patch
>
>
> Scale test. Startup w/ 30k regions over ~250nodes. This RS is trying to 
> furiously open regions assigned by Master. It is importantly carrying 
> hbase:meta. Twenty minutes in, meta goes dead after an exception up out of 
> AsyncFSWAL. The process had been restarted so I couldn't get a thread dump. 
> Suspicious is that we archive a WAL and then get an FNFE because we go to access 
> the WAL in its old location. [~Apache9] mind taking a look? Does this FNFE 
> rolling kill the WAL sub-system? Thanks.
> DFS complaining on file open for a few files getting blocks from remote dead 
> DNs: e.g. {{2018-04-25 10:05:21,506 WARN 
> org.apache.hadoop.hdfs.client.impl.BlockReaderFactory: I/O error constructing 
> remote block reader.
> java.net.ConnectException: Connection refused}}
> AsyncFSWAL complaining: "AbstractFSWAL: Slow sync cost: 103 ms" .
> About ten minutes in, we get this:
> {code}
> 2018-04-25 10:15:16,532 WARN 
> org.apache.hadoop.hbase.regionserver.wal.AsyncFSWAL: sync failed
> java.io.IOException: stream already broken
>   at 
> org.apache.hadoop.hbase.io.asyncfs.FanOutOneBlockAsyncDFSOutput.flush0(FanOutOneBlockAsyncDFSOutput.java:424)
>   at 
> org.apache.hadoop.hbase.io.asyncfs.FanOutOneBlockAsyncDFSOutput.flush(FanOutOneBlockAsyncDFSOutput.java:513)
>   
>   
>   
>   at 
> org.apache.hadoop.hbase.regionserver.wal.AsyncProtobufLogWriter.sync(AsyncProtobufLogWriter.java:134)
>   at 
> org.apache.hadoop.hbase.regionserver.wal.AsyncFSWAL.sync(AsyncFSWAL.java:364)
>   at 
> org.apache.hadoop.hbase.regionserver.wal.AsyncFSWAL.consume(AsyncFSWAL.java:547)
>   at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
>   at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
>   at java.lang.Thread.run(Thread.java:748)
> 2018-04-25 10:15:16,680 INFO 
> org.apache.hadoop.hbase.regionserver.wal.AbstractFSWAL: Rolled WAL 
> /hbase/WALs/vc0205.halxg.cloudera.com,22101,1524675808073/vc0205.halxg.cloudera.com%2C22101%2C1524675808073.meta.1524676253923.meta
>  with entries=10819, filesize=7.57 MB; new WAL 
> 

[jira] [Commented] (HBASE-25984) FSHLog WAL lockup with sync future reuse [RS deadlock]

2021-06-18 Thread Hudson (Jira)


[ 
https://issues.apache.org/jira/browse/HBASE-25984?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17365629#comment-17365629
 ] 

Hudson commented on HBASE-25984:


Results for branch branch-2.3
[build #239 on 
builds.a.o|https://ci-hadoop.apache.org/job/HBase/job/HBase%20Nightly/job/branch-2.3/239/]:
 (x) *{color:red}-1 overall{color}*

details (if available):

(/) {color:green}+1 general checks{color}
-- For more information [see general 
report|https://ci-hadoop.apache.org/job/HBase/job/HBase%20Nightly/job/branch-2.3/239/General_20Nightly_20Build_20Report/]




(/) {color:green}+1 jdk8 hadoop2 checks{color}
-- For more information [see jdk8 (hadoop2) 
report|https://ci-hadoop.apache.org/job/HBase/job/HBase%20Nightly/job/branch-2.3/239/JDK8_20Nightly_20Build_20Report_20_28Hadoop2_29/]


(/) {color:green}+1 jdk8 hadoop3 checks{color}
-- For more information [see jdk8 (hadoop3) 
report|https://ci-hadoop.apache.org/job/HBase/job/HBase%20Nightly/job/branch-2.3/239/JDK8_20Nightly_20Build_20Report_20_28Hadoop3_29/]


(x) {color:red}-1 jdk11 hadoop3 checks{color}
-- For more information [see jdk11 
report|https://ci-hadoop.apache.org/job/HBase/job/HBase%20Nightly/job/branch-2.3/239/JDK11_20Nightly_20Build_20Report_20_28Hadoop3_29/]


(/) {color:green}+1 source release artifact{color}
-- See build output for details.


(/) {color:green}+1 client integration test{color}


> FSHLog WAL lockup with sync future reuse [RS deadlock]
> --
>
> Key: HBASE-25984
> URL: https://issues.apache.org/jira/browse/HBASE-25984
> Project: HBase
>  Issue Type: Bug
>  Components: regionserver, wal
>Affects Versions: 3.0.0-alpha-1, 1.7.0, 2.5.0, 2.4.5
>Reporter: Bharath Vissapragada
>Assignee: Bharath Vissapragada
>Priority: Critical
>  Labels: deadlock, hang
> Attachments: HBASE-25984-unit-test.patch
>
>
> We use FSHLog as the WAL implementation (branch-1 based) and under heavy load 
> we noticed the WAL system gets locked up due to a subtle bug involving racy 
> code with sync future reuse. This bug applies to all FSHLog implementations 
> across branches.
> Symptoms:
> On heavily loaded clusters with large write load we noticed that the region 
> servers are hanging abruptly with filled up handler queues and stuck MVCC 
> indicating appends/syncs not making any progress.
> {noformat}
>  WARN  [8,queue=9,port=60020] regionserver.MultiVersionConcurrencyControl - 
> STUCK for : 296000 millis. 
> MultiVersionConcurrencyControl{readPoint=172383686, writePoint=172383690, 
> regionName=1ce4003ab60120057734ffe367667dca}
>  WARN  [6,queue=2,port=60020] regionserver.MultiVersionConcurrencyControl - 
> STUCK for : 296000 millis. 
> MultiVersionConcurrencyControl{readPoint=171504376, writePoint=171504381, 
> regionName=7c441d7243f9f504194dae6bf2622631}
> {noformat}
> All the handlers are stuck waiting for the sync futures and timing out.
> {noformat}
>  java.lang.Object.wait(Native Method)
> 
> org.apache.hadoop.hbase.regionserver.wal.SyncFuture.get(SyncFuture.java:183)
> 
> org.apache.hadoop.hbase.regionserver.wal.FSHLog.blockOnSync(FSHLog.java:1509)
> .
> {noformat}
> Log rolling is stuck because it was unable to attain a safe point
> {noformat}
>java.util.concurrent.CountDownLatch.await(CountDownLatch.java:277) 
> org.apache.hadoop.hbase.regionserver.wal.FSHLog$SafePointZigZagLatch.waitSafePoint(FSHLog.java:1799)
>  
> org.apache.hadoop.hbase.regionserver.wal.FSHLog.replaceWriter(FSHLog.java:900)
> {noformat}
> and the Ring buffer consumer thinks that there are some outstanding syncs 
> that need to finish..
> {noformat}
>   
> org.apache.hadoop.hbase.regionserver.wal.FSHLog$RingBufferEventHandler.attainSafePoint(FSHLog.java:2031)
> 
> org.apache.hadoop.hbase.regionserver.wal.FSHLog$RingBufferEventHandler.onEvent(FSHLog.java:1999)
> 
> org.apache.hadoop.hbase.regionserver.wal.FSHLog$RingBufferEventHandler.onEvent(FSHLog.java:1857)
> {noformat}
> On the other hand, SyncRunner threads are idle and just waiting for work 
> implying that there are no pending SyncFutures that need to be run
> {noformat}
>sun.misc.Unsafe.park(Native Method)
> java.util.concurrent.locks.LockSupport.park(LockSupport.java:175)
> 
> java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.await(AbstractQueuedSynchronizer.java:2039)
> 
> java.util.concurrent.LinkedBlockingQueue.take(LinkedBlockingQueue.java:442)
> 
> org.apache.hadoop.hbase.regionserver.wal.FSHLog$SyncRunner.run(FSHLog.java:1297)
> java.lang.Thread.run(Thread.java:748)
> {noformat}
> Overall the WAL system is deadlocked and could make no progress until it was 
> aborted. I got to the bottom of this issue and have a patch that can fix it 
> (more details in the comments due to word limit in the 

[jira] [Commented] (HBASE-11408) "multiple SLF4J bindings" warning messages when running HBase shell

2021-06-18 Thread Stefan Miklosovic (Jira)


[ 
https://issues.apache.org/jira/browse/HBASE-11408?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17365607#comment-17365607
 ] 

Stefan Miklosovic commented on HBASE-11408:
---

It helps when you set this in hbase-env.sh:
{code:java}
# Tell HBase whether it should include Hadoop's lib when start up,
# the default value is false,means that includes Hadoop's lib.
export HBASE_DISABLE_HADOOP_CLASSPATH_LOOKUP="true"
{code}
This prevents Hadoop's classpath from being added to the HBase classpath.

I am not sure what having Hadoop's classpath on the HBase classpath is good for ... 
it seems to me like everything works as before.

> "multiple SLF4J bindings" warning messages when running HBase shell
> ---
>
> Key: HBASE-11408
> URL: https://issues.apache.org/jira/browse/HBASE-11408
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 0.98.2, 0.98.3
>Reporter: Duo Xu
>Priority: Minor
>
> When running hbase shell, we saw warnings like this:
> SLF4J: Class path contains multiple SLF4J bindings.
> SLF4J: Found binding in 
> [jar:file:/C:/apps/dist/hbase-0.98.0.2.1.3.0-1928-hadoop2/lib/slf4j-log4j12-1.6.4.jar!/org/slf4j/impl/StaticLoggerBinder.class]
> SLF4J: Found binding in 
> [jar:file:/C:/apps/dist/hadoop-2.4.0.2.1.3.0-1928/share/hadoop/common/lib/slf4j-log4j12-1.7.5.jar!/org/slf4j/impl/StaticLoggerBinder.class]
> SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an 
> explanation.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Comment Edited] (HBASE-11408) "multiple SLF4J bindings" warning messages when running HBase shell

2021-06-18 Thread Stefan Miklosovic (Jira)


[ 
https://issues.apache.org/jira/browse/HBASE-11408?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17365607#comment-17365607
 ] 

Stefan Miklosovic edited comment on HBASE-11408 at 6/18/21, 4:50 PM:
-

It helps when you set this in hbase-env.sh (holds for 2.2.6)
{code:java}
# Tell HBase whether it should include Hadoop's lib when start up,
# the default value is false,means that includes Hadoop's lib.
export HBASE_DISABLE_HADOOP_CLASSPATH_LOOKUP="true"
{code}
 This prevents Hadoop's classpath from being added to the HBase classpath.

I am not sure what having Hadoop's classpath on the HBase classpath is good for ... 
it seems to me like everything works as before.
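
For anyone who wants to double-check, a small standalone snippet (not part of HBase, 
just a generic Java check) can enumerate the SLF4J bindings visible on the classpath 
the shell ends up with; a single line of output means the duplicate binding is gone:
{code:java}
import java.net.URL;
import java.util.Enumeration;

// Standalone check: prints every SLF4J binding visible on the current classpath.
// Run it with the same classpath the HBase shell uses; one line of output means
// the "multiple SLF4J bindings" warning should no longer appear.
public class Slf4jBindingCheck {
  public static void main(String[] args) throws Exception {
    Enumeration<URL> bindings = Slf4jBindingCheck.class.getClassLoader()
        .getResources("org/slf4j/impl/StaticLoggerBinder.class");
    while (bindings.hasMoreElements()) {
      System.out.println(bindings.nextElement());
    }
  }
}
{code}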


was (Author: stefan.miklosovic):
It helps when you set this in hbase-env.sh:
{code:java}
# Tell HBase whether it should include Hadoop's lib when start up,
# the default value is false,means that includes Hadoop's lib.
export HBASE_DISABLE_HADOOP_CLASSPATH_LOOKUP="true"
{code}
This prevents Hadoop's classpath from being added to the HBase classpath.

I am not sure what having Hadoop's classpath on the HBase classpath is good for ... 
it seems to me like everything works as before.

> "multiple SLF4J bindings" warning messages when running HBase shell
> ---
>
> Key: HBASE-11408
> URL: https://issues.apache.org/jira/browse/HBASE-11408
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 0.98.2, 0.98.3
>Reporter: Duo Xu
>Priority: Minor
>
> When running hbase shell, we saw warnings like this:
> SLF4J: Class path contains multiple SLF4J bindings.
> SLF4J: Found binding in 
> [jar:file:/C:/apps/dist/hbase-0.98.0.2.1.3.0-1928-hadoop2/lib/slf4j-log4j12-1.6.4.jar!/org/slf4j/impl/StaticLoggerBinder.class]
> SLF4J: Found binding in 
> [jar:file:/C:/apps/dist/hadoop-2.4.0.2.1.3.0-1928/share/hadoop/common/lib/slf4j-log4j12-1.7.5.jar!/org/slf4j/impl/StaticLoggerBinder.class]
> SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an 
> explanation.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Resolved] (HBASE-26005) Update ref guide about the EOL for 2.2.x

2021-06-18 Thread Duo Zhang (Jira)


 [ 
https://issues.apache.org/jira/browse/HBASE-26005?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Duo Zhang resolved HBASE-26005.
---
Hadoop Flags: Reviewed
  Resolution: Fixed

Merged to master.

Thanks [~GeorryHuang] for contributing.

> Update ref guide about the EOL for 2.2.x
> 
>
> Key: HBASE-26005
> URL: https://issues.apache.org/jira/browse/HBASE-26005
> Project: HBase
>  Issue Type: Sub-task
>  Components: documentation
>Reporter: Duo Zhang
>Assignee: Zhuoyue Huang
>Priority: Major
> Fix For: 3.0.0-alpha-1
>
>
> For example, remove the release manager for 2.2.x, and also update the 
> compatibility matrix with hadoop, etc.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (HBASE-26005) Update ref guide about the EOL for 2.2.x

2021-06-18 Thread Duo Zhang (Jira)


 [ 
https://issues.apache.org/jira/browse/HBASE-26005?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Duo Zhang updated HBASE-26005:
--
Fix Version/s: 3.0.0-alpha-1

> Update ref guide about the EOL for 2.2.x
> 
>
> Key: HBASE-26005
> URL: https://issues.apache.org/jira/browse/HBASE-26005
> Project: HBase
>  Issue Type: Sub-task
>  Components: documentation
>Reporter: Duo Zhang
>Assignee: Zhuoyue Huang
>Priority: Major
> Fix For: 3.0.0-alpha-1
>
>
> For example, remove the release manager for 2.2.x, and also update the 
> compatibility matrix with hadoop, etc.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (HBASE-26015) Should implement getRegionServers(boolean) method in AsyncAdmin

2021-06-18 Thread Duo Zhang (Jira)
Duo Zhang created HBASE-26015:
-

 Summary: Should implement getRegionServers(boolean) method in 
AsyncAdmin
 Key: HBASE-26015
 URL: https://issues.apache.org/jira/browse/HBASE-26015
 Project: HBase
  Issue Type: Task
  Components: Admin, Client
Reporter: Duo Zhang


We have this method in Admin but not in AsyncAdmin; we should align the two 
interfaces.
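
A rough, non-authoritative sketch of how a client could approximate the missing 
overload today, assuming the existing AsyncAdmin#getRegionServers() and 
#listDecommissionedRegionServers() methods (the helper class and parameter name 
are illustrative only; the real alignment would add the overload to AsyncAdmin):
{code:java}
import java.util.Collection;
import java.util.HashSet;
import java.util.Set;
import java.util.concurrent.CompletableFuture;
import org.apache.hadoop.hbase.ServerName;
import org.apache.hadoop.hbase.client.AsyncAdmin;

// Illustrative helper only, not HBase code.
public final class AsyncAdminHelper {
  private AsyncAdminHelper() {
  }

  public static CompletableFuture<Collection<ServerName>> getRegionServers(
      AsyncAdmin admin, boolean excludeDecommissionedRS) {
    if (!excludeDecommissionedRS) {
      return admin.getRegionServers();
    }
    // Combine the full server list with the decommissioned list and subtract.
    return admin.getRegionServers().thenCombine(
      admin.listDecommissionedRegionServers(),
      (all, decommissioned) -> {
        Set<ServerName> live = new HashSet<>(all);
        live.removeAll(decommissioned);
        return live;
      });
  }
}
{code}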



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (HBASE-26014) ServiceLoader usages should not be tied to Thread Context Classloader

2021-06-18 Thread Andrei Lopukhov (Jira)


 [ 
https://issues.apache.org/jira/browse/HBASE-26014?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrei Lopukhov updated HBASE-26014:

Summary: ServiceLoader usages should not be tied to Thread Context 
Classloader  (was: ServiceLoader usages are tied to Thread Context Classloader)

> ServiceLoader usages should not be tied to Thread Context Classloader
> -
>
> Key: HBASE-26014
> URL: https://issues.apache.org/jira/browse/HBASE-26014
> Project: HBase
>  Issue Type: Improvement
>Reporter: Andrei Lopukhov
>Priority: Major
>
> Classes which use the ServiceLoader facility do not specify the ClassLoader to use.
> For hbase-client 2.4.1 they are (at least):
>  * SaslClientAuthenticationProviders
>  * MetricRegistries
> When hbase libraries are loaded dynamically and the Thread Context Classloader is 
> not set, SaslClientAuthenticationProviders instantiation fails because it 
> can't find the default providers.
> Some proposals for a classloader selection strategy (usage dependent, I guess):
>  * Use the classloader specified in the Configuration instance.
>  * Use the classloader which loaded the specific hbase class.
>  * Combine them: use the classloader from the Configuration if present and fall 
> back to the classloader which loaded the specific hbase class.
> Real world requirement example: we are currently migrating from hbase 1 to hbase 
> 2. For better compatibility and smooth migration we try to build an abstraction 
> around the hbase client libraries and isolate them with custom classloaders. To 
> work around problems with the context classloader we must either wrap all calls 
> to use a proper context classloader or explicitly trigger initialization of the 
> affected classes (SaslClientAuthenticationProviders), which are marked private, 
> under a proper context classloader. It would be better if the hbase client took 
> care of this itself.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (HBASE-26014) ServiceLoader usages are tied to Thread Context Classloader

2021-06-18 Thread Andrei Lopukhov (Jira)


 [ 
https://issues.apache.org/jira/browse/HBASE-26014?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrei Lopukhov updated HBASE-26014:

Description: 
Classes which use the ServiceLoader facility do not specify the ClassLoader to use.

For hbase-client 2.4.1 they are (at least):
 * SaslClientAuthenticationProviders
 * MetricRegistries

When hbase libraries are loaded dynamically and the Thread Context Classloader is 
not set, SaslClientAuthenticationProviders instantiation fails because it can't 
find the default providers.

Some proposals for a classloader selection strategy (usage dependent, I guess):
 * Use the classloader specified in the Configuration instance.
 * Use the classloader which loaded the specific hbase class.
 * Combine them: use the classloader from the Configuration if present and fall back 
to the classloader which loaded the specific hbase class (a sketch follows below).

Real world requirement example: we are currently migrating from hbase 1 to hbase 2. 
For better compatibility and smooth migration we try to build an abstraction 
around the hbase client libraries and isolate them with custom classloaders. To 
work around problems with the context classloader we must either wrap all calls to 
use a proper context classloader or explicitly trigger initialization of the 
affected classes (SaslClientAuthenticationProviders), which are marked private, 
under a proper context classloader. It would be better if the hbase client took 
care of this itself.
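
A minimal sketch of the combined fallback strategy from the list above, assuming a 
hypothetical helper (none of these class or method names exist in HBase today):
{code:java}
import java.util.ServiceLoader;
import org.apache.hadoop.conf.Configuration;

// Hypothetical helper, for illustration only: prefer the classloader carried by
// the Configuration, fall back to the classloader that loaded this (hbase) class,
// and never rely on the thread context classloader.
public final class ProviderLoader {
  private ProviderLoader() {
  }

  public static <T> ServiceLoader<T> load(Class<T> service, Configuration conf) {
    ClassLoader cl = conf != null ? conf.getClassLoader() : null;
    if (cl == null) {
      cl = ProviderLoader.class.getClassLoader();
    }
    return ServiceLoader.load(service, cl);
  }
}
{code}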

  was:
Classes which use the ServiceLoader facility do not specify the ClassLoader to use.

For hbase-client 2.4.1 they are (at least):
 * SaslClientAuthenticationProviders
 * MetricRegistries

When hbase libraries are loaded dynamically and the Thread Context Classloader is 
not set, SaslClientAuthenticationProviders instantiation fails because it can't 
find the default providers.

Some proposals for a classloader selection strategy (usage dependent, I guess):
 * Use the classloader specified in the Configuration instance.
 * Use the classloader which loaded the specific hbase class.
 * Combine them: use the classloader from the Configuration if present and fall back 
to the classloader which loaded the specific hbase class.

Real world example: we are currently migrating from hbase 1 to hbase 2. For better 
compatibility and smooth migration we try to build an abstraction around the hbase 
client libraries and isolate them with custom classloaders. 


> ServiceLoader usages are tied to Thread Context Classloader
> ---
>
> Key: HBASE-26014
> URL: https://issues.apache.org/jira/browse/HBASE-26014
> Project: HBase
>  Issue Type: Improvement
>Reporter: Andrei Lopukhov
>Priority: Major
>
> Classes which use the ServiceLoader facility do not specify the ClassLoader to use.
> For hbase-client 2.4.1 they are (at least):
>  * SaslClientAuthenticationProviders
>  * MetricRegistries
> When hbase libraries are loaded dynamically and the Thread Context Classloader is 
> not set, SaslClientAuthenticationProviders instantiation fails because it 
> can't find the default providers.
> Some proposals for a classloader selection strategy (usage dependent, I guess):
>  * Use the classloader specified in the Configuration instance.
>  * Use the classloader which loaded the specific hbase class.
>  * Combine them: use the classloader from the Configuration if present and fall 
> back to the classloader which loaded the specific hbase class.
> Real world requirement example: we are currently migrating from hbase 1 to hbase 
> 2. For better compatibility and smooth migration we try to build an abstraction 
> around the hbase client libraries and isolate them with custom classloaders. To 
> work around problems with the context classloader we must either wrap all calls 
> to use a proper context classloader or explicitly trigger initialization of the 
> affected classes (SaslClientAuthenticationProviders), which are marked private, 
> under a proper context classloader. It would be better if the hbase client took 
> care of this itself.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (HBASE-26014) ServiceLoader usages are tied to Thread Context Classloader

2021-06-18 Thread Andrei Lopukhov (Jira)
Andrei Lopukhov created HBASE-26014:
---

 Summary: ServiceLoader usages are tied to Thread Context 
Classloader
 Key: HBASE-26014
 URL: https://issues.apache.org/jira/browse/HBASE-26014
 Project: HBase
  Issue Type: Improvement
Reporter: Andrei Lopukhov


Classes which use the ServiceLoader facility do not specify the ClassLoader to use.

For hbase-client 2.4.1 they are (at least):
 * SaslClientAuthenticationProviders
 * MetricRegistries

When hbase libraries are loaded dynamically and the Thread Context Classloader is 
not set, SaslClientAuthenticationProviders instantiation fails because it can't 
find the default providers.

Some proposals for a classloader selection strategy (usage dependent, I guess):
 * Use the classloader specified in the Configuration instance.
 * Use the classloader which loaded the specific hbase class.
 * Combine them: use the classloader from the Configuration if present and fall back 
to the classloader which loaded the specific hbase class.

Real world example: we are currently migrating from hbase 1 to hbase 2. For better 
compatibility and smooth migration we try to build an abstraction around the hbase 
client libraries and isolate them with custom classloaders. 



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (HBASE-26013) Get operations readRows metrics becomes zero after HBASE-25677

2021-06-18 Thread Yutong Xiao (Jira)


 [ 
https://issues.apache.org/jira/browse/HBASE-26013?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yutong Xiao updated HBASE-26013:

Description: After HBASE-25677, Server+table counters on each scan are 
extracted from #nextRaw to the rsServices scan. In this case, the get operation 
will not count the read rows, so the readRows metric becomes zero. A counter 
should be added in metricsUpdateForGet.  (was: After HBASE-25677, Server+table 
counters on each scan are extracted from #nextRaw to the rsServices level scan. In 
this case, the get operation will not count the read rows, so the readRows 
metric becomes zero. A counter should be added in metricsUpdateForGet.)

> Get operations readRows metrics becomes zero after HBASE-25677
> --
>
> Key: HBASE-26013
> URL: https://issues.apache.org/jira/browse/HBASE-26013
> Project: HBase
>  Issue Type: Bug
>Reporter: Yutong Xiao
>Assignee: Yutong Xiao
>Priority: Major
>
> After HBASE-25677, Server+table counters on each scan are extracted from 
> #nextRaw to the rsServices scan. In this case, the get operation will not count 
> the read rows, so the readRows metric becomes zero. A counter should be added 
> in metricsUpdateForGet.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (HBASE-26013) Get operations readRows metrics becomes zero after HBASE-25677

2021-06-18 Thread Yutong Xiao (Jira)


 [ 
https://issues.apache.org/jira/browse/HBASE-26013?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yutong Xiao updated HBASE-26013:

Description: After HBASE-25677, Server+table counters on each scan are 
extracted from #nextRaw to the rsServices level scan. In this case, the get 
operation will not count the read rows, so the readRows metric becomes 
zero. A counter should be added in metricsUpdateForGet.  (was: 
)

> Get operations readRows metrics becomes zero after HBASE-25677
> --
>
> Key: HBASE-26013
> URL: https://issues.apache.org/jira/browse/HBASE-26013
> Project: HBase
>  Issue Type: Bug
>Reporter: Yutong Xiao
>Assignee: Yutong Xiao
>Priority: Major
>
> After HBASE-25677, Server+table counters on each scan are extracted from 
> #nextRaw to the rsServices level scan. In this case, the get operation will not 
> count the read rows, so the readRows metric becomes zero. A counter should be 
> added in metricsUpdateForGet.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (HBASE-26013) Get operations readRows metrics becomes zero after HBASE-25677

2021-06-18 Thread Yutong Xiao (Jira)


 [ 
https://issues.apache.org/jira/browse/HBASE-26013?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yutong Xiao updated HBASE-26013:

Description: 



  was:
After HBASE-25677, the method in HRegion.java 

{code:java}
void metricsUpdateForGet(List<Cell> results, long before) {
  if (this.metricsRegion != null) {
    this.metricsRegion.updateGet(EnvironmentEdgeManager.currentTime() - before);
  }
}
{code}

does not update the regionserver level metrics.  



> Get operations readRows metrics becomes zero after HBASE-25677
> --
>
> Key: HBASE-26013
> URL: https://issues.apache.org/jira/browse/HBASE-26013
> Project: HBase
>  Issue Type: Bug
>Reporter: Yutong Xiao
>Assignee: Yutong Xiao
>Priority: Major
>




--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (HBASE-26013) Get operations readRows metrics becomes zero after HBASE-25677

2021-06-18 Thread Yutong Xiao (Jira)


 [ 
https://issues.apache.org/jira/browse/HBASE-26013?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yutong Xiao updated HBASE-26013:

Description: 
After HBASE-25677, the method in HRegion.java 

{code:java}
void metricsUpdateForGet(List<Cell> results, long before) {
  if (this.metricsRegion != null) {
    this.metricsRegion.updateGet(EnvironmentEdgeManager.currentTime() - before);
  }
}
{code}

does not update the regionserver level metrics.  
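
A hedged sketch of what adding such a counter could look like; the rsServices 
field, the getMetrics() accessor and the updateReadQueryMeter call are assumptions 
about the surrounding HBase internals, and the actual patch may differ:
{code:java}
// Sketch only, not the actual fix: also bump a regionserver-level read metric for
// gets, in addition to the existing region-level update. Names outside the quoted
// snippet above are assumptions.
void metricsUpdateForGet(List<Cell> results, long before) {
  if (this.metricsRegion != null) {
    this.metricsRegion.updateGet(EnvironmentEdgeManager.currentTime() - before);
  }
  if (this.rsServices != null && this.rsServices.getMetrics() != null) {
    // Count the get as one read query against this table, mirroring the scan path.
    this.rsServices.getMetrics()
        .updateReadQueryMeter(getTableDescriptor().getTableName(), 1);
  }
}
{code}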


> Get operations readRows metrics becomes zero after HBASE-25677
> --
>
> Key: HBASE-26013
> URL: https://issues.apache.org/jira/browse/HBASE-26013
> Project: HBase
>  Issue Type: Bug
>Reporter: Yutong Xiao
>Assignee: Yutong Xiao
>Priority: Major
>
> After HBASE-25677, the method in HRegion.java 
> {code:java}
> void metricsUpdateForGet(List<Cell> results, long before) {
>   if (this.metricsRegion != null) {
>     this.metricsRegion.updateGet(EnvironmentEdgeManager.currentTime() - before);
>   }
> }
> {code}
> does not update the regionserver level metrics.  



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (HBASE-26013) Get operations readRows metrics becomes zero after HBASE-25677

2021-06-18 Thread Yutong Xiao (Jira)
Yutong Xiao created HBASE-26013:
---

 Summary: Get operations readRows metrics becomes zero after 
HBASE-25677
 Key: HBASE-26013
 URL: https://issues.apache.org/jira/browse/HBASE-26013
 Project: HBase
  Issue Type: Bug
Reporter: Yutong Xiao
Assignee: Yutong Xiao






--
This message was sent by Atlassian Jira
(v8.3.4#803005)