[jira] [Commented] (HBASE-25998) Revisit synchronization in SyncFuture

2021-06-11 Thread Michael Stack (Jira)


[ 
https://issues.apache.org/jira/browse/HBASE-25998?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17362233#comment-17362233
 ] 

Michael Stack commented on HBASE-25998:
---

The numbers look nice, [~bharathv] (not in a place to try locally – OOO). The 
patch looks good. Does your safety check pass?

> Revisit synchronization in SyncFuture
> -
>
> Key: HBASE-25998
> URL: https://issues.apache.org/jira/browse/HBASE-25998
> Project: HBase
>  Issue Type: Improvement
>  Components: Performance, regionserver, wal
>Affects Versions: 3.0.0-alpha-1, 1.7.0, 2.5.0
>Reporter: Bharath Vissapragada
>Assignee: Bharath Vissapragada
>Priority: Major
> Attachments: monitor-overhead-1.png, monitor-overhead-2.png
>
>
> While working on HBASE-25984, I noticed some weird frames in the flame graphs 
> around monitor entry/exit consuming a lot of CPU cycles (see attached 
> images). The synchronization there is too coarse-grained and sometimes 
> unnecessary. I did a simple patch that switched to reentrant-lock-based 
> synchronization with a condition variable rather than a busy wait, and 
> that showed 70-80% increased throughput in WAL PE. Seems too good to be 
> true... (more details in the comments).
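For readers skimming the digest, here is a minimal sketch of the kind of change 
the description refers to (illustrative names only, not the actual HBASE-25998 
patch): waiters park on a java.util.concurrent Condition until their 
transaction id is synced, instead of polling a shared monitor.

{noformat}
import java.util.concurrent.locks.Condition;
import java.util.concurrent.locks.ReentrantLock;

// Sketch only, not the real SyncFuture: progress is published under a
// ReentrantLock and waiters block on a Condition rather than busy-waiting
// on a synchronized monitor.
class SyncFutureSketch {
  private final ReentrantLock lock = new ReentrantLock();
  private final Condition done = lock.newCondition();
  private long syncedTxid = 0; // highest txid known to be synced

  // Writer side: record progress and wake the parked waiters.
  void markSynced(long txid) {
    lock.lock();
    try {
      syncedTxid = txid;
      done.signalAll();
    } finally {
      lock.unlock();
    }
  }

  // Reader side: park until our txid is synced; no spinning.
  void await(long txid) throws InterruptedException {
    lock.lock();
    try {
      while (syncedTxid < txid) {
        done.await();
      }
    } finally {
      lock.unlock();
    }
  }
}
{noformat}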



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (HBASE-25391) Flush directly into data directory, skip rename when committing flush

2021-06-11 Thread Michael Stack (Jira)


[ 
https://issues.apache.org/jira/browse/HBASE-25391?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17362232#comment-17362232
 ] 

Michael Stack commented on HBASE-25391:
---

[~wchevreuil] I asked for a RN because the parent was partly done, then put 
aside. It was then picked up again, so I'm trying to follow what is being added 
now.

The RN helps. Thanks.

One (too-late) comment is that the name 'PersistedEngineStoreFlusher' seems 
odd. It's a flusher – an action – but it's 'persisted', which is a final 
state... the two parts of the name argue w/ each other. Then, when would we do 
a store flusher that did not 'persist'? Should it be DirectStoreFlusherEngine 
and DirectStoreFlushContext – i.e. no intermediate (indirect) tmp file?

> Flush directly into data directory, skip rename when committing flush
> -
>
> Key: HBASE-25391
> URL: https://issues.apache.org/jira/browse/HBASE-25391
> Project: HBase
>  Issue Type: Sub-task
>Reporter: Tak-Lon (Stephen) Wu
>Assignee: Wellington Chevreuil
>Priority: Major
>
> When flushing a memstore snapshot to an HFile, we write it directly to the 
> data directory.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (HBASE-25894) Improve the performance for region load and region count related cost functions

2021-06-11 Thread Hudson (Jira)


[ 
https://issues.apache.org/jira/browse/HBASE-25894?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17362221#comment-17362221
 ] 

Hudson commented on HBASE-25894:


Results for branch branch-2.4
[build #141 on 
builds.a.o|https://ci-hadoop.apache.org/job/HBase/job/HBase%20Nightly/job/branch-2.4/141/]:
 (/) *{color:green}+1 overall{color}*

details (if available):

(/) {color:green}+1 general checks{color}
-- For more information [see general 
report|https://ci-hadoop.apache.org/job/HBase/job/HBase%20Nightly/job/branch-2.4/141/General_20Nightly_20Build_20Report/]




(/) {color:green}+1 jdk8 hadoop2 checks{color}
-- For more information [see jdk8 (hadoop2) 
report|https://ci-hadoop.apache.org/job/HBase/job/HBase%20Nightly/job/branch-2.4/141/JDK8_20Nightly_20Build_20Report_20_28Hadoop2_29/]


(/) {color:green}+1 jdk8 hadoop3 checks{color}
-- For more information [see jdk8 (hadoop3) 
report|https://ci-hadoop.apache.org/job/HBase/job/HBase%20Nightly/job/branch-2.4/141/JDK8_20Nightly_20Build_20Report_20_28Hadoop3_29/]


(/) {color:green}+1 jdk11 hadoop3 checks{color}
-- For more information [see jdk11 
report|https://ci-hadoop.apache.org/job/HBase/job/HBase%20Nightly/job/branch-2.4/141/JDK11_20Nightly_20Build_20Report_20_28Hadoop3_29/]


(/) {color:green}+1 source release artifact{color}
-- See build output for details.


(/) {color:green}+1 client integration test{color}


> Improve the performance for region load and region count related cost 
> functions
> ---
>
> Key: HBASE-25894
> URL: https://issues.apache.org/jira/browse/HBASE-25894
> Project: HBase
>  Issue Type: Sub-task
>  Components: Balancer, Performance
>Reporter: Duo Zhang
>Assignee: Duo Zhang
>Priority: Major
> Fix For: 3.0.0-alpha-1, 2.5.0
>
>
> For a large cluster, we have a lot of regions, so computing the whole cost 
> will be expensive. We should try to remove unnecessary calculations as much 
> as possible.
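As a hedged illustration of the direction (not the actual balancer patch, which 
may differ), one way to avoid recomputing the whole cost is to cache per-server 
cost terms and adjust only the two servers a candidate region move touches:

{noformat}
// Sketch only: cache each server's contribution to the cost function and
// recompute just the 'from' and 'to' entries for a candidate move instead
// of summing over every region in the cluster.
class IncrementalCostSketch {
  private final double[] costPerServer; // cached per-server cost terms
  private double total;                 // cached sum over all servers

  IncrementalCostSketch(double[] initialCosts) {
    costPerServer = initialCosts.clone();
    for (double c : costPerServer) {
      total += c;
    }
  }

  // Cost of the cluster if a region moved: O(1) instead of O(regions).
  double costAfterMove(int from, double newFromCost, int to, double newToCost) {
    return total - costPerServer[from] - costPerServer[to]
        + newFromCost + newToCost;
  }

  // Commit the move so later candidates build on the new state.
  void applyMove(int from, double newFromCost, int to, double newToCost) {
    total = costAfterMove(from, newFromCost, to, newToCost);
    costPerServer[from] = newFromCost;
    costPerServer[to] = newToCost;
  }
}
{noformat}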



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (HBASE-25947) Backport 'HBASE-25894 Improve the performance for region load and region count related cost functions' to branch-2.4 and branch-2.3

2021-06-11 Thread Hudson (Jira)


[ 
https://issues.apache.org/jira/browse/HBASE-25947?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17362220#comment-17362220
 ] 

Hudson commented on HBASE-25947:


Results for branch branch-2.4
[build #141 on 
builds.a.o|https://ci-hadoop.apache.org/job/HBase/job/HBase%20Nightly/job/branch-2.4/141/]:
 (/) *{color:green}+1 overall{color}*

details (if available):

(/) {color:green}+1 general checks{color}
-- For more information [see general 
report|https://ci-hadoop.apache.org/job/HBase/job/HBase%20Nightly/job/branch-2.4/141/General_20Nightly_20Build_20Report/]




(/) {color:green}+1 jdk8 hadoop2 checks{color}
-- For more information [see jdk8 (hadoop2) 
report|https://ci-hadoop.apache.org/job/HBase/job/HBase%20Nightly/job/branch-2.4/141/JDK8_20Nightly_20Build_20Report_20_28Hadoop2_29/]


(/) {color:green}+1 jdk8 hadoop3 checks{color}
-- For more information [see jdk8 (hadoop3) 
report|https://ci-hadoop.apache.org/job/HBase/job/HBase%20Nightly/job/branch-2.4/141/JDK8_20Nightly_20Build_20Report_20_28Hadoop3_29/]


(/) {color:green}+1 jdk11 hadoop3 checks{color}
-- For more information [see jdk11 
report|https://ci-hadoop.apache.org/job/HBase/job/HBase%20Nightly/job/branch-2.4/141/JDK11_20Nightly_20Build_20Report_20_28Hadoop3_29/]


(/) {color:green}+1 source release artifact{color}
-- See build output for details.


(/) {color:green}+1 client integration test{color}


> Backport 'HBASE-25894 Improve the performance for region load and region 
> count related cost functions' to branch-2.4 and branch-2.3
> ---
>
> Key: HBASE-25947
> URL: https://issues.apache.org/jira/browse/HBASE-25947
> Project: HBase
>  Issue Type: Sub-task
>  Components: Balancer, Performance
>Reporter: Duo Zhang
>Assignee: Duo Zhang
>Priority: Major
> Fix For: 2.3.6, 2.4.5
>
>




--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (HBASE-25993) Make excluded SSL cipher suites configurable for all Web UIs

2021-06-11 Thread Hudson (Jira)


[ 
https://issues.apache.org/jira/browse/HBASE-25993?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17362219#comment-17362219
 ] 

Hudson commented on HBASE-25993:


Results for branch branch-2.4
[build #141 on 
builds.a.o|https://ci-hadoop.apache.org/job/HBase/job/HBase%20Nightly/job/branch-2.4/141/]:
 (/) *{color:green}+1 overall{color}*

details (if available):

(/) {color:green}+1 general checks{color}
-- For more information [see general 
report|https://ci-hadoop.apache.org/job/HBase/job/HBase%20Nightly/job/branch-2.4/141/General_20Nightly_20Build_20Report/]




(/) {color:green}+1 jdk8 hadoop2 checks{color}
-- For more information [see jdk8 (hadoop2) 
report|https://ci-hadoop.apache.org/job/HBase/job/HBase%20Nightly/job/branch-2.4/141/JDK8_20Nightly_20Build_20Report_20_28Hadoop2_29/]


(/) {color:green}+1 jdk8 hadoop3 checks{color}
-- For more information [see jdk8 (hadoop3) 
report|https://ci-hadoop.apache.org/job/HBase/job/HBase%20Nightly/job/branch-2.4/141/JDK8_20Nightly_20Build_20Report_20_28Hadoop3_29/]


(/) {color:green}+1 jdk11 hadoop3 checks{color}
-- For more information [see jdk11 
report|https://ci-hadoop.apache.org/job/HBase/job/HBase%20Nightly/job/branch-2.4/141/JDK11_20Nightly_20Build_20Report_20_28Hadoop3_29/]


(/) {color:green}+1 source release artifact{color}
-- See build output for details.


(/) {color:green}+1 client integration test{color}


> Make excluded SSL cipher suites configurable for all Web UIs
> 
>
> Key: HBASE-25993
> URL: https://issues.apache.org/jira/browse/HBASE-25993
> Project: HBase
>  Issue Type: Improvement
>Affects Versions: 3.0.0-alpha-1, 2.2.7, 2.5.0, 2.3.5, 2.4.4
>Reporter: Mate Szalay-Beko
>Assignee: Mate Szalay-Beko
>Priority: Major
> Fix For: 3.0.0-alpha-1, 2.5.0, 2.3.6, 2.4.5
>
>
> When starting a Jetty HTTP server, one can explicitly exclude certain 
> (insecure) SSL cipher suites. This can be especially important when the 
> HBase cluster needs to be compliant with security regulations (e.g. FIPS).
> Currently it is possible to set the excluded ciphers for the ThriftServer 
> ("hbase.thrift.ssl.exclude.cipher.suites") or for the RestServer 
> ("hbase.rest.ssl.exclude.cipher.suites"), but one cannot configure it for 
> the regular InfoServer started by e.g. the master or region servers.
> In this commit I want to introduce a new configuration, 
> "ssl.server.exclude.cipher.list", to configure the excluded cipher suites for 
> the HTTP server started by the InfoServer. This parameter has the same name 
> and works the same way as the one already implemented in Hadoop (e.g. 
> for hdfs/yarn). See: HADOOP-12668, HADOOP-14341
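For operators, the new setting would be wired in hbase-site.xml roughly as 
below (the cipher suite names are illustrative examples of weak suites, not a 
recommendation from this issue):

{noformat}
<property>
  <name>ssl.server.exclude.cipher.list</name>
  <value>TLS_ECDHE_RSA_WITH_RC4_128_SHA,SSL_RSA_WITH_RC4_128_MD5</value>
</property>
{noformat}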



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Comment Edited] (HBASE-25975) Row commit sequencer

2021-06-11 Thread Andrew Kyle Purtell (Jira)


[ 
https://issues.apache.org/jira/browse/HBASE-25975?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17362215#comment-17362215
 ] 

Andrew Kyle Purtell edited comment on HBASE-25975 at 6/12/21, 1:50 AM:
---

The microbenchmark is going to be very helpful. Right now I have it hacked into 
TestHRegion but will move it out. See this gist: 
[link|https://gist.github.com/apurtell/eb1122f74b0a9f0305f0c8c575b2fc21]

As of [c6d2a11b|https://github.com/apache/hbase/pull/3360/commits/c6d2a11b], 
performance is better, especially as the number of rows processed in the 
previous tick increases, simply by allocating a new CSLS for the next tick 
rather than clear()ing the old one.
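A minimal sketch of that swap, assuming CSLS here means 
java.util.concurrent.ConcurrentSkipListSet (illustrative only, not the actual 
commit):

{noformat}
import java.util.concurrent.ConcurrentSkipListSet;
import org.apache.hadoop.hbase.util.Bytes;

// Sketch only: clear() unlinks every entry under concurrency, so each
// clock tick instead publishes a fresh empty set and lets the previous
// (possibly large) one be garbage collected.
class TickRowsSketch {
  private volatile ConcurrentSkipListSet<byte[]> rowsInTick =
      new ConcurrentSkipListSet<>(Bytes.BYTES_COMPARATOR);

  // Returns false if another pending mutation already claimed the row.
  boolean tryReserve(byte[] row) {
    return rowsInTick.add(row);
  }

  // Called when the clock ticks over: swap, don't clear().
  void advanceTick() {
    rowsInTick = new ConcurrentSkipListSet<>(Bytes.BYTES_COMPARATOR);
  }
}
{noformat}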

Etc.
{noformat}
  1 threads    1 non-contended rows 100 iterations in  248262978 ns  (2.482629 ms/op)
  2 threads    1 non-contended rows 100 iterations in  119657896 ns  (1.196578 ms/op)
  4 threads    1 non-contended rows 100 iterations in  117589133 ns  (1.175891 ms/op)
  8 threads    1 non-contended rows 100 iterations in  127482269 ns  (1.274822 ms/op)
 16 threads    1 non-contended rows 100 iterations in  120375922 ns  (1.203759 ms/op)
 32 threads    1 non-contended rows 100 iterations in  117154493 ns  (1.171544 ms/op)
  1 threads   10 non-contended rows 100 iterations in  123248732 ns  (1.232487 ms/op)
  2 threads   10 non-contended rows 100 iterations in  122647177 ns  (1.226471 ms/op)
  4 threads   10 non-contended rows 100 iterations in  127126968 ns  (1.271269 ms/op)
  8 threads   10 non-contended rows 100 iterations in  133759033 ns  (1.337590 ms/op)
 16 threads   10 non-contended rows 100 iterations in  133973857 ns  (1.339738 ms/op)
 32 threads   10 non-contended rows 100 iterations in  126716770 ns  (1.267167 ms/op)
  1 threads  100 non-contended rows 100 iterations in  127032261 ns  (1.270322 ms/op)
  2 threads  100 non-contended rows 100 iterations in  128259658 ns  (1.282596 ms/op)
  4 threads  100 non-contended rows 100 iterations in  120013005 ns  (1.200130 ms/op)
  8 threads  100 non-contended rows 100 iterations in  126168665 ns  (1.261686 ms/op)
 16 threads  100 non-contended rows 100 iterations in  138842281 ns  (1.388422 ms/op)
 32 threads  100 non-contended rows 100 iterations in  266622073 ns  (2.666220 ms/op)
  1 threads 1000 non-contended rows 100 iterations in  224824016 ns  (2.248240 ms/op)
  2 threads 1000 non-contended rows 100 iterations in  276253087 ns  (2.762530 ms/op)
  4 threads 1000 non-contended rows 100 iterations in  373552155 ns  (3.735521 ms/op)
  8 threads 1000 non-contended rows 100 iterations in  622022490 ns  (6.220224 ms/op)
 16 threads 1000 non-contended rows 100 iterations in 1289010748 ns (12.890107 ms/op)
 32 threads 1000 non-contended rows 100 iterations in 2449270127 ns (24.492701 ms/op)

  1 threads    1 contended rows 100 iterations in  119867953 ns  (1.198679 ms/op)
  2 threads    1 contended rows 100 iterations in  225605406 ns  (2.256054 ms/op)
  4 threads    1 contended rows 100 iterations in  427749326 ns  (4.277493 ms/op)
  8 threads    1 contended rows 100 iterations in  776111781 ns  (7.761117 ms/op)
 16 threads    1 contended rows 100 iterations in 1638138512 ns (16.381385 ms/op)
 32 threads    1 contended rows 100 iterations in 3221263267 ns (32.212632 ms/op)
  1 threads   10 contended rows 100 iterations in  122263470 ns  (1.222634 ms/op)
  2 threads   10 contended rows 100 iterations in  225890471 ns  (2.258904 ms/op)
  4 threads   10 contended rows 100 iterations in  423801468 ns  (4.238014 ms/op)
  8 threads   10 contended rows 100 iterations in  819573522 ns  (8.195735 ms/op)
 16 threads   10 contended rows 100 iterations in 1604154859 ns (16.041548 ms/op)
 32 threads   10 contended rows 100 iterations in 3127778875 ns (31.277788 ms/op)
  1 threads  100 contended rows 100 iterations in  116046683 ns  (1.160466 ms/op)
  2 threads  100 contended rows 100 iterations in  215477979 ns  (2.154779 ms/op)
  4 threads  100 contended rows 100 iterations in  411627258 ns  (4.116272 ms/op)
  8 threads  100 contended rows 100 iterations in  806653481 ns  (8.066534 ms/op)
 16 threads  100 contended rows 100 iterations in 1600262862 ns (16.002628 ms/op)
 32 threads  100 contended rows 100 iterations in 3179850096 ns (31.798500 ms/op)
  1 threads 1000 contended rows 100 iterations in  231174490 ns  (2.311744 ms/op)
  2 threads 1000 contended rows 100 iterations in  294631204 ns  (2.946312 ms/op)
  4 threads 1000 contended rows 100 iterations in  513858509 ns  (5.138585 ms/op)
  8 threads 1000 contended rows 100 iterations in  886817867 ns  (8.868178 ms/op)
 16 threads 1000 contended rows 100 iterations in 1745257920 ns (17.452579 ms/op)
 32 threads 1000 contended rows 100 iterations in 3404472773 ns (34.044727 ms/op)
{noformat}
 


was (Author: apurtell):
The microbenchmark is going to be very helpful. Right 

[jira] [Commented] (HBASE-25975) Row commit sequencer

2021-06-11 Thread Andrew Kyle Purtell (Jira)


[ 
https://issues.apache.org/jira/browse/HBASE-25975?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17362215#comment-17362215
 ] 

Andrew Kyle Purtell commented on HBASE-25975:
-

The microbenchmark is going to be very helpful. Right now I have it hacked into 
TestHRegion but will move it out. See this gist: 
[link|https://gist.github.com/apurtell/eb1122f74b0a9f0305f0c8c575b2fc21]

As of [c6d2a11b|https://github.com/apache/hbase/pull/3360/commits/c6d2a11b], 
performance is better, especially as the number of rows processed in the 
previous tick increases, simply by allocating a new CSLS for the next tick 
rather than clear()ing the old one.

Etc.
{noformat}
  1 threads    1 non-contended rows 100 iterations in  248262978 ns  (2.482629 ms/op)
  2 threads    1 non-contended rows 100 iterations in  119657896 ns  (1.196578 ms/op)
  4 threads    1 non-contended rows 100 iterations in  117589133 ns  (1.175891 ms/op)
  8 threads    1 non-contended rows 100 iterations in  127482269 ns  (1.274822 ms/op)
 16 threads    1 non-contended rows 100 iterations in  120375922 ns  (1.203759 ms/op)
 32 threads    1 non-contended rows 100 iterations in  117154493 ns  (1.171544 ms/op)
  1 threads   10 non-contended rows 100 iterations in  123248732 ns  (1.232487 ms/op)
  2 threads   10 non-contended rows 100 iterations in  122647177 ns  (1.226471 ms/op)
  4 threads   10 non-contended rows 100 iterations in  127126968 ns  (1.271269 ms/op)
  8 threads   10 non-contended rows 100 iterations in  133759033 ns  (1.337590 ms/op)
 16 threads   10 non-contended rows 100 iterations in  133973857 ns  (1.339738 ms/op)
 32 threads   10 non-contended rows 100 iterations in  126716770 ns  (1.267167 ms/op)
  1 threads  100 non-contended rows 100 iterations in  127032261 ns  (1.270322 ms/op)
  2 threads  100 non-contended rows 100 iterations in  128259658 ns  (1.282596 ms/op)
  4 threads  100 non-contended rows 100 iterations in  120013005 ns  (1.200130 ms/op)
  8 threads  100 non-contended rows 100 iterations in  126168665 ns  (1.261686 ms/op)
 16 threads  100 non-contended rows 100 iterations in  138842281 ns  (1.388422 ms/op)
 32 threads  100 non-contended rows 100 iterations in  266622073 ns  (2.666220 ms/op)
  1 threads 1000 non-contended rows 100 iterations in  224824016 ns  (2.248240 ms/op)
  2 threads 1000 non-contended rows 100 iterations in  276253087 ns  (2.762530 ms/op)
  4 threads 1000 non-contended rows 100 iterations in  373552155 ns  (3.735521 ms/op)
  8 threads 1000 non-contended rows 100 iterations in  622022490 ns  (6.220224 ms/op)
 16 threads 1000 non-contended rows 100 iterations in 1289010748 ns (12.890107 ms/op)
 32 threads 1000 non-contended rows 100 iterations in 2449270127 ns (24.492701 ms/op)

  1 threads    1 contended rows 100 iterations in  119867953 ns  (1.198679 ms/op)
  2 threads    1 contended rows 100 iterations in  225605406 ns  (2.256054 ms/op)
  4 threads    1 contended rows 100 iterations in  427749326 ns  (4.277493 ms/op)
  8 threads    1 contended rows 100 iterations in  776111781 ns  (7.761117 ms/op)
 16 threads    1 contended rows 100 iterations in 1638138512 ns (16.381385 ms/op)
 32 threads    1 contended rows 100 iterations in 3221263267 ns (32.212632 ms/op)
  1 threads   10 contended rows 100 iterations in  122263470 ns  (1.222634 ms/op)
  2 threads   10 contended rows 100 iterations in  225890471 ns  (2.258904 ms/op)
  4 threads   10 contended rows 100 iterations in  423801468 ns  (4.238014 ms/op)
  8 threads   10 contended rows 100 iterations in  819573522 ns  (8.195735 ms/op)
 16 threads   10 contended rows 100 iterations in 1604154859 ns (16.041548 ms/op)
 32 threads   10 contended rows 100 iterations in 3127778875 ns (31.277788 ms/op)
  1 threads  100 contended rows 100 iterations in  116046683 ns  (1.160466 ms/op)
  2 threads  100 contended rows 100 iterations in  215477979 ns  (2.154779 ms/op)
  4 threads  100 contended rows 100 iterations in  411627258 ns  (4.116272 ms/op)
  8 threads  100 contended rows 100 iterations in  806653481 ns  (8.066534 ms/op)
 16 threads  100 contended rows 100 iterations in 1600262862 ns (16.002628 ms/op)
 32 threads  100 contended rows 100 iterations in 3179850096 ns (31.798500 ms/op)
  1 threads 1000 contended rows 100 iterations in  231174490 ns  (2.311744 ms/op)
  2 threads 1000 contended rows 100 iterations in  294631204 ns  (2.946312 ms/op)
  4 threads 1000 contended rows 100 iterations in  513858509 ns  (5.138585 ms/op)
  8 threads 1000 contended rows 100 iterations in  886817867 ns  (8.868178 ms/op)
 16 threads 1000 contended rows 100 iterations in 1745257920 ns (17.452579 ms/op)
 32 threads 1000 contended rows 100 iterations in 3404472773 ns (34.044727 ms/op)
{noformat}
 

> Row commit sequencer
> 
>
> Key: HBASE-25975
> URL: 

[jira] [Comment Edited] (HBASE-25975) Row commit sequencer

2021-06-11 Thread Andrew Kyle Purtell (Jira)


[ 
https://issues.apache.org/jira/browse/HBASE-25975?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17362206#comment-17362206
 ] 

Andrew Kyle Purtell edited comment on HBASE-25975 at 6/12/21, 1:48 AM:
---

As of 
[c4cf83ce|https://github.com/apache/hbase/pull/3360/commits/c4cf83ce469a029c6d37ecfe4e39a006bcc7376e]
 I have something that passes TestHRegion and at least the first group of 
hbase-server tests (Tests run: 1083, Failures: 0, Errors: 0, Skipped: 4). I got 
bored and hit ^C after a couple of hours somewhere in the second group. 

Attached as [^HBASE-25975-c4cf83ce.pdf] is a simple benchmark of the 
implementation as of this commit. The test creates an HRegion, constructs the 
requisite number of Puts for the test iteration, starts up the desired number 
of threads waiting on a countdown latch, then releases the latch. The threads 
submit their batch of Puts as quickly as possible, 100 times. Between each run 
the region is flushed and compacted. The total execution time is measured from 
latch release until the last thread terminates. Contended and uncontended cases 
are measured. An average time in milliseconds per operation is calculated by 
dividing the running time by the number of operations each thread attempts in 
parallel. Overhead is calculated by taking the measurements from the 0% 
contention case and subtracting a baseline measurement of the same activity 
without any changes applied. It is encouraging that even at this early stage 
for most cases the measured overhead is less than 1 millisecond per operation. 
Where that was not the case is highlighted by numbers in red. The overhead is 
flat with respect to number of active real CPU threads and appears to amortize 
over batches. There are obvious optimization opportunities for the large batch 
high contention cases, though. 


was (Author: apurtell):
As of 
[c4cf83ce|https://github.com/apache/hbase/pull/3360/commits/c4cf83ce469a029c6d37ecfe4e39a006bcc7376e]
 I have something that passes TestHRegion and at least the first group of 
hbase-server tests (Tests run: 1083, Failures: 0, Errors: 0, Skipped: 4). I got 
bored and hit ^C after a couple of hours somewhere in the second group. 

Attached as [^HBASE-25975-c4cf83ce.pdf] is a simple benchmark of the 
implementation as of this commit. The test creates an HRegion, constructs the 
requisite number of Puts for the test iteration, starts up the desired number 
of threads waiting on a countdown latch, then releases the latch. The threads 
submit their batch of Puts as quickly as possible, 1000 times. Between each run 
the region is flushed and compacted. The total execution time is measured from 
latch release until the last thread terminates. Contended and uncontended cases 
are measured. An average time in milliseconds per operation is calculated by 
dividing the running time by the number of operations each thread attempts in 
parallel. Overhead is calculated by taking the measurements from the 0% 
contention case and subtracting a baseline measurement of the same activity 
without any changes applied. It is encouraging that even at this early stage 
for most cases the measured overhead is less than 1 millisecond per operation. 
Where that was not the case is highlighted by numbers in red. The overhead is 
flat with respect to number of active real CPU threads and appears to amortize 
over batches. There are obvious optimization opportunities for the large batch 
high contention cases, though. 

> Row commit sequencer
> 
>
> Key: HBASE-25975
> URL: https://issues.apache.org/jira/browse/HBASE-25975
> Project: HBase
>  Issue Type: Sub-task
>  Components: regionserver
>Reporter: Andrew Kyle Purtell
>Assignee: Andrew Kyle Purtell
>Priority: Major
> Fix For: 3.0.0-alpha-1, 2.5.0
>
> Attachments: HBASE-25975-c4cf83ce.pdf
>
>
> Use a row commit sequencer in HRegion to ensure that only the operations that 
> mutate disjoint sets of rows are able to commit within the same clock tick. 
> This maintains the invariant that more than one mutation to a given row will 
> never be committed in the same clock tick.
> Callers will first acquire row locks for the row(s) the pending mutation will 
> mutate. Then they will use RowCommitSequencer.getRowSequence to ensure that 
> the set of rows about to be mutated do not overlap with those for any other 
> pending mutations in the current clock tick. If an overlap is identified, 
> getRowSequence will yield and loop until there is no longer an overlap and 
> the caller's pending mutation can succeed.
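Reading the description above, a hedged sketch of the admission loop might look 
like this (names taken from the description; the real code is in PR #3360, and 
this is not it):

{noformat}
import java.util.ArrayList;
import java.util.List;
import java.util.Set;
import java.util.concurrent.ConcurrentSkipListSet;
import org.apache.hadoop.hbase.util.Bytes;

// Sketch only: admit a mutation when none of its rows are already pending
// in the current clock tick; otherwise back out and retry. A tick rollover
// (not shown) would discard pendingRows wholesale.
class RowCommitSequencerSketch {
  private final Set<byte[]> pendingRows =
      new ConcurrentSkipListSet<>(Bytes.BYTES_COMPARATOR);

  void getRowSequence(Set<byte[]> rows) {
    while (true) {
      List<byte[]> reserved = new ArrayList<>();
      boolean overlap = false;
      for (byte[] row : rows) {
        if (pendingRows.add(row)) {
          reserved.add(row);
        } else {
          overlap = true; // row already pending in this tick
          break;
        }
      }
      if (!overlap) {
        return; // disjoint row set: safe to commit within this tick
      }
      reserved.forEach(pendingRows::remove); // back out our reservations
      Thread.yield();                        // loop until the overlap clears
    }
  }
}
{noformat}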



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Comment Edited] (HBASE-25975) Row commit sequencer

2021-06-11 Thread Andrew Kyle Purtell (Jira)


[ 
https://issues.apache.org/jira/browse/HBASE-25975?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17362206#comment-17362206
 ] 

Andrew Kyle Purtell edited comment on HBASE-25975 at 6/12/21, 1:06 AM:
---

As of 
[c4cf83ce|https://github.com/apache/hbase/pull/3360/commits/c4cf83ce469a029c6d37ecfe4e39a006bcc7376e]
 I have something that passes TestHRegion and at least the first group of 
hbase-server tests (Tests run: 1083, Failures: 0, Errors: 0, Skipped: 4). I got 
bored and hit ^C after a couple of hours somewhere in the second group. 

Attached as [^HBASE-25975-c4cf83ce.pdf] is a simple benchmark of the 
implementation as of this commit. The test creates an HRegion, constructs the 
requisite number of Puts for the test iteration, starts up the desired number 
of threads waiting on a countdown latch, then releases the latch. The threads 
submit their batch of Puts as quickly as possible, 1000 times. Between each run 
the region is flushed and compacted. The total execution time is measured from 
latch release until the last thread terminates. Contended and uncontended cases 
are measured. An average time in milliseconds per operation is calculated by 
dividing the running time by the number of operations each thread attempts in 
parallel. Overhead is calculated by taking the measurements from the 0% 
contention case and subtracting a baseline measurement of the same activity 
without any changes applied. It is encouraging that even at this early stage 
for most cases the measured overhead is less than 1 millisecond per operation. 
Where that was not the case is highlighted by numbers in red. The overhead is 
flat with respect to number of active real CPU threads and appears to amortize 
over batches. There are obvious optimization opportunities for the large batch 
high contention cases, though. 


was (Author: apurtell):
As of 
[c4cf83ce|https://github.com/apache/hbase/pull/3360/commits/c4cf83ce469a029c6d37ecfe4e39a006bcc7376e]
 I have something that passes TestHRegion and at least the first group of 
hbase-server tests (Tests run: 1083, Failures: 0, Errors: 0, Skipped: 4). I got 
bored and hit ^C after a couple of hours somewhere in the second group. 

Attached as [^HBASE-25975-c4cf83ce.pdf] is a simple benchmark of the 
implementation as of this commit. The test creates an HRegion, constructs the 
requisite number of Puts, starts up the desired number of threads waiting on a 
countdown latch, then releases the latch. The threads submit their batch of 
Puts as quickly as possible, 1000 times. The total execution time is measured 
from latch release until the last thread terminates. An average time in 
milliseconds per operation is calculated by dividing the running time by the 
number of operations each thread attempts in parallel. Overhead is calculated 
by taking the measurements from the 0% contention case and subtracting a 
baseline measurement of the same activity without any changes applied. It is 
encouraging that even at this early stage for most cases the measured overhead 
is less than 1 millisecond per operation. Where that was not the case is 
highlighted by numbers in red. The overhead is flat with respect to number of 
active real CPU threads and appears to amortize over batches. There are obvious 
optimization opportunities for the large batch high contention cases, though. 

> Row commit sequencer
> 
>
> Key: HBASE-25975
> URL: https://issues.apache.org/jira/browse/HBASE-25975
> Project: HBase
>  Issue Type: Sub-task
>  Components: regionserver
>Reporter: Andrew Kyle Purtell
>Assignee: Andrew Kyle Purtell
>Priority: Major
> Fix For: 3.0.0-alpha-1, 2.5.0
>
> Attachments: HBASE-25975-c4cf83ce.pdf
>
>
> Use a row commit sequencer in HRegion to ensure that only the operations that 
> mutate disjoint sets of rows are able to commit within the same clock tick. 
> This maintains the invariant that more than one mutation to a given row will 
> never be committed in the same clock tick.
> Callers will first acquire row locks for the row(s) the pending mutation will 
> mutate. Then they will use RowCommitSequencer.getRowSequence to ensure that 
> the set of rows about to be mutated do not overlap with those for any other 
> pending mutations in the current clock tick. If an overlap is identified, 
> getRowSequence will yield and loop until there is no longer an overlap and 
> the caller's pending mutation can succeed.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (HBASE-25975) Row commit sequencer

2021-06-11 Thread Andrew Kyle Purtell (Jira)


[ 
https://issues.apache.org/jira/browse/HBASE-25975?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17362206#comment-17362206
 ] 

Andrew Kyle Purtell commented on HBASE-25975:
-

As of 
[c4cf83ce|https://github.com/apache/hbase/pull/3360/commits/c4cf83ce469a029c6d37ecfe4e39a006bcc7376e]
 I have something that passes TestHRegion and at least the first group of 
hbase-server tests (Tests run: 1083, Failures: 0, Errors: 0, Skipped: 4). I got 
bored and hit ^C after a couple of hours somewhere in the second group. 

Attached as [^HBASE-25975-c4cf83ce.pdf] is a simple benchmark of the 
implementation as of this commit. The test creates an HRegion, constructs the 
requisite number of Puts, starts up the desired number of threads waiting on a 
countdown latch, then releases the latch. The threads submit their batch of 
Puts as quickly as possible, 1000 times. The total execution time is measured 
from latch release until the last thread terminates. An average time in 
milliseconds per operation is calculated by dividing the running time by the 
number of operations each thread attempts in parallel. Overhead is calculated 
by taking the measurements from the 0% contention case and subtracting a 
baseline measurement of the same activity without any changes applied. It is 
encouraging that even at this early stage for most cases the measured overhead 
is less than 1 millisecond per operation. Where that was not the case is 
highlighted by numbers in red. The overhead is flat with respect to number of 
active real CPU threads and appears to amortize over batches. There are obvious 
optimization opportunities for the large batch high contention cases, though. 
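The harness shape described above can be sketched as follows (assumed structure 
only; the actual benchmark is the TestHRegion hack referenced in the later 
comments):

{noformat}
import java.util.concurrent.CountDownLatch;

// Sketch only: N threads block on a start latch; total execution time is
// measured from latch release until the last thread terminates.
final class LatchBenchSketch {
  static long runNanos(int threads, int iterations, Runnable batchOfPuts)
      throws InterruptedException {
    CountDownLatch start = new CountDownLatch(1);
    CountDownLatch done = new CountDownLatch(threads);
    for (int i = 0; i < threads; i++) {
      new Thread(() -> {
        try {
          start.await();                 // all threads released together
          for (int n = 0; n < iterations; n++) {
            batchOfPuts.run();           // submit the batch of Puts
          }
        } catch (InterruptedException e) {
          Thread.currentThread().interrupt();
        } finally {
          done.countDown();
        }
      }).start();
    }
    long t0 = System.nanoTime();
    start.countDown();                   // release the latch, start the clock
    done.await();                        // wait for the last thread to finish
    return System.nanoTime() - t0;
  }
}
{noformat}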

> Row commit sequencer
> 
>
> Key: HBASE-25975
> URL: https://issues.apache.org/jira/browse/HBASE-25975
> Project: HBase
>  Issue Type: Sub-task
>  Components: regionserver
>Reporter: Andrew Kyle Purtell
>Assignee: Andrew Kyle Purtell
>Priority: Major
> Fix For: 3.0.0-alpha-1, 2.5.0
>
> Attachments: HBASE-25975-c4cf83ce.pdf
>
>
> Use a row commit sequencer in HRegion to ensure that only the operations that 
> mutate disjoint sets of rows are able to commit within the same clock tick. 
> This maintains the invariant that more than one mutation to a given row will 
> never be committed in the same clock tick.
> Callers will first acquire row locks for the row(s) the pending mutation will 
> mutate. Then they will use RowCommitSequencer.getRowSequence to ensure that 
> the set of rows about to be mutated do not overlap with those for any other 
> pending mutations in the current clock tick. If an overlap is identified, 
> getRowSequence will yield and loop until there is no longer an overlap and 
> the caller's pending mutation can succeed.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (HBASE-25975) Row commit sequencer

2021-06-11 Thread Andrew Kyle Purtell (Jira)


 [ 
https://issues.apache.org/jira/browse/HBASE-25975?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrew Kyle Purtell updated HBASE-25975:

Attachment: HBASE-25975-c4cf83ce.pdf

> Row commit sequencer
> 
>
> Key: HBASE-25975
> URL: https://issues.apache.org/jira/browse/HBASE-25975
> Project: HBase
>  Issue Type: Sub-task
>  Components: regionserver
>Reporter: Andrew Kyle Purtell
>Assignee: Andrew Kyle Purtell
>Priority: Major
> Fix For: 3.0.0-alpha-1, 2.5.0
>
> Attachments: HBASE-25975-c4cf83ce.pdf
>
>
> Use a row commit sequencer in HRegion to ensure that only the operations that 
> mutate disjoint sets of rows are able to commit within the same clock tick. 
> This maintains the invariant that more than one mutation to a given row will 
> never be committed in the same clock tick.
> Callers will first acquire row locks for the row(s) the pending mutation will 
> mutate. Then they will use RowCommitSequencer.getRowSequence to ensure that 
> the set of rows about to be mutated do not overlap with those for any other 
> pending mutations in the current clock tick. If an overlap is identified, 
> getRowSequence will yield and loop until there is no longer an overlap and 
> the caller's pending mutation can succeed.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (HBASE-25391) Flush directly into data directory, skip rename when committing flush

2021-06-11 Thread Wellington Chevreuil (Jira)


 [ 
https://issues.apache.org/jira/browse/HBASE-25391?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wellington Chevreuil updated HBASE-25391:
-
Release Note: 
This adds a new StoreFlushContext implementation, PersistedStoreFlushContext, 
together with a new StoreFlusher implementation, PersistedEngineStoreFlusher. 
As the memstore flush process actually comprises an initial file creation, 
followed by the actual write and finally a "commit" action, these two should 
always be used together to guarantee a successful flush that doesn't involve 
creating temp files that get renamed later at the "commit" stage. By setting 
PersistedEngineStoreFlusher as the StoreFlusher implementation in the 
"hbase.hstore.defaultengine.storeflusher.class" configuration, memstore flushes 
will create the resulting hfile directly in the store dir, instead of using a 
temp dir. Complementing the flush, PersistedStoreFlushContext, configured via 
hbase.regionserver.store.flush.context.class, assumes committed store files 
were written directly in the store dir and therefore doesn't perform a rename 
from the tmp dir into the store dir.

Note: This requires specific StoreEngine and StoreFileManager implementations 
capable of distinguishing committed from non-committed files, like the ones 
implemented by HBASE-25395.
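For reference, the two settings named in the release note would be wired 
together in hbase-site.xml roughly as below (package prefixes elided, since the 
fully qualified class names are not given here):

{noformat}
<property>
  <name>hbase.hstore.defaultengine.storeflusher.class</name>
  <value>...PersistedEngineStoreFlusher</value>
</property>
<property>
  <name>hbase.regionserver.store.flush.context.class</name>
  <value>...PersistedStoreFlushContext</value>
</property>
{noformat}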

  was:
This adds a new StoreFlushContext implementation, PersistedStoreFlushContext, 
together with a new StoreFlusher implementation, PersistedEngineStoreFlusher. 
As the memstore flush process is actually comprised of an initial file 
creations, followed by the actual write and finally a "commit" action, these 
two should be always be used together to guarantee a successful flush that 
doesn't involve creating temp files that get renamed later at the "commit" 
stage. By setting PersistedEngineStoreFlusher as the StoreFlusher 
implementation at "hbase.hstore.defaultengine.storeflusher.class" 
configuration, memstore flushes will create the resulting hfile directly in the 
store dir, instead of using a temp dir. Complementing the flush, 
PersistedStoreFlushContext, configured by 
hbase.regionserver.store.flush.context.class, assumes committed store files 
were written directly in the store dir, and therefore, doesn't perform a rename 
from tmp dir into the store dir.

Note: This requires 


> Flush directly into data directory, skip rename when committing flush
> -
>
> Key: HBASE-25391
> URL: https://issues.apache.org/jira/browse/HBASE-25391
> Project: HBase
>  Issue Type: Sub-task
>Reporter: Tak-Lon (Stephen) Wu
>Assignee: Wellington Chevreuil
>Priority: Major
>
> When flushing a memstore snapshot to an HFile, we write it directly to the 
> data directory.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (HBASE-25391) Flush directly into data directory, skip rename when committing flush

2021-06-11 Thread Wellington Chevreuil (Jira)


 [ 
https://issues.apache.org/jira/browse/HBASE-25391?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wellington Chevreuil updated HBASE-25391:
-
Release Note: 
This adds a new StoreFlushContext implementation, PersistedStoreFlushContext, 
together with a new StoreFlusher implementation, PersistedEngineStoreFlusher. 
As the memstore flush process is actually comprised of an initial file 
creations, followed by the actual write and finally a "commit" action, these 
two should be always be used together to guarantee a successful flush that 
doesn't involve creating temp files that get renamed later at the "commit" 
stage. By setting PersistedEngineStoreFlusher as the StoreFlusher 
implementation at "hbase.hstore.defaultengine.storeflusher.class" 
configuration, memstore flushes will create the resulting hfile directly in the 
store dir, instead of using a temp dir. Complementing the flush, 
PersistedStoreFlushContext, configured by 
hbase.regionserver.store.flush.context.class, assumes committed store files 
were written directly in the store dir, and therefore, doesn't perform a rename 
from tmp dir into the store dir.

Note: This requires 

  was:This adds a new StoreFlushContext implementation, 
PersistedStoreFlushContext, together with a new StoreFlusher implementation, 
PersistedEngineStoreFlusher. As the memstore flush process is actually 
comprised of an initial file creations, followed by the actual write and 
finally a "commit" action, these two should be always be used together to 
guarantee a successful flush that doesn't involve creating temp files that get 
renamed later at the "commit" stage. By setting PersistedEngineStoreFlusher as 
the StoreFlusher implementation at 
"hbase.hstore.defaultengine.storeflusher.class" configuration, memstore flushes 
will create the resulting hfile directly in the store dir, instead of using a 
temp dir. Complementing the flush, PersistedStoreFlushContext, configured by 
hbase.regionserver.store.flush.context.class, assumes committed store files 
were written directly in the store dir, and therefore, doesn't perform a rename 
from tmp dir into the store dir.


> Flush directly into data directory, skip rename when committing flush
> -
>
> Key: HBASE-25391
> URL: https://issues.apache.org/jira/browse/HBASE-25391
> Project: HBase
>  Issue Type: Sub-task
>Reporter: Tak-Lon (Stephen) Wu
>Assignee: Wellington Chevreuil
>Priority: Major
>
> When flushing a memstore snapshot to an HFile, we write it directly to the 
> data directory.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (HBASE-25391) Flush directly into data directory, skip rename when committing flush

2021-06-11 Thread Wellington Chevreuil (Jira)


[ 
https://issues.apache.org/jira/browse/HBASE-25391?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17362052#comment-17362052
 ] 

Wellington Chevreuil commented on HBASE-25391:
--

I have added a RN answering [~anoop.hbase]'s questions. Sorry, I didn't think 
it was required for subtasks (or pieces) of a main feature.

> Flush directly into data directory, skip rename when committing flush
> -
>
> Key: HBASE-25391
> URL: https://issues.apache.org/jira/browse/HBASE-25391
> Project: HBase
>  Issue Type: Sub-task
>Reporter: Tak-Lon (Stephen) Wu
>Assignee: Wellington Chevreuil
>Priority: Major
>
> When flushing a memstore snapshot to an HFile, we write it directly to the 
> data directory.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Resolved] (HBASE-25391) Flush directly into data directory, skip rename when committing flush

2021-06-11 Thread Wellington Chevreuil (Jira)


 [ 
https://issues.apache.org/jira/browse/HBASE-25391?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wellington Chevreuil resolved HBASE-25391.
--
Release Note: This adds a new StoreFlushContext implementation, 
PersistedStoreFlushContext, together with a new StoreFlusher implementation, 
PersistedEngineStoreFlusher. As the memstore flush process is actually 
comprised of an initial file creations, followed by the actual write and 
finally a "commit" action, these two should be always be used together to 
guarantee a successful flush that doesn't involve creating temp files that get 
renamed later at the "commit" stage. By setting PersistedEngineStoreFlusher as 
the StoreFlusher implementation at 
"hbase.hstore.defaultengine.storeflusher.class" configuration, memstore flushes 
will create the resulting hfile directly in the store dir, instead of using a 
temp dir. Complementing the flush, PersistedStoreFlushContext, configured by 
hbase.regionserver.store.flush.context.class, assumes committed store files 
were written directly in the store dir, and therefore, doesn't perform a rename 
from tmp dir into the store dir.
  Resolution: Fixed

> Flush directly into data directory, skip rename when committing flush
> -
>
> Key: HBASE-25391
> URL: https://issues.apache.org/jira/browse/HBASE-25391
> Project: HBase
>  Issue Type: Sub-task
>Reporter: Tak-Lon (Stephen) Wu
>Assignee: Wellington Chevreuil
>Priority: Major
>
> When flushing a memstore snapshot to an HFile, we write it directly to the 
> data directory.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Reopened] (HBASE-25391) Flush directly into data directory, skip rename when committing flush

2021-06-11 Thread Wellington Chevreuil (Jira)


 [ 
https://issues.apache.org/jira/browse/HBASE-25391?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wellington Chevreuil reopened HBASE-25391:
--

> Flush directly into data directory, skip rename when committing flush
> -
>
> Key: HBASE-25391
> URL: https://issues.apache.org/jira/browse/HBASE-25391
> Project: HBase
>  Issue Type: Sub-task
>Reporter: Tak-Lon (Stephen) Wu
>Assignee: Wellington Chevreuil
>Priority: Major
>
> When flushing a memstore snapshot to an HFile, we write it directly to the 
> data directory.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (HBASE-25998) Revisit synchronization in SyncFuture

2021-06-11 Thread Bharath Vissapragada (Jira)


 [ 
https://issues.apache.org/jira/browse/HBASE-25998?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bharath Vissapragada updated HBASE-25998:
-
Status: Patch Available  (was: Open)

> Revisit synchronization in SyncFuture
> -
>
> Key: HBASE-25998
> URL: https://issues.apache.org/jira/browse/HBASE-25998
> Project: HBase
>  Issue Type: Improvement
>  Components: Performance, regionserver, wal
>Affects Versions: 3.0.0-alpha-1, 1.7.0, 2.5.0
>Reporter: Bharath Vissapragada
>Assignee: Bharath Vissapragada
>Priority: Major
> Attachments: monitor-overhead-1.png, monitor-overhead-2.png
>
>
> While working on HBASE-25984, I noticed some weird frames in the flame graphs 
> around monitor entry/exit consuming a lot of CPU cycles (see attached 
> images). The synchronization there is too coarse-grained and sometimes 
> unnecessary. I did a simple patch that switched to reentrant-lock-based 
> synchronization with a condition variable rather than a busy wait, and 
> that showed 70-80% increased throughput in WAL PE. Seems too good to be 
> true... (more details in the comments).



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (HBASE-25998) Revisit synchronization in SyncFuture

2021-06-11 Thread Bharath Vissapragada (Jira)


[ 
https://issues.apache.org/jira/browse/HBASE-25998?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17361973#comment-17361973
 ] 

Bharath Vissapragada commented on HBASE-25998:
--

Redid the experiments with JDK 11 (to account for any recent monitor 
performance enhancements) and I see similar numbers. Also, the numbers above 
are for {{-t 256}}, which implies heavy contention. It seems like the patch 
performs well under heavy load and the gap narrows with fewer threads (which I 
guess is expected), but even with very low concurrency the patch seems to 
outperform the current state.

> Revisit synchronization in SyncFuture
> -
>
> Key: HBASE-25998
> URL: https://issues.apache.org/jira/browse/HBASE-25998
> Project: HBase
>  Issue Type: Improvement
>  Components: Performance, regionserver, wal
>Affects Versions: 3.0.0-alpha-1, 1.7.0, 2.5.0
>Reporter: Bharath Vissapragada
>Assignee: Bharath Vissapragada
>Priority: Major
> Attachments: monitor-overhead-1.png, monitor-overhead-2.png
>
>
> While working on HBASE-25984, I noticed some weird frames in the flame graphs 
> around monitor entry/exit consuming a lot of CPU cycles (see attached 
> images). The synchronization there is too coarse-grained and sometimes 
> unnecessary. I did a simple patch that switched to reentrant-lock-based 
> synchronization with a condition variable rather than a busy wait, and 
> that showed 70-80% increased throughput in WAL PE. Seems too good to be 
> true... (more details in the comments).



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (HBASE-25993) Make excluded SSL cipher suites configurable for all Web UIs

2021-06-11 Thread Hudson (Jira)


[ 
https://issues.apache.org/jira/browse/HBASE-25993?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17361940#comment-17361940
 ] 

Hudson commented on HBASE-25993:


Results for branch branch-2
[build #275 on 
builds.a.o|https://ci-hadoop.apache.org/job/HBase/job/HBase%20Nightly/job/branch-2/275/]:
 (/) *{color:green}+1 overall{color}*

details (if available):

(/) {color:green}+1 general checks{color}
-- For more information [see general 
report|https://ci-hadoop.apache.org/job/HBase/job/HBase%20Nightly/job/branch-2/275/General_20Nightly_20Build_20Report/]




(/) {color:green}+1 jdk8 hadoop2 checks{color}
-- For more information [see jdk8 (hadoop2) 
report|https://ci-hadoop.apache.org/job/HBase/job/HBase%20Nightly/job/branch-2/275/JDK8_20Nightly_20Build_20Report_20_28Hadoop2_29/]


(/) {color:green}+1 jdk8 hadoop3 checks{color}
-- For more information [see jdk8 (hadoop3) 
report|https://ci-hadoop.apache.org/job/HBase/job/HBase%20Nightly/job/branch-2/275/JDK8_20Nightly_20Build_20Report_20_28Hadoop3_29/]


(/) {color:green}+1 jdk11 hadoop3 checks{color}
-- For more information [see jdk11 
report|https://ci-hadoop.apache.org/job/HBase/job/HBase%20Nightly/job/branch-2/275/JDK11_20Nightly_20Build_20Report_20_28Hadoop3_29/]


(/) {color:green}+1 source release artifact{color}
-- See build output for details.


(/) {color:green}+1 client integration test{color}


> Make excluded SSL cipher suites configurable for all Web UIs
> 
>
> Key: HBASE-25993
> URL: https://issues.apache.org/jira/browse/HBASE-25993
> Project: HBase
>  Issue Type: Improvement
>Affects Versions: 3.0.0-alpha-1, 2.2.7, 2.5.0, 2.3.5, 2.4.4
>Reporter: Mate Szalay-Beko
>Assignee: Mate Szalay-Beko
>Priority: Major
> Fix For: 3.0.0-alpha-1, 2.5.0, 2.3.6, 2.4.5
>
>
> When starting a Jetty HTTP server, one can explicitly exclude certain 
> (insecure) SSL cipher suites. This can be especially important when the 
> HBase cluster needs to be compliant with security regulations (e.g. FIPS).
> Currently it is possible to set the excluded ciphers for the ThriftServer 
> ("hbase.thrift.ssl.exclude.cipher.suites") or for the RestServer 
> ("hbase.rest.ssl.exclude.cipher.suites"), but one cannot configure it for 
> the regular InfoServer started by e.g. the master or region servers.
> In this commit I want to introduce a new configuration, 
> "ssl.server.exclude.cipher.list", to configure the excluded cipher suites for 
> the HTTP server started by the InfoServer. This parameter has the same name 
> and works the same way as the one already implemented in Hadoop (e.g. 
> for hdfs/yarn). See: HADOOP-12668, HADOOP-14341



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (HBASE-25947) Backport 'HBASE-25894 Improve the performance for region load and region count related cost functions' to branch-2.4 and branch-2.3

2021-06-11 Thread Hudson (Jira)


[ 
https://issues.apache.org/jira/browse/HBASE-25947?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17361913#comment-17361913
 ] 

Hudson commented on HBASE-25947:


Results for branch branch-2.3
[build #236 on 
builds.a.o|https://ci-hadoop.apache.org/job/HBase/job/HBase%20Nightly/job/branch-2.3/236/]:
 (x) *{color:red}-1 overall{color}*

details (if available):

(/) {color:green}+1 general checks{color}
-- For more information [see general 
report|https://ci-hadoop.apache.org/job/HBase/job/HBase%20Nightly/job/branch-2.3/236/General_20Nightly_20Build_20Report/]




(/) {color:green}+1 jdk8 hadoop2 checks{color}
-- For more information [see jdk8 (hadoop2) 
report|https://ci-hadoop.apache.org/job/HBase/job/HBase%20Nightly/job/branch-2.3/236/JDK8_20Nightly_20Build_20Report_20_28Hadoop2_29/]


(/) {color:green}+1 jdk8 hadoop3 checks{color}
-- For more information [see jdk8 (hadoop3) 
report|https://ci-hadoop.apache.org/job/HBase/job/HBase%20Nightly/job/branch-2.3/236/JDK8_20Nightly_20Build_20Report_20_28Hadoop3_29/]


(x) {color:red}-1 jdk11 hadoop3 checks{color}
-- For more information [see jdk11 
report|https://ci-hadoop.apache.org/job/HBase/job/HBase%20Nightly/job/branch-2.3/236/JDK11_20Nightly_20Build_20Report_20_28Hadoop3_29/]


(/) {color:green}+1 source release artifact{color}
-- See build output for details.


(/) {color:green}+1 client integration test{color}


> Backport 'HBASE-25894 Improve the performance for region load and region 
> count related cost functions' to branch-2.4 and branch-2.3
> ---
>
> Key: HBASE-25947
> URL: https://issues.apache.org/jira/browse/HBASE-25947
> Project: HBase
>  Issue Type: Sub-task
>  Components: Balancer, Performance
>Reporter: Duo Zhang
>Assignee: Duo Zhang
>Priority: Major
> Fix For: 2.3.6, 2.4.5
>
>




--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (HBASE-25894) Improve the performance for region load and region count related cost functions

2021-06-11 Thread Hudson (Jira)


[ 
https://issues.apache.org/jira/browse/HBASE-25894?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17361914#comment-17361914
 ] 

Hudson commented on HBASE-25894:


Results for branch branch-2.3
[build #236 on 
builds.a.o|https://ci-hadoop.apache.org/job/HBase/job/HBase%20Nightly/job/branch-2.3/236/]:
 (x) *{color:red}-1 overall{color}*

details (if available):

(/) {color:green}+1 general checks{color}
-- For more information [see general 
report|https://ci-hadoop.apache.org/job/HBase/job/HBase%20Nightly/job/branch-2.3/236/General_20Nightly_20Build_20Report/]




(/) {color:green}+1 jdk8 hadoop2 checks{color}
-- For more information [see jdk8 (hadoop2) 
report|https://ci-hadoop.apache.org/job/HBase/job/HBase%20Nightly/job/branch-2.3/236/JDK8_20Nightly_20Build_20Report_20_28Hadoop2_29/]


(/) {color:green}+1 jdk8 hadoop3 checks{color}
-- For more information [see jdk8 (hadoop3) 
report|https://ci-hadoop.apache.org/job/HBase/job/HBase%20Nightly/job/branch-2.3/236/JDK8_20Nightly_20Build_20Report_20_28Hadoop3_29/]


(x) {color:red}-1 jdk11 hadoop3 checks{color}
-- For more information [see jdk11 
report|https://ci-hadoop.apache.org/job/HBase/job/HBase%20Nightly/job/branch-2.3/236/JDK11_20Nightly_20Build_20Report_20_28Hadoop3_29/]


(/) {color:green}+1 source release artifact{color}
-- See build output for details.


(/) {color:green}+1 client integration test{color}


> Improve the performance for region load and region count related cost 
> functions
> ---
>
> Key: HBASE-25894
> URL: https://issues.apache.org/jira/browse/HBASE-25894
> Project: HBase
>  Issue Type: Sub-task
>  Components: Balancer, Performance
>Reporter: Duo Zhang
>Assignee: Duo Zhang
>Priority: Major
> Fix For: 3.0.0-alpha-1, 2.5.0
>
>
> For a large cluster, we have a lot of regions, so computing the whole cost 
> will be expensive. We should try to remove the unnecessary calculation as 
> much as possible.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (HBASE-25998) Revisit synchronization in SyncFuture

2021-06-11 Thread Bharath Vissapragada (Jira)


[ 
https://issues.apache.org/jira/browse/HBASE-25998?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17361867#comment-17361867
 ] 

Bharath Vissapragada commented on HBASE-25998:
--

[~zhangduo] [~apurtell] [~stack] This might be of interest to you (draft patch 
up for review); the results seem too good to be true. If you don't mind, try 
the patch locally in your environment (I just want to eliminate any noise from 
my end). PTAL.

> Revisit synchronization in SyncFuture
> -
>
> Key: HBASE-25998
> URL: https://issues.apache.org/jira/browse/HBASE-25998
> Project: HBase
>  Issue Type: Improvement
>  Components: Performance, regionserver, wal
>Affects Versions: 3.0.0-alpha-1, 1.7.0, 2.5.0
>Reporter: Bharath Vissapragada
>Assignee: Bharath Vissapragada
>Priority: Major
> Attachments: monitor-overhead-1.png, monitor-overhead-2.png
>
>
> While working on HBASE-25984, I noticed some weird frames in the flame graphs 
> around monitor entry/exit consuming a lot of CPU cycles (see attached 
> images). The synchronization there is too coarse-grained and sometimes 
> unnecessary. I did a simple patch that switched to reentrant-lock-based 
> synchronization with a condition variable rather than a busy wait, and 
> that showed 70-80% increased throughput in WAL PE. Seems too good to be 
> true... (more details in the comments).



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (HBASE-25998) Revisit synchronization in SyncFuture

2021-06-11 Thread Bharath Vissapragada (Jira)


[ 
https://issues.apache.org/jira/browse/HBASE-25998?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17361864#comment-17361864
 ] 

Bharath Vissapragada commented on HBASE-25998:
--

{noformat}
java -version
java version "1.8.0_221"
Java(TM) SE Runtime Environment (build 1.8.0_221-b11)
Java HotSpot(TM) 64-Bit Server VM (build 25.221-b11, mixed mode)
{noformat}


For the default WAL provider (async WAL):

Without Patch

{noformat}
-- Histograms --
org.apache.hadoop.hbase.wal.WALPerformanceEvaluation.latencyHistogram.nanos
             count = 10271257
               min = 2672827
               max = 67700701
              mean = 4084532.41
            stddev = 6244597.80
            median = 3403047.00
              75% <= 3525394.00
              95% <= 3849268.00
              98% <= 4319378.00
              99% <= 61134500.00
            99.9% <= 67195663.00
org.apache.hadoop.hbase.wal.WALPerformanceEvaluation.syncCountHistogram.countPerSync
             count = 100888
               min = 52
               max = 103
              mean = 101.91
            stddev = 2.09
            median = 102.00
              75% <= 102.00
              95% <= 102.00
              98% <= 102.00
              99% <= 103.00
            99.9% <= 103.00
org.apache.hadoop.hbase.wal.WALPerformanceEvaluation.syncHistogram.nanos-between-syncs
             count = 100889
               min = 119051
               max = 62778058
              mean = 1601305.10
            stddev = 3626948.72
            median = 1361530.00
              75% <= 1407052.00
              95% <= 1523418.00
              98% <= 1765310.00
              99% <= 2839178.00
            99.9% <= 62778058.00

-- Meters --
org.apache.hadoop.hbase.wal.WALPerformanceEvaluation.appendMeter.bytes
             count = 5721241096
         mean rate = 37890589.06 events/second
     1-minute rate = 36390169.75 events/second
     5-minute rate = 33524039.88 events/second
    15-minute rate = 31915066.49 events/second
org.apache.hadoop.hbase.wal.WALPerformanceEvaluation.syncMeter.syncs
             count = 100889
         mean rate = 668.16 events/second
     1-minute rate = 641.77 events/second
     5-minute rate = 590.37 events/second
    15-minute rate = 561.67 events/second
{noformat}

With patch:

{noformat}
-- Histograms --
org.apache.hadoop.hbase.wal.WALPerformanceEvaluation.latencyHistogram.nanos
             count = 12927042
               min = 943723
               max = 60827209
              mean = 1865217.32
            stddev = 5384907.53
            median = 1323691.00
              75% <= 1443195.00
              95% <= 1765866.00
              98% <= 1921920.00
              99% <= 3144643.00
            99.9% <= 60827209.00
org.apache.hadoop.hbase.wal.WALPerformanceEvaluation.syncCountHistogram.countPerSync
             count = 126797
               min = 52
               max = 104
              mean = 101.87
            stddev = 2.54
            median = 102.00
              75% <= 102.00
              95% <= 102.00
              98% <= 103.00
              99% <= 103.00
            99.9% <= 103.00
org.apache.hadoop.hbase.wal.WALPerformanceEvaluation.syncHistogram.nanos-between-syncs
             count = 126798
               min = 122666
               max = 60703608
              mean = 711847.31
            stddev = 3174375.63
            median = 519092.00
              75% <= 570240.00
              95% <= 695175.00
              98% <= 754972.00
              99% <= 791139.00
            99.9% <= 59975393.00

-- Meters --
org.apache.hadoop.hbase.wal.WALPerformanceEvaluation.appendMeter.bytes
             count = 7200681555
         mean rate = 79170095.16 events/second
     1-minute rate = 75109969.27 events/second
     5-minute rate = 66505621.40 events/second
    15-minute rate = 63719949.74 events/second
org.apache.hadoop.hbase.wal.WALPerformanceEvaluation.syncMeter.syncs
             count = 126800
         mean rate = 1394.11 events/second
     1-minute rate = 1322.31 events/second
     5-minute rate = 1169.99 events/second
    15-minute rate = 1120.69 events/second
{noformat}
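
Reading the two runs side by side, the mean append rate roughly doubles (about 79.2 MB/s with the patch vs. about 37.9 MB/s without), as does the sync rate (about 1394 vs. 668 syncs/second). For anyone trying to reproduce, numbers like these come from the WAL performance evaluation tool named in the metric keys above; a rough invocation sketch follows (the flags here are assumptions, so check the tool's help output for your version):

{noformat}
# Hypothetical invocation; -threads/-iterations are assumed flag names.
hbase org.apache.hadoop.hbase.wal.WALPerformanceEvaluation -threads 100 -iterations 100000
{noformat}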



> Revisit synchronization in SyncFuture
> -
>
> Key: HBASE-25998
> URL: https://issues.apache.org/jira/browse/HBASE-25998
> Project: HBase
>  Issue Type: Improvement
>  Components: Performance, regionserver, wal
>Affects Versions: 3.0.0-alpha-1, 1.7.0, 2.5.0
>Reporter: Bharath Vissapragada
>Assignee: Bharath Vissapragada
>Priority: Major
> Attachments: monitor-overhead-1.png, 

[jira] [Updated] (HBASE-25998) Revisit synchronization in SyncFuture

2021-06-11 Thread Bharath Vissapragada (Jira)


 [ 
https://issues.apache.org/jira/browse/HBASE-25998?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bharath Vissapragada updated HBASE-25998:
-
Attachment: monitor-overhead-2.png
monitor-overhead-1.png

> Revisit synchronization in SyncFuture
> -
>
> Key: HBASE-25998
> URL: https://issues.apache.org/jira/browse/HBASE-25998
> Project: HBase
>  Issue Type: Improvement
>  Components: Performance, regionserver, wal
>Affects Versions: 3.0.0-alpha-1, 1.7.0, 2.5.0
>Reporter: Bharath Vissapragada
>Assignee: Bharath Vissapragada
>Priority: Major
> Attachments: monitor-overhead-1.png, monitor-overhead-2.png
>
>
> While working on HBASE-25984, I noticed some weird frames in the flame graphs 
> around monitor entry/exit consuming a lot of CPU cycles (see attached 
> images). The synchronization there is too coarse-grained and sometimes 
> unnecessary. I wrote a simple patch that switched to reentrant-lock-based 
> synchronization with a condition variable rather than a busy wait, and it 
> showed 70-80% higher throughput in WAL PE. Seems too good to be true... 
> (more details in the comments).



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (HBASE-25998) Revisit synchronization in SyncFuture

2021-06-11 Thread Bharath Vissapragada (Jira)
Bharath Vissapragada created HBASE-25998:


 Summary: Revisit synchronization in SyncFuture
 Key: HBASE-25998
 URL: https://issues.apache.org/jira/browse/HBASE-25998
 Project: HBase
  Issue Type: Improvement
  Components: Performance, regionserver, wal
Affects Versions: 3.0.0-alpha-1, 1.7.0, 2.5.0
Reporter: Bharath Vissapragada
Assignee: Bharath Vissapragada


While working on HBASE-25984, I noticed some weird frames in the flame graphs 
around monitor entry/exit consuming a lot of CPU cycles (see attached images). 
The synchronization there is too coarse-grained and sometimes unnecessary. I 
wrote a simple patch that switched to reentrant-lock-based synchronization with 
a condition variable rather than a busy wait, and it showed 70-80% higher 
throughput in WAL PE. Seems too good to be true... (more details in the 
comments).



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (HBASE-25993) Make excluded SSL cipher suites configurable for all Web UIs

2021-06-11 Thread Hudson (Jira)


[ 
https://issues.apache.org/jira/browse/HBASE-25993?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17361776#comment-17361776
 ] 

Hudson commented on HBASE-25993:


Results for branch master
[build #321 on 
builds.a.o|https://ci-hadoop.apache.org/job/HBase/job/HBase%20Nightly/job/master/321/]:
 (/) *{color:green}+1 overall{color}*

details (if available):

(/) {color:green}+1 general checks{color}
-- For more information [see general 
report|https://ci-hadoop.apache.org/job/HBase/job/HBase%20Nightly/job/master/321/General_20Nightly_20Build_20Report/]






(/) {color:green}+1 jdk8 hadoop3 checks{color}
-- For more information [see jdk8 (hadoop3) 
report|https://ci-hadoop.apache.org/job/HBase/job/HBase%20Nightly/job/master/321/JDK8_20Nightly_20Build_20Report_20_28Hadoop3_29/]


(/) {color:green}+1 jdk11 hadoop3 checks{color}
-- For more information [see jdk11 
report|https://ci-hadoop.apache.org/job/HBase/job/HBase%20Nightly/job/master/321/JDK11_20Nightly_20Build_20Report_20_28Hadoop3_29/]


(/) {color:green}+1 source release artifact{color}
-- See build output for details.


(/) {color:green}+1 client integration test{color}


> Make excluded SSL cipher suites configurable for all Web UIs
> 
>
> Key: HBASE-25993
> URL: https://issues.apache.org/jira/browse/HBASE-25993
> Project: HBase
>  Issue Type: Improvement
>Affects Versions: 3.0.0-alpha-1, 2.2.7, 2.5.0, 2.3.5, 2.4.4
>Reporter: Mate Szalay-Beko
>Assignee: Mate Szalay-Beko
>Priority: Major
> Fix For: 3.0.0-alpha-1, 2.5.0, 2.3.6, 2.4.5
>
>
> When starting a Jetty HTTP server, one can explicitly exclude certain 
> (insecure) SSL cipher suites. This can be especially important when the 
> HBase cluster needs to be compliant with security regulations (e.g. FIPS).
> Currently it is possible to set the excluded ciphers for the ThriftServer 
> ("hbase.thrift.ssl.exclude.cipher.suites") or for the RestServer 
> ("hbase.rest.ssl.exclude.cipher.suites"), but one cannot configure it for 
> the regular InfoServer started by e.g. the master or region servers.
> In this commit I want to introduce a new configuration, 
> "ssl.server.exclude.cipher.list", to configure the excluded cipher suites for 
> the HTTP server started by the InfoServer. This parameter has the same name 
> and works in the same way as the one already implemented in Hadoop (e.g. 
> for HDFS/YARN). See: HADOOP-12668, HADOOP-14341.
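
As a usage sketch, and assuming the new property is placed alongside the existing ssl.server.* settings the way Hadoop reads them (ssl-server.xml), excluding a couple of suites might look like the snippet below. The property name comes from the description above; the cipher suite names are only examples:

{code}
<!-- Sketch only: the cipher suite names below are illustrative examples. -->
<property>
  <name>ssl.server.exclude.cipher.list</name>
  <value>TLS_ECDHE_RSA_WITH_AES_128_CBC_SHA256,SSL_RSA_WITH_3DES_EDE_CBC_SHA</value>
</property>
{code}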



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (HBASE-25991) Do compaction on compaction server

2021-06-11 Thread Yulin Niu (Jira)


 [ 
https://issues.apache.org/jira/browse/HBASE-25991?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yulin Niu updated HBASE-25991:
--
Summary: Do compaction on compaction server  (was: do compaction on 
compaction server)

> Do compaction on compaction server
> --
>
> Key: HBASE-25991
> URL: https://issues.apache.org/jira/browse/HBASE-25991
> Project: HBase
>  Issue Type: Sub-task
>Reporter: Yulin Niu
>Assignee: Yulin Niu
>Priority: Major
>
> After HBASE-25968, this patch implements the compaction code on the 
> compaction server.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (HBASE-25997) NettyRpcFrameDecoder decode request header wrong when handleTooBigRequest

2021-06-11 Thread Duo Zhang (Jira)


 [ 
https://issues.apache.org/jira/browse/HBASE-25997?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Duo Zhang updated HBASE-25997:
--
Component/s: rpc

> NettyRpcFrameDecoder decode request header wrong  when handleTooBigRequest
> --
>
> Key: HBASE-25997
> URL: https://issues.apache.org/jira/browse/HBASE-25997
> Project: HBase
>  Issue Type: Bug
>  Components: rpc
>Reporter: Lijin Bin
>Priority: Major
>
> The client writes a big request to the server, the server decodes the request 
> incorrectly, and so the client does not get a RequestTooBigException as expected.
> {code}
> 2021-06-11 18:57:27,340 INFO  [RS-EventLoopGroup-1-20] ipc.NettyRpcServer: 
> org.apache.hbase.thirdparty.com.google.protobuf.InvalidProtocolBufferException$InvalidWireTypeException:
>  Protocol message tag had invalid wire type.
>   at 
> org.apache.hbase.thirdparty.com.google.protobuf.InvalidProtocolBufferException.invalidWireType(InvalidProtocolBufferException.java:111)
>   at 
> org.apache.hbase.thirdparty.com.google.protobuf.UnknownFieldSet$Builder.mergeFieldFrom(UnknownFieldSet.java:519)
>   at 
> org.apache.hbase.thirdparty.com.google.protobuf.GeneratedMessageV3.parseUnknownField(GeneratedMessageV3.java:298)
>   at 
> org.apache.hadoop.hbase.shaded.protobuf.generated.RPCProtos$RequestHeader.(RPCProtos.java:5958)
>   at 
> org.apache.hadoop.hbase.shaded.protobuf.generated.RPCProtos$RequestHeader.(RPCProtos.java:5916)
>   at 
> org.apache.hadoop.hbase.shaded.protobuf.generated.RPCProtos$RequestHeader$1.parsePartialFrom(RPCProtos.java:7249)
>   at 
> org.apache.hadoop.hbase.shaded.protobuf.generated.RPCProtos$RequestHeader$1.parsePartialFrom(RPCProtos.java:7244)
>   at 
> org.apache.hadoop.hbase.shaded.protobuf.generated.RPCProtos$RequestHeader$Builder.mergeFrom(RPCProtos.java:6679)
>   at 
> org.apache.hadoop.hbase.shaded.protobuf.generated.RPCProtos$RequestHeader$Builder.mergeFrom(RPCProtos.java:6482)
>   at 
> org.apache.hbase.thirdparty.com.google.protobuf.AbstractMessage$Builder.mergeFrom(AbstractMessage.java:420)
>   at 
> org.apache.hbase.thirdparty.com.google.protobuf.AbstractMessage$Builder.mergeFrom(AbstractMessage.java:317)
>   at 
> org.apache.hadoop.hbase.shaded.protobuf.ProtobufUtil.mergeFrom(ProtobufUtil.java:2716)
>   at 
> org.apache.hadoop.hbase.ipc.NettyRpcFrameDecoder.getHeader(NettyRpcFrameDecoder.java:174)
>   at 
> org.apache.hadoop.hbase.ipc.NettyRpcFrameDecoder.handleTooBigRequest(NettyRpcFrameDecoder.java:126)
>   at 
> org.apache.hadoop.hbase.ipc.NettyRpcFrameDecoder.decode(NettyRpcFrameDecoder.java:65)
>   at 
> org.apache.hbase.thirdparty.io.netty.handler.codec.ByteToMessageDecoder.decodeRemovalReentryProtection(ByteToMessageDecoder.java:502)
>   at 
> org.apache.hbase.thirdparty.io.netty.handler.codec.ByteToMessageDecoder.callDecode(ByteToMessageDecoder.java:441)
>   at 
> org.apache.hbase.thirdparty.io.netty.handler.codec.ByteToMessageDecoder.channelInputClosed(ByteToMessageDecoder.java:405)
>   at 
> org.apache.hbase.thirdparty.io.netty.handler.codec.ByteToMessageDecoder.channelInputClosed(ByteToMessageDecoder.java:372)
>   at 
> org.apache.hbase.thirdparty.io.netty.handler.codec.ByteToMessageDecoder.channelInactive(ByteToMessageDecoder.java:355)
>   at 
> org.apache.hbase.thirdparty.io.netty.channel.AbstractChannelHandlerContext.invokeChannelInactive(AbstractChannelHandlerContext.java:242)
>   at 
> org.apache.hbase.thirdparty.io.netty.channel.AbstractChannelHandlerContext.invokeChannelInactive(AbstractChannelHandlerContext.java:228)
>   at 
> org.apache.hbase.thirdparty.io.netty.channel.AbstractChannelHandlerContext.fireChannelInactive(AbstractChannelHandlerContext.java:221)
>   at 
> org.apache.hbase.thirdparty.io.netty.channel.DefaultChannelPipeline$HeadContext.channelInactive(DefaultChannelPipeline.java:1403)
>   at 
> org.apache.hbase.thirdparty.io.netty.channel.AbstractChannelHandlerContext.invokeChannelInactive(AbstractChannelHandlerContext.java:242)
>   at 
> org.apache.hbase.thirdparty.io.netty.channel.AbstractChannelHandlerContext.invokeChannelInactive(AbstractChannelHandlerContext.java:228)
>   at 
> org.apache.hbase.thirdparty.io.netty.channel.DefaultChannelPipeline.fireChannelInactive(DefaultChannelPipeline.java:912)
>   at 
> org.apache.hbase.thirdparty.io.netty.channel.AbstractChannel$AbstractUnsafe$8.run(AbstractChannel.java:827)
>   at 
> org.apache.hbase.thirdparty.io.netty.util.concurrent.AbstractEventExecutor.safeExecute(AbstractEventExecutor.java:163)
>   at 
> org.apache.hbase.thirdparty.io.netty.util.concurrent.SingleThreadEventExecutor.runAllTasks(SingleThreadEventExecutor.java:404)
>   at 
> 

[jira] [Created] (HBASE-25997) NettyRpcFrameDecoder decode request header wrong when handleTooBigRequest

2021-06-11 Thread Lijin Bin (Jira)
Lijin Bin created HBASE-25997:
-

 Summary: NettyRpcFrameDecoder decode request header wrong  when 
handleTooBigRequest
 Key: HBASE-25997
 URL: https://issues.apache.org/jira/browse/HBASE-25997
 Project: HBase
  Issue Type: Bug
Reporter: Lijin Bin


The client writes a big request to the server, the server decodes the request 
incorrectly, and so the client does not get a RequestTooBigException as expected.
{code}
2021-06-11 18:57:27,340 INFO  [RS-EventLoopGroup-1-20] ipc.NettyRpcServer: 
org.apache.hbase.thirdparty.com.google.protobuf.InvalidProtocolBufferException$InvalidWireTypeException:
 Protocol message tag had invalid wire type.
at 
org.apache.hbase.thirdparty.com.google.protobuf.InvalidProtocolBufferException.invalidWireType(InvalidProtocolBufferException.java:111)
at 
org.apache.hbase.thirdparty.com.google.protobuf.UnknownFieldSet$Builder.mergeFieldFrom(UnknownFieldSet.java:519)
at 
org.apache.hbase.thirdparty.com.google.protobuf.GeneratedMessageV3.parseUnknownField(GeneratedMessageV3.java:298)
at 
org.apache.hadoop.hbase.shaded.protobuf.generated.RPCProtos$RequestHeader.(RPCProtos.java:5958)
at 
org.apache.hadoop.hbase.shaded.protobuf.generated.RPCProtos$RequestHeader.(RPCProtos.java:5916)
at 
org.apache.hadoop.hbase.shaded.protobuf.generated.RPCProtos$RequestHeader$1.parsePartialFrom(RPCProtos.java:7249)
at 
org.apache.hadoop.hbase.shaded.protobuf.generated.RPCProtos$RequestHeader$1.parsePartialFrom(RPCProtos.java:7244)
at 
org.apache.hadoop.hbase.shaded.protobuf.generated.RPCProtos$RequestHeader$Builder.mergeFrom(RPCProtos.java:6679)
at 
org.apache.hadoop.hbase.shaded.protobuf.generated.RPCProtos$RequestHeader$Builder.mergeFrom(RPCProtos.java:6482)
at 
org.apache.hbase.thirdparty.com.google.protobuf.AbstractMessage$Builder.mergeFrom(AbstractMessage.java:420)
at 
org.apache.hbase.thirdparty.com.google.protobuf.AbstractMessage$Builder.mergeFrom(AbstractMessage.java:317)
at 
org.apache.hadoop.hbase.shaded.protobuf.ProtobufUtil.mergeFrom(ProtobufUtil.java:2716)
at 
org.apache.hadoop.hbase.ipc.NettyRpcFrameDecoder.getHeader(NettyRpcFrameDecoder.java:174)
at 
org.apache.hadoop.hbase.ipc.NettyRpcFrameDecoder.handleTooBigRequest(NettyRpcFrameDecoder.java:126)
at 
org.apache.hadoop.hbase.ipc.NettyRpcFrameDecoder.decode(NettyRpcFrameDecoder.java:65)
at 
org.apache.hbase.thirdparty.io.netty.handler.codec.ByteToMessageDecoder.decodeRemovalReentryProtection(ByteToMessageDecoder.java:502)
at 
org.apache.hbase.thirdparty.io.netty.handler.codec.ByteToMessageDecoder.callDecode(ByteToMessageDecoder.java:441)
at 
org.apache.hbase.thirdparty.io.netty.handler.codec.ByteToMessageDecoder.channelInputClosed(ByteToMessageDecoder.java:405)
at 
org.apache.hbase.thirdparty.io.netty.handler.codec.ByteToMessageDecoder.channelInputClosed(ByteToMessageDecoder.java:372)
at 
org.apache.hbase.thirdparty.io.netty.handler.codec.ByteToMessageDecoder.channelInactive(ByteToMessageDecoder.java:355)
at 
org.apache.hbase.thirdparty.io.netty.channel.AbstractChannelHandlerContext.invokeChannelInactive(AbstractChannelHandlerContext.java:242)
at 
org.apache.hbase.thirdparty.io.netty.channel.AbstractChannelHandlerContext.invokeChannelInactive(AbstractChannelHandlerContext.java:228)
at 
org.apache.hbase.thirdparty.io.netty.channel.AbstractChannelHandlerContext.fireChannelInactive(AbstractChannelHandlerContext.java:221)
at 
org.apache.hbase.thirdparty.io.netty.channel.DefaultChannelPipeline$HeadContext.channelInactive(DefaultChannelPipeline.java:1403)
at 
org.apache.hbase.thirdparty.io.netty.channel.AbstractChannelHandlerContext.invokeChannelInactive(AbstractChannelHandlerContext.java:242)
at 
org.apache.hbase.thirdparty.io.netty.channel.AbstractChannelHandlerContext.invokeChannelInactive(AbstractChannelHandlerContext.java:228)
at 
org.apache.hbase.thirdparty.io.netty.channel.DefaultChannelPipeline.fireChannelInactive(DefaultChannelPipeline.java:912)
at 
org.apache.hbase.thirdparty.io.netty.channel.AbstractChannel$AbstractUnsafe$8.run(AbstractChannel.java:827)
at 
org.apache.hbase.thirdparty.io.netty.util.concurrent.AbstractEventExecutor.safeExecute(AbstractEventExecutor.java:163)
at 
org.apache.hbase.thirdparty.io.netty.util.concurrent.SingleThreadEventExecutor.runAllTasks(SingleThreadEventExecutor.java:404)
at 
org.apache.hbase.thirdparty.io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:495)
at 
org.apache.hbase.thirdparty.io.netty.util.concurrent.SingleThreadEventExecutor$5.run(SingleThreadEventExecutor.java:905)
at 
org.apache.hbase.thirdparty.io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30)
at java.base/java.lang.Thread.run(Thread.java:834)
{code}
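
For context on the failure above: an RPC request header like this one is a varint-length-delimited protobuf message, so a decoder must first consume the varint length prefix and then parse exactly that many bytes. If the read offset misses the prefix, parsing starts mid-message and fails with exactly the kind of InvalidWireTypeException seen in the trace. A generic sketch of the delimited-read pattern with the protobuf API (a hypothetical helper, not the actual NettyRpcFrameDecoder code):

{code}
import com.google.protobuf.CodedInputStream;
import com.google.protobuf.Parser;
import java.io.IOException;

// Generic sketch: read one varint-delimited protobuf message from buf,
// starting at offset. If offset does not point at the varint length
// prefix, parsing begins mid-message and throws errors such as
// InvalidWireTypeException ("Protocol message tag had invalid wire type").
final class DelimitedReader {
  static <T> T readDelimited(byte[] buf, int offset, Parser<T> parser)
      throws IOException {
    CodedInputStream in =
        CodedInputStream.newInstance(buf, offset, buf.length - offset);
    int size = in.readRawVarint32();    // length prefix
    int oldLimit = in.pushLimit(size);  // bound parsing to that many bytes
    T msg = parser.parseFrom(in);
    in.popLimit(oldLimit);
    return msg;
  }
}
{code}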



--

[jira] [Updated] (HBASE-25996) add hbase hbck result on jmx

2021-06-11 Thread xijiawen (Jira)


 [ 
https://issues.apache.org/jira/browse/HBASE-25996?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

xijiawen updated HBASE-25996:
-
Description: https://github.com/apache/hbase/pull/3379

> add hbase hbck result on jmx
> 
>
> Key: HBASE-25996
> URL: https://issues.apache.org/jira/browse/HBASE-25996
> Project: HBase
>  Issue Type: Improvement
>Reporter: xijiawen
>Assignee: xijiawen
>Priority: Major
>
> https://github.com/apache/hbase/pull/3379



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (HBASE-25996) add hbase hbck result on jmx

2021-06-11 Thread xijiawen (Jira)
xijiawen created HBASE-25996:


 Summary: add hbase hbck result on jmx
 Key: HBASE-25996
 URL: https://issues.apache.org/jira/browse/HBASE-25996
 Project: HBase
  Issue Type: Improvement
Reporter: xijiawen
Assignee: xijiawen






--
This message was sent by Atlassian Jira
(v8.3.4#803005)