[jira] [Commented] (HBASE-27768) Race conditions in BlockingRpcConnection

Hudson (Jira) Mon, 10 Apr 2023 22:17:06 -0700


    [ https://issues.apache.org/jira/browse/HBASE-27768?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17710393#comment-17710393 ]


Hudson commented on HBASE-27768:
--------------------------------

Results for branch branch-2.4
        [build #546 on builds.a.o|https://ci-hbase.apache.org/job/HBase%20Nightly/job/branch-2.4/546/]: (/) *{color:green}+1 overall{color}*
----
details (if available):

(/) {color:green}+1 general checks{color}
-- For more information [see general report|https://ci-hbase.apache.org/job/HBase%20Nightly/job/branch-2.4/546/General_20Nightly_20Build_20Report/]


(/) {color:green}+1 jdk8 hadoop2 checks{color}
-- For more information [see jdk8 (hadoop2) report|https://ci-hbase.apache.org/job/HBase%20Nightly/job/branch-2.4/546/JDK8_20Nightly_20Build_20Report_20_28Hadoop2_29/]


(/) {color:green}+1 jdk8 hadoop3 checks{color}
-- For more information [see jdk8 (hadoop3) report|https://ci-hbase.apache.org/job/HBase%20Nightly/job/branch-2.4/546/JDK8_20Nightly_20Build_20Report_20_28Hadoop3_29/]


(/) {color:green}+1 jdk11 hadoop3 checks{color}
-- For more information [see jdk11 report|https://ci-hbase.apache.org/job/HBase%20Nightly/job/branch-2.4/546/JDK11_20Nightly_20Build_20Report_20_28Hadoop3_29/]


(/) {color:green}+1 source release artifact{color}
-- See build output for details.


(/) {color:green}+1 client integration test{color}


> Race conditions in BlockingRpcConnection
> ----------------------------------------
>
>                 Key: HBASE-27768
>                 URL: https://issues.apache.org/jira/browse/HBASE-27768
>             Project: HBase
>          Issue Type: Bug
>            Reporter: Bryan Beaudreault
>            Assignee: Bryan Beaudreault
>            Priority: Major
>              Labels: patch-available
>             Fix For: 2.6.0, 2.5.5, 2.4.18
>
>
> We've been experiencing strange timeouts since upgrading to the hbase2 client.
> We use BlockingRpcConnection for now, until we migrate our auth stack to native
> TLS. In diagnosing the timeouts, I noticed a few issues in this class:
>  # Most importantly, there is a race condition that can leave a
> BlockingRpcConnection instance with 2 reader threads running. In this
> case, both compete for the socket, which causes weird timeouts and, in
> some cases, corrupted responses (i.e. InvalidProtocolBufferException).
>  # The waitForWork loop does not properly handle interruption. When it gets
> interrupted, if the above race condition occurs, the waitForWork loop ends up
> stuck in a tight loop forever. The "wait()" call instantly throws
> InterruptedException, and we set the interrupted state back and restart the
> loop, so no waiting occurs anymore.
> The race condition is somewhat rare, only occurring in certain failure
> scenarios on our highest-volume clients. But when it happens, a low level of
> errors will keep being thrown for the affected server connection until the
> client is bounced.
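
The tight loop described in the second item can be illustrated with a small standalone sketch. This is not the HBase code: the class InterruptSpinDemo, the method countIterations, and the 50 ms wait are hypothetical names and values chosen for illustration. It relies only on the documented behavior that Object.wait() throws InterruptedException immediately when the calling thread's interrupt flag is already set.

```java
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical demo class; it does not reproduce BlockingRpcConnection itself.
class InterruptSpinDemo {

    /**
     * Runs a wait loop on a pre-interrupted thread for up to durationMillis and
     * returns how many times the loop body completed.
     */
    static long countIterations(boolean restoreFlagAndRetry, long durationMillis) {
        final Object lock = new Object();
        final AtomicLong iterations = new AtomicLong();

        Thread worker = new Thread(() -> {
            // Simulate an interrupt that arrives before the loop starts waiting.
            Thread.currentThread().interrupt();
            long deadline = System.nanoTime() + durationMillis * 1_000_000L;
            synchronized (lock) {
                while (System.nanoTime() < deadline) {
                    try {
                        lock.wait(50); // throws at once if the interrupt flag is set
                    } catch (InterruptedException e) {
                        // Restore the flag so callers can still observe the interrupt.
                        Thread.currentThread().interrupt();
                        if (!restoreFlagAndRetry) {
                            return; // leave the loop instead of retrying
                        }
                        // Falling through retries with the flag still set, so the
                        // next wait() throws immediately as well.
                    }
                    iterations.incrementAndGet();
                }
            }
        });

        worker.start();
        try {
            worker.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while joining worker", e);
        }
        return iterations.get();
    }

    public static void main(String[] args) {
        System.out.println("restore flag and retry: " + countIterations(true, 200) + " iterations");
        System.out.println("restore flag and exit:  " + countIterations(false, 200) + " iterations");
    }
}
```

With restoreFlagAndRetry set to true the loop never actually waits and spins until the deadline, which mirrors the behavior reported above. With it set to false the worker exits on the first interrupt and completes no iterations. Whether exiting is the right response for the real connection class depends on its shutdown logic, which this sketch does not model.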



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
