[ https://issues.apache.org/jira/browse/HBASE-27768?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17710393#comment-17710393 ]
Hudson commented on HBASE-27768:
--------------------------------
Results for branch branch-2.4
[build #546 on builds.a.o|https://ci-hbase.apache.org/job/HBase%20Nightly/job/branch-2.4/546/]:
(/) *{color:green}+1 overall{color}*
----
details (if available):
(/) {color:green}+1 general checks{color}
-- For more information [see general report|https://ci-hbase.apache.org/job/HBase%20Nightly/job/branch-2.4/546/General_20Nightly_20Build_20Report/]
(/) {color:green}+1 jdk8 hadoop2 checks{color}
-- For more information [see jdk8 (hadoop2) report|https://ci-hbase.apache.org/job/HBase%20Nightly/job/branch-2.4/546/JDK8_20Nightly_20Build_20Report_20_28Hadoop2_29/]
(/) {color:green}+1 jdk8 hadoop3 checks{color}
-- For more information [see jdk8 (hadoop3) report|https://ci-hbase.apache.org/job/HBase%20Nightly/job/branch-2.4/546/JDK8_20Nightly_20Build_20Report_20_28Hadoop3_29/]
(/) {color:green}+1 jdk11 hadoop3 checks{color}
-- For more information [see jdk11 report|https://ci-hbase.apache.org/job/HBase%20Nightly/job/branch-2.4/546/JDK11_20Nightly_20Build_20Report_20_28Hadoop3_29/]
(/) {color:green}+1 source release artifact{color}
-- See build output for details.
(/) {color:green}+1 client integration test{color}
> Race conditions in BlockingRpcConnection
> ----------------------------------------
>
> Key: HBASE-27768
> URL: https://issues.apache.org/jira/browse/HBASE-27768
> Project: HBase
> Issue Type: Bug
> Reporter: Bryan Beaudreault
> Assignee: Bryan Beaudreault
> Priority: Major
> Labels: patch-available
> Fix For: 2.6.0, 2.5.5, 2.4.18
>
>
> We've been experiencing strange timeouts since upgrading to the hbase2 client.
> We use BlockingRpcConnection for now, until we migrate our auth stack to native
> TLS. While diagnosing the timeouts, I noticed a few issues in this class:
> # Most importantly, there is a race condition that can leave a single
> BlockingRpcConnection instance with 2 reader threads running. Both threads
> then compete for the socket, which causes unexplained timeouts and, in some
> cases, corrupted responses (i.e. InvalidProtocolBufferException).
> # The waitForWork loop does not properly handle interruption. If the thread
> is interrupted when the above race condition occurs, the waitForWork loop
> ends up spinning in a tight loop forever: the "wait()" call throws
> InterruptedException immediately, we restore the interrupted state, and the
> loop restarts, so no waiting occurs anymore.
> The race condition is somewhat rare, occurring only in certain failure
> scenarios on our highest-volume clients. But once it happens, a low level of
> errors is thrown indefinitely for the affected server connection until the
> client is bounced.
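
The interruption problem described in the second item can be illustrated with a small, self-contained Java sketch. This is not the HBase source or the actual patch: the class name WaitForWorkSketch, the fields pending and closed, and both method names are hypothetical, and the buggy loop is capped with a maxSpins parameter only so the demonstration terminates. It shows why restoring the interrupt flag and looping turns "wait()" into a busy spin, and one possible way to avoid it (treating the interrupt as a signal to stop).

```java
import java.util.ArrayDeque;
import java.util.Queue;

// Hypothetical sketch; names are illustrative and not taken from HBase.
class WaitForWorkSketch {
    private final Queue<String> pending = new ArrayDeque<>();
    private boolean closed = false;

    // Problematic shape: the interrupt flag is restored and the loop restarts.
    // Because the flag is set again, the next wait() throws immediately, so the
    // loop never actually sleeps. maxSpins exists only to bound the demo.
    synchronized int buggyWaitForWork(int maxSpins) {
        int spins = 0;
        while (pending.isEmpty() && !closed && spins < maxSpins) {
            spins++;
            try {
                wait(1000);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt(); // flag restored, loop spins
            }
        }
        return spins;
    }

    // One possible alternative (an assumption, not necessarily the real fix):
    // restore the flag for callers, mark the connection closed, and leave the loop.
    synchronized boolean fixedWaitForWork() {
        while (pending.isEmpty() && !closed) {
            try {
                wait(1000);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                closed = true;
                return false;
            }
        }
        return !pending.isEmpty();
    }

    public static void main(String[] args) {
        WaitForWorkSketch sketch = new WaitForWorkSketch();

        Thread.currentThread().interrupt();
        System.out.println("buggy loop iterations: " + sketch.buggyWaitForWork(500));

        Thread.interrupted(); // clear the flag, then interrupt again for the second run
        Thread.currentThread().interrupt();
        System.out.println("fixed loop found work: " + sketch.fixedWaitForWork());
        Thread.interrupted(); // leave the thread in a clean state
    }
}
```

In the buggy variant every iteration hits the catch block without blocking, which matches the tight loop described above; the alternative exits on the first interrupt instead of retrying.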