[ 
https://issues.apache.org/jira/browse/HBASE-24564?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17136615#comment-17136615
 ] 

HBase QA commented on HBASE-24564:
----------------------------------

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  7m 
22s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} dupname {color} | {color:green}  0m  
0s{color} | {color:green} No case conflicting files found. {color} |
| {color:green}+1{color} | {color:green} hbaseanti {color} | {color:green}  0m  
0s{color} | {color:green} Patch does not have any anti-patterns. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 1 new or modified test 
files. {color} |
|| || || || {color:brown} branch-1 Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 10m 
16s{color} | {color:green} branch-1 passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
48s{color} | {color:green} branch-1 passed with JDK v1.8.0_252 {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m  
1s{color} | {color:green} branch-1 passed with JDK v1.7.0_262 {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  2m 
 2s{color} | {color:green} branch-1 passed {color} |
| {color:green}+1{color} | {color:green} shadedjars {color} | {color:green}  3m 
 6s{color} | {color:green} branch has no errors when building our shaded 
downstream artifacts. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
48s{color} | {color:green} branch-1 passed with JDK v1.8.0_252 {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
42s{color} | {color:green} branch-1 passed with JDK v1.7.0_262 {color} |
| {color:blue}0{color} | {color:blue} spotbugs {color} | {color:blue}  3m  
3s{color} | {color:blue} Used deprecated FindBugs config; considering switching 
to SpotBugs. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  2m 
59s{color} | {color:green} branch-1 passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  2m 
 0s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
41s{color} | {color:green} the patch passed with JDK v1.8.0_252 {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  0m 
41s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
45s{color} | {color:green} the patch passed with JDK v1.7.0_262 {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  0m 
45s{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} checkstyle {color} | {color:red}  1m 
36s{color} | {color:red} hbase-server: The patch generated 3 new + 291 
unchanged - 1 fixed = 294 total (was 292) {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} shadedjars {color} | {color:green}  2m 
52s{color} | {color:green} patch has no errors when building our shaded 
downstream artifacts. {color} |
| {color:green}+1{color} | {color:green} hadoopcheck {color} | {color:green}  
4m 41s{color} | {color:green} Patch does not cause any errors with Hadoop 2.8.5 
2.9.2. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
34s{color} | {color:green} the patch passed with JDK v1.8.0_252 {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
46s{color} | {color:green} the patch passed with JDK v1.7.0_262 {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  3m 
13s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} unit {color} | {color:green}125m 
34s{color} | {color:green} hbase-server in the patch passed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
37s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black}176m 12s{color} | 
{color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=19.03.11 Server=19.03.11 base: 
https://builds.apache.org/job/HBase-PreCommit-GitHub-PR/job/PR-1911/1/artifact/out/Dockerfile
 |
| GITHUB PR | https://github.com/apache/hbase/pull/1911 |
| JIRA Issue | HBASE-24564 |
| Optional Tests | dupname asflicense javac javadoc unit spotbugs findbugs 
shadedjars hadoopcheck hbaseanti checkstyle compile |
| uname | Linux f8c82b48faa5 4.15.0-58-generic #64-Ubuntu SMP Tue Aug 6 
11:12:41 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | 
/home/jenkins/jenkins-slave/workspace/Base-PreCommit-GitHub-PR_PR-1911/out/precommit/personality/provided.sh
 |
| git revision | branch-1 / 5d64f06 |
| Default Java | 1.7.0_262 |
| Multi-JDK versions | /usr/lib/jvm/zulu-8-amd64:1.8.0_252 
/usr/lib/jvm/zulu-7-amd64:1.7.0_262 |
| checkstyle | 
https://builds.apache.org/job/HBase-PreCommit-GitHub-PR/job/PR-1911/1/artifact/out/diff-checkstyle-hbase-server.txt
 |
|  Test Results | 
https://builds.apache.org/job/HBase-PreCommit-GitHub-PR/job/PR-1911/1/testReport/
 |
| Max. process+thread count | 4111 (vs. ulimit of 10000) |
| modules | C: hbase-server U: hbase-server |
| Console output | 
https://builds.apache.org/job/HBase-PreCommit-GitHub-PR/job/PR-1911/1/console |
| versions | git=1.9.1 maven=3.0.5 findbugs=3.0.1 |
| Powered by | Apache Yetus 0.11.1 https://yetus.apache.org |


This message was automatically generated.



> Make RS abort call idempotent
> -----------------------------
>
>                 Key: HBASE-24564
>                 URL: https://issues.apache.org/jira/browse/HBASE-24564
>             Project: HBase
>          Issue Type: Bug
>          Components: regionserver
>    Affects Versions: 3.0.0-alpha-1, 2.3.0, 1.7.0
>            Reporter: Bharath Vissapragada
>            Assignee: Bharath Vissapragada
>            Priority: Major
>
> We noticed this in our deployment based on branch-1, but it affects other 
> branches too.
> 1. abort() is not idempotent. There can be multiple aborts that can 
> un-necessarily complicate the state machine. Following is the timeline of 
> actions.
> - HMaster detected that the RS lost its ZK session and started the SCP. This 
> was caused by ZK flakiness
> {noformat}
> 2020-06-11 01:08:39,110 DEBUG [ProcedureExecutor-34] master.DeadServer - 
> Started processing foo,60020,1591683150711; numProcessing=2
> 2020-06-11 01:08:39,110 INFO [ProcedureExecutor-34] 
> procedure.ServerCrashProcedure - Start processing crashed 
> foo,60020,1591683150711
> {noformat}
> - RS wakes up and attempts to report to master and receives a YouAreDead... 
> This triggers an abort
> {noformat}
> Caused by: 
> org.apache.hadoop.hbase.ipc.RemoteWithExtrasException(org.apache.hadoop.hbase.YouAreDeadException):
>  org.apache.hadoop.hbase.YouAreDeadException: Server REPORT rejected; 
> currently processing foo,60020,1591683150711 as dead server
>         at 
> org.apache.hadoop.hbase.master.ServerManager.checkIsDead(ServerManager.java:438)
>         at 
> org.apache.hadoop.hbase.master.ServerManager.regionServerReport(ServerManager.java:343)
>         at 
> org.apache.hadoop.hbase.master.MasterRpcServices.regionServerReport(MasterRpcServices.java:359)
>         at 
> org.apache.hadoop.hbase.protobuf.generated.RegionServerStatusProtos$RegionServerStatusService$2.callBlockingMethod(RegionServerStatusProtos.java:8617)
>         at org.apache.hadoop.hbase.ipc.RpcServer.call(RpcServer.java:2421)
>         at org.apache.hadoop.hbase.ipc.CallRunner.run(CallRunner.java:124)
>         at 
> org.apache.hadoop.hbase.ipc.RpcExecutor$Handler.run(RpcExecutor.java:311)
>         at 
> org.apache.hadoop.hbase.ipc.RpcExecutor$Handler.run(RpcExecutor.java:291)
>         at 
> org.apache.hadoop.hbase.ipc.AbstractRpcClient.onCallFinished(AbstractRpcClient.java:390)
>         at 
> org.apache.hadoop.hbase.ipc.AbstractRpcClient.access$100(AbstractRpcClient.java:94)
>         at 
> org.apache.hadoop.hbase.ipc.AbstractRpcClient$3.run(AbstractRpcClient.java:413)
>         at 
> org.apache.hadoop.hbase.ipc.AbstractRpcClient$3.run(AbstractRpcClient.java:409)
>         at org.apache.hadoop.hbase.ipc.Call.callComplete(Call.java:103)
>         at org.apache.hadoop.hbase.ipc.Call.setException(Call.java:118)
>         at 
> org.apache.hadoop.hbase.ipc.BlockingRpcConnection.readResponse(BlockingRpcConnection.java:600)
>         at 
> org.apache.hadoop.hbase.ipc.BlockingRpcConnection.run(BlockingRpcConnection.java:334)
>         ... 1 more
> {noformat}
> - After a few seconds, RS also realizes that it lost the ZK session and 
> initiates a second abort
> {noformat}
> 2020-06-11 01:08:50,321 FATAL [main-EventThread] regionserver.HRegionServer - 
> ABORTING region server foo,60020,1591683150711: 
> regionserver:60020-0x1725cd18ff3c55f, quorum=foo:2181,bar:2181,baz:2181 
> baseZNode=/hbase regionserver:60020-0x1725cd18ff3c55f received expired from 
> ZooKeeper, aborting
> org.apache.zookeeper.KeeperException$SessionExpiredException: KeeperErrorCode 
> = Session expired
>         at 
> org.apache.hadoop.hbase.zookeeper.ZooKeeperWatcher.connectionEvent(ZooKeeperWatcher.java:697)
>         at 
> org.apache.hadoop.hbase.zookeeper.ZooKeeperWatcher.process(ZooKeeperWatcher.java:629)
>         at 
> org.apache.hadoop.hbase.zookeeper.PendingWatcher.process(PendingWatcher.java:40)
>         at 
> org.apache.zookeeper.ClientCnxn$EventThread.processEvent(ClientCnxn.java:544)
>         at 
> org.apache.zookeeper.ClientCnxn$EventThread.run(ClientCnxn.java:519)
> {noformat}
> Overall, there were two sequences of aborts running at the same time. This 
> can be avoided by making abort idempotent.
> -2.  Abort timeout task doesn't init as expected.- (edited, see comments)



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

Reply via email to