[ 
https://issues.apache.org/jira/browse/HBASE-20038?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16371586#comment-16371586
 ] 

Hadoop QA commented on HBASE-20038:
-----------------------------------

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
13s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  0m  
0s{color} | {color:blue} Findbugs executables are not available. {color} |
| {color:green}+1{color} | {color:green} hbaseanti {color} | {color:green}  0m  
0s{color} | {color:green} Patch does not have any anti-patterns. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 1 new or modified test 
files. {color} |
|| || || || {color:brown} master Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  4m 
26s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
49s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  1m 
 7s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} shadedjars {color} | {color:green}  5m 
56s{color} | {color:green} branch has no errors when building our shaded 
downstream artifacts. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
28s{color} | {color:green} master passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  4m 
25s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
42s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  0m 
42s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  1m 
 4s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} shadedjars {color} | {color:green}  4m 
45s{color} | {color:green} patch has no errors when building our shaded 
downstream artifacts. {color} |
| {color:green}+1{color} | {color:green} hadoopcheck {color} | {color:green} 
18m 32s{color} | {color:green} Patch does not cause any errors with Hadoop 
2.6.5 2.7.4 or 3.0.0. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
27s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:red}-1{color} | {color:red} unit {color} | {color:red}113m 37s{color} 
| {color:red} hbase-server in the patch failed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
21s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black}151m 11s{color} | 
{color:black} {color} |
\\
\\
|| Reason || Tests ||
| Failed junit tests | hadoop.hbase.replication.TestReplicationDroppedTables |
|   | hadoop.hbase.replication.TestNamespaceReplication |
|   | hadoop.hbase.replication.TestReplicationKillMasterRSCompressed |
|   | hadoop.hbase.replication.TestReplicationChangingPeerRegionservers |
|   | hadoop.hbase.replication.TestMasterReplication |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hbase:eee3b01 |
| JIRA Issue | HBASE-20038 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12911385/HBASE-20038.patch |
| Optional Tests |  asflicense  javac  javadoc  unit  findbugs  shadedjars  
hadoopcheck  hbaseanti  checkstyle  compile  |
| uname | Linux 2258fc19e0de 3.13.0-139-generic #188-Ubuntu SMP Tue Jan 9 
14:43:09 UTC 2018 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | 
/home/jenkins/jenkins-slave/workspace/PreCommit-HBASE-Build/component/dev-support/hbase-personality.sh
 |
| git revision | master / 92d04d5751 |
| maven | version: Apache Maven 3.5.2 
(138edd61fd100ec658bfa2d307c43b76940a5d7d; 2017-10-18T07:58:13Z) |
| Default Java | 1.8.0_151 |
| unit | 
https://builds.apache.org/job/PreCommit-HBASE-Build/11598/artifact/patchprocess/patch-unit-hbase-server.txt
 |
|  Test Results | 
https://builds.apache.org/job/PreCommit-HBASE-Build/11598/testReport/ |
| Max. process+thread count | 4930 (vs. ulimit of 10000) |
| modules | C: hbase-server U: hbase-server |
| Console output | 
https://builds.apache.org/job/PreCommit-HBASE-Build/11598/console |
| Powered by | Apache Yetus 0.7.0   http://yetus.apache.org |


This message was automatically generated.



> TestLockProcedure.testTimeout is flakey
> ---------------------------------------
>
>                 Key: HBASE-20038
>                 URL: https://issues.apache.org/jira/browse/HBASE-20038
>             Project: HBase
>          Issue Type: Sub-task
>          Components: proc-v2, test
>            Reporter: Duo Zhang
>            Assignee: Duo Zhang
>            Priority: Major
>             Fix For: 2.0.0-beta-2
>
>         Attachments: HBASE-20038.patch
>
>
> The test is simple so it is easy to find out the problem.
> {noformat}
> 2018-02-21 04:53:32,230 INFO  [Time-limited test] hbase.ResourceChecker(148): 
> before: master.locking.TestLockProcedure#testTimeout Thread=218, 
> OpenFileDescriptor=853, MaxFileDescriptor=60000, SystemLoadAverage=5075, 
> ProcessCount=312, AvailableMemoryMB=5373
> 2018-02-21 04:53:32,234 WARN  [Time-limited test] 
> procedure2.ProcedureTestingUtility(146): Set Kill before store update to: 
> false
> 2018-02-21 04:53:32,278 DEBUG [Time-limited test] 
> procedure2.ProcedureExecutor(866): Stored pid=14, state=RUNNABLE; 
> org.apache.hadoop.hbase.master.locking.LockProcedure, namespace=namespace, 
> type=EXCLUSIVE
> 2018-02-21 04:53:32,285 DEBUG [Time-limited test] locking.LockProcedure(195): 
> Heartbeat pid=14, state=RUNNABLE; 
> org.apache.hadoop.hbase.master.locking.LockProcedure, namespace=namespace, 
> type=EXCLUSIVE
> 2018-02-21 04:53:32,286 DEBUG [PEWorker-1] locking.LockProcedure(312): LOCKED 
> pid=14, state=RUNNABLE; org.apache.hadoop.hbase.master.locking.LockProcedure, 
> namespace=namespace, type=EXCLUSIVE
> 2018-02-21 04:53:32,303 DEBUG [Time-limited test] 
> locking.TestLockProcedure(204): Proc id 14 acquired lock.
> 2018-02-21 04:53:32,298 INFO  [PEWorker-1] 
> procedure2.TimeoutExecutorThread(82): ADDED pid=14, state=WAITING_TIMEOUT; 
> org.apache.hadoop.hbase.master.locking.LockProcedure, namespace=namespace, 
> type=EXCLUSIVE; timeout=2000, timestamp=1519188814298
> 2018-02-21 04:53:33,303 DEBUG [Time-limited test] locking.LockProcedure(195): 
> Heartbeat pid=14, state=WAITING_TIMEOUT; 
> org.apache.hadoop.hbase.master.locking.LockProcedure, namespace=namespace, 
> type=EXCLUSIVE
> 2018-02-21 04:53:33,304 DEBUG [Time-limited test] 
> locking.TestLockProcedure(225): Proc id 14 : LOCKED.
> 2018-02-21 04:53:34,299 DEBUG [ProcExecTimeout] locking.LockProcedure(207): 
> Timeout failure ProcedureEvent for pid=14, state=WAITING_TIMEOUT; 
> org.apache.hadoop.hbase.master.locking.LockProcedure, namespace=namespace, 
> type=EXCLUSIVE, ready=false, [pid=14, state=WAITING_TIMEOUT; 
> org.apache.hadoop.hbase.master.locking.LockProcedure, namespace=namespace, 
> type=EXCLUSIVE]
> 2018-02-21 04:53:34,299 DEBUG [ProcExecTimeout] locking.LockProcedure(210): 
> Calling wake on ProcedureEvent for pid=14, state=RUNNABLE; 
> org.apache.hadoop.hbase.master.locking.LockProcedure, namespace=namespace, 
> type=EXCLUSIVE, ready=false, [pid=14, state=RUNNABLE; 
> org.apache.hadoop.hbase.master.locking.LockProcedure, namespace=namespace, 
> type=EXCLUSIVE]
> 2018-02-21 04:53:34,299 INFO  [PEWorker-1] 
> procedure2.TimeoutExecutorThread(82): ADDED pid=14, state=WAITING_TIMEOUT; 
> org.apache.hadoop.hbase.master.locking.LockProcedure, namespace=namespace, 
> type=EXCLUSIVE; timeout=2000, timestamp=1519188816299
> 2018-02-21 04:53:34,306 DEBUG [Time-limited test] locking.LockProcedure(195): 
> Heartbeat pid=14, state=WAITING_TIMEOUT; 
> org.apache.hadoop.hbase.master.locking.LockProcedure, namespace=namespace, 
> type=EXCLUSIVE
> 2018-02-21 04:53:34,306 DEBUG [Time-limited test] 
> locking.TestLockProcedure(225): Proc id 14 : LOCKED.
> 2018-02-21 04:53:34,766 WARN  [HBase-Metrics2-1] impl.MetricsConfig(125): 
> Cannot locate configuration: tried 
> hadoop-metrics2-hbase.properties,hadoop-metrics2.properties
> 2018-02-21 04:53:36,299 DEBUG [ProcExecTimeout] locking.LockProcedure(207): 
> Timeout failure ProcedureEvent for pid=14, state=WAITING_TIMEOUT; 
> org.apache.hadoop.hbase.master.locking.LockProcedure, namespace=namespace, 
> type=EXCLUSIVE, ready=false, [pid=14, state=WAITING_TIMEOUT; 
> org.apache.hadoop.hbase.master.locking.LockProcedure, namespace=namespace, 
> type=EXCLUSIVE]
> 2018-02-21 04:53:36,299 DEBUG [ProcExecTimeout] locking.LockProcedure(210): 
> Calling wake on ProcedureEvent for pid=14, state=RUNNABLE; 
> org.apache.hadoop.hbase.master.locking.LockProcedure, namespace=namespace, 
> type=EXCLUSIVE, ready=false, [pid=14, state=RUNNABLE; 
> org.apache.hadoop.hbase.master.locking.LockProcedure, namespace=namespace, 
> type=EXCLUSIVE]
> 2018-02-21 04:53:36,299 INFO  [PEWorker-1] 
> procedure2.TimeoutExecutorThread(82): ADDED pid=14, state=WAITING_TIMEOUT; 
> org.apache.hadoop.hbase.master.locking.LockProcedure, namespace=namespace, 
> type=EXCLUSIVE; timeout=2000, timestamp=1519188818299
> 2018-02-21 04:53:38,299 DEBUG [ProcExecTimeout] locking.LockProcedure(207): 
> Timeout failure ProcedureEvent for pid=14, state=WAITING_TIMEOUT; 
> org.apache.hadoop.hbase.master.locking.LockProcedure, namespace=namespace, 
> type=EXCLUSIVE, ready=false, [pid=14, state=WAITING_TIMEOUT; 
> org.apache.hadoop.hbase.master.locking.LockProcedure, namespace=namespace, 
> type=EXCLUSIVE]
> 2018-02-21 04:53:38,299 DEBUG [ProcExecTimeout] locking.LockProcedure(210): 
> Calling wake on ProcedureEvent for pid=14, state=RUNNABLE; 
> org.apache.hadoop.hbase.master.locking.LockProcedure, namespace=namespace, 
> type=EXCLUSIVE, ready=false, [pid=14, state=RUNNABLE; 
> org.apache.hadoop.hbase.master.locking.LockProcedure, namespace=namespace, 
> type=EXCLUSIVE]
> 2018-02-21 04:53:38,310 DEBUG [Time-limited test] locking.LockProcedure(195): 
> Heartbeat pid=14, state=RUNNABLE; 
> org.apache.hadoop.hbase.master.locking.LockProcedure, namespace=namespace, 
> type=EXCLUSIVE
> 2018-02-21 04:53:38,310 INFO  [PEWorker-1] 
> procedure2.TimeoutExecutorThread(82): ADDED pid=14, state=WAITING_TIMEOUT; 
> org.apache.hadoop.hbase.master.locking.LockProcedure, namespace=namespace, 
> type=EXCLUSIVE; timeout=2000, timestamp=1519188820310
> 2018-02-21 04:53:38,311 WARN  [Time-limited test] 
> procedure2.ProcedureTestingUtility(146): Set Kill before store update to: 
> false
> 2018-02-21 04:53:38,342 DEBUG [PEWorker-1] locking.LockProcedure(242): 
> UNLOCKED pid=14, state=RUNNABLE; 
> org.apache.hadoop.hbase.master.locking.LockProcedure, namespace=namespace, 
> type=EXCLUSIVE
> 2018-02-21 04:53:38,366 INFO  [PEWorker-1] 
> procedure2.ProcedureExecutor(1247): Finished pid=14, state=SUCCESS; 
> org.apache.hadoop.hbase.master.locking.LockProcedure, namespace=namespace, 
> type=EXCLUSIVE in 6.0920sec
> 2018-02-21 04:53:38,628 INFO  [Time-limited test] hbase.ResourceChecker(172): 
> after: master.locking.TestLockProcedure#testTimeout Thread=209 (was 218), 
> OpenFileDescriptor=851 (was 853), MaxFileDescriptor=60000 (was 60000), 
> SystemLoadAverage=5045 (was 5075), ProcessCount=314 (was 312) - ProcessCount 
> LEAK? -, AvailableMemoryMB=5071 (was 5373)
> {noformat}
> The problem is that, sleep 2 * HEARTBEAT_TIMEOUT is not safe. It is possible 
> that after the first TIMEOUT the procedure has not been timeout yet and then 
> the check for unlocked and the actual timeout processing will arrive at 
> almost the same time. You can see the above log, we are going to mark the 
> procedure as timeout at 04:53:38,299 and the heartbeat for checking unlock 
> arrives at 04:53:38,310.
> I think increase the wait time can solve the problem.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

Reply via email to