[
https://issues.apache.org/jira/browse/HBASE-22289?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16826579#comment-16826579
]
HBase QA commented on HBASE-22289:
----------------------------------
| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 52s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} hbaseanti {color} | {color:green} 0m 0s{color} | {color:green} Patch does not have any anti-patterns. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:orange}-0{color} | {color:orange} test4tests {color} | {color:orange} 0m 0s{color} | {color:orange} The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. {color} |
|| || || || {color:brown} branch-2.1 Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 5m 39s{color} | {color:green} branch-2.1 passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 57s{color} | {color:green} branch-2.1 passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 1m 23s{color} | {color:green} branch-2.1 passed {color} |
| {color:green}+1{color} | {color:green} shadedjars {color} | {color:green} 4m 52s{color} | {color:green} branch has no errors when building our shaded downstream artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 2m 31s{color} | {color:green} branch-2.1 passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 38s{color} | {color:green} branch-2.1 passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 4m 51s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 1m 3s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 1m 3s{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} checkstyle {color} | {color:red} 1m 18s{color} | {color:red} hbase-server: The patch generated 1 new + 15 unchanged - 14 fixed = 16 total (was 29) {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} shadedjars {color} | {color:green} 4m 52s{color} | {color:green} patch has no errors when building our shaded downstream artifacts. {color} |
| {color:green}+1{color} | {color:green} hadoopcheck {color} | {color:green} 10m 8s{color} | {color:green} Patch does not cause any errors with Hadoop 2.7.4 or 3.0.0. {color} |
| {color:red}-1{color} | {color:red} findbugs {color} | {color:red} 2m 42s{color} | {color:red} hbase-server generated 1 new + 0 unchanged - 0 fixed = 1 total (was 0) {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 38s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:red}-1{color} | {color:red} unit {color} | {color:red}179m 18s{color} | {color:red} hbase-server in the patch failed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 28s{color} | {color:green} The patch does not generate ASF License warnings. {color} |
| {color:black}{color} | {color:black} {color} | {color:black}222m 56s{color} | {color:black} {color} |
\\
\\
|| Reason || Tests ||
| FindBugs | module:hbase-server |
| | Switch statement found in org.apache.hadoop.hbase.regionserver.handler.WALSplitterHandler.process() where one case falls through to the next case At WALSplitterHandler.java:[lines 84-87] |
| Failed junit tests | hadoop.hbase.quotas.TestSpaceQuotas |
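The FindBugs entry in the table above flags a switch fall-through. The following is a minimal sketch of that pattern and one way to resolve it. It is not the code from WALSplitterHandler.java; the `Outcome` enum and both method names are hypothetical and exist only for illustration.

```java
class FallThroughSketch {
    // Hypothetical outcomes, not the real HBase status enum.
    enum Outcome { DONE, RESIGNED, ERR }

    // The flagged shape: DONE has no break, so control continues into RESIGNED.
    static String withFallThrough(Outcome o) {
        StringBuilder actions = new StringBuilder();
        switch (o) {
            case DONE:
                actions.append("done;");
                // no break here: execution falls through to the next case
            case RESIGNED:
                actions.append("resigned;");
                break;
            default:
                actions.append("err;");
                break;
        }
        return actions.toString();
    }

    // Same switch with an explicit break after each case.
    static String withBreak(Outcome o) {
        StringBuilder actions = new StringBuilder();
        switch (o) {
            case DONE:
                actions.append("done;");
                break;
            case RESIGNED:
                actions.append("resigned;");
                break;
            default:
                actions.append("err;");
                break;
        }
        return actions.toString();
    }

    public static void main(String[] args) {
        System.out.println(withFallThrough(Outcome.DONE)); // prints "done;resigned;"
        System.out.println(withBreak(Outcome.DONE));       // prints "done;"
    }
}
```

If a fall-through is intentional, a comment at the end of the case stating so makes the intent clear to reviewers; whether the case at lines 84-87 is intentional cannot be determined from this report.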
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce base: https://builds.apache.org/job/PreCommit-HBASE-Build/190/artifact/patchprocess/Dockerfile |
| JIRA Issue | HBASE-22289 |
| JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12967068/HBASE-22289.03-branch-2.1.patch |
| Optional Tests | dupname asflicense javac javadoc unit findbugs shadedjars hadoopcheck hbaseanti checkstyle compile |
| uname | Linux 7507ee5a95f4 4.4.0-143-generic #169~14.04.2-Ubuntu SMP Wed Feb 13 15:00:41 UTC 2019 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | dev-support/hbase-personality.sh |
| git revision | branch-2.1 / c7a70dfaba |
| maven | version: Apache Maven 3.5.4 (1edded0938998edf8bf061f1ceb3cfdeccf443fe; 2018-06-17T18:33:14Z) |
| Default Java | 1.8.0_181 |
| findbugs | v3.1.11 |
| checkstyle | https://builds.apache.org/job/PreCommit-HBASE-Build/190/artifact/patchprocess/diff-checkstyle-hbase-server.txt |
| findbugs | https://builds.apache.org/job/PreCommit-HBASE-Build/190/artifact/patchprocess/new-findbugs-hbase-server.html |
| unit | https://builds.apache.org/job/PreCommit-HBASE-Build/190/artifact/patchprocess/patch-unit-hbase-server.txt |
| Test Results | https://builds.apache.org/job/PreCommit-HBASE-Build/190/testReport/ |
| Max. process+thread count | 5263 (vs. ulimit of 10000) |
| modules | C: hbase-server U: hbase-server |
| Console output | https://builds.apache.org/job/PreCommit-HBASE-Build/190/console |
| Powered by | Apache Yetus 0.9.0 http://yetus.apache.org |
This message was automatically generated.
> WAL-based log splitting resubmit threshold may result in a task being stuck
> forever
> -----------------------------------------------------------------------------------
>
> Key: HBASE-22289
> URL: https://issues.apache.org/jira/browse/HBASE-22289
> Project: HBase
> Issue Type: Bug
> Affects Versions: 2.1.0, 1.5.0
> Reporter: Sergey Shelukhin
> Assignee: Sergey Shelukhin
> Priority: Major
> Fix For: 2.1.5
>
> Attachments: HBASE-22289.01-branch-2.1.patch,
> HBASE-22289.02-branch-2.1.patch, HBASE-22289.03-branch-2.1.patch
>
>
> Not sure if this is handled better in procedure-based WAL splitting; in any
> case it affects versions before that.
> The problem seems to be not in ZK as such but in the master's internal state
> tracking.
> Master:
> {noformat}
> 2019-04-21 01:49:49,584 INFO
> [master/<master>:17000.splitLogManager..Chore.1]
> coordination.SplitLogManagerCoordination: Resubmitting task
> <path>.1555831286638
> {noformat}
> On worker-rs, the split fails:
> {noformat}
> ....
> 2019-04-21 02:05:31,774 INFO
> [RS_LOG_REPLAY_OPS-regionserver/<worker-rs>:17020-1] wal.WALSplitter:
> Processed 24 edits across 2 regions; edits skipped=457; log
> file=<path>.1555831286638, length=2156363702, corrupted=false, progress
> failed=true
> {noformat}
> Master (not sure about the delay of the acquired-message; at any rate it
> seems to detect the failure fine from this server)
> {noformat}
> 2019-04-21 02:11:14,928 INFO [main-EventThread]
> coordination.SplitLogManagerCoordination: Task <path>.1555831286638 acquired
> by <worker-rs>,17020,1555539815097
> 2019-04-21 02:19:41,264 INFO
> [master/<master>:17000.splitLogManager..Chore.1]
> coordination.SplitLogManagerCoordination: Skipping resubmissions of task
> <path>.1555831286638 because threshold 3 reached
> {noformat}
> After that, the task is stuck in limbo forever with the old worker and is
> never resubmitted.
> The RS never logs anything else for this task.
> Killing the RS on the worker unblocked the task, and another server did the
> split very quickly, so it seems the master doesn't clear the worker name in
> its internal state when hitting the threshold. The master was never
> restarted, so restarting it might also have cleared the state.
> The following is extracted from SplitLogManager log messages; note the times.
> {noformat}
> 2019-04-21 02:2 1555831286638=last_update = 1555837874928 last_version = 11
> cur_worker_name = <worker-rs>,17020,1555539815097 status = in_progress
> incarnation = 3 resubmits = 3 batch = installed = 24 done = 3 error = 20,
> ....
> 2019-04-22 11:1 1555831286638=last_update = 1555837874928 last_version = 11
> cur_worker_name = <worker-rs>,17020,1555539815097 status = in_progress
> incarnation = 3 resubmits = 3 batch = installed = 24 done = 3 error = 20}
> {noformat}
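The stuck state described in the issue can be modeled with a small sketch. This is a simplified, hypothetical model and not the HBase implementation: the `Task` class, `maybeResubmit`, and the `workerIsDead` flag are illustrative names. Only the threshold value of 3 and the `cur_worker_name` and `resubmits` fields are taken from the quoted logs. The assumption that a dead worker bypasses the threshold is inferred from the report that killing the RS unblocked the task.

```java
class ResubmitThresholdSketch {
    // Matches "threshold 3 reached" in the quoted master log.
    static final int RESUBMIT_THRESHOLD = 3;

    // Hypothetical stand-in for the master's per-task bookkeeping.
    static class Task {
        String curWorkerName;
        int resubmits;
        String status = "in_progress";

        Task(String curWorkerName, int resubmits) {
            this.curWorkerName = curWorkerName;
            this.resubmits = resubmits;
        }
    }

    // Returns true if the task was handed back for another worker to pick up.
    static boolean maybeResubmit(Task task, boolean workerIsDead) {
        if (!workerIsDead) {
            if (task.resubmits >= RESUBMIT_THRESHOLD) {
                // Skipped: worker name and status stay as they were, so a live
                // worker that silently gave up keeps "owning" the task.
                return false;
            }
            task.resubmits++;
        }
        // Assumption: a resubmission clears the owner so another worker can acquire it.
        task.curWorkerName = null;
        task.status = "unassigned";
        return true;
    }

    public static void main(String[] args) {
        Task task = new Task("worker-rs", 3);
        System.out.println(maybeResubmit(task, false)); // prints "false"
        System.out.println(task.curWorkerName);         // prints "worker-rs"
        System.out.println(maybeResubmit(task, true));  // prints "true"
        System.out.println(task.status);                // prints "unassigned"
    }
}
```

In this model the task only moves again once the worker is treated as dead, which mirrors the observed behavior; it says nothing about how the attached patches address the problem.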