[jira] [Commented] (HBASE-18248) Warn if monitored task has been tied up beyond a configurable threshold
[ https://issues.apache.org/jira/browse/HBASE-18248?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16109262#comment-16109262 ]

Andrew Purtell commented on HBASE-18248:

Test failures are not related.

> Warn if monitored task has been tied up beyond a configurable threshold
> ---
>
> Key: HBASE-18248
> URL: https://issues.apache.org/jira/browse/HBASE-18248
> Project: HBase
> Issue Type: Improvement
> Reporter: Andrew Purtell
> Assignee: Andrew Purtell
> Fix For: 2.0.0, 3.0.0, 1.4.0
>
> Attachments: HBASE-18248-branch-1.patch, HBASE-18248-branch-1.patch,
> HBASE-18248.patch, HBASE-18248.patch
>
>
> Warn if monitored task has been tied up beyond a configurable threshold. We
> especially want to do this for RPC tasks. Use a separate threshold for
> warning about stuck RPC tasks versus other types of tasks.

--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
[jira] [Commented] (HBASE-18248) Warn if monitored task has been tied up beyond a configurable threshold
[ https://issues.apache.org/jira/browse/HBASE-18248?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16108323#comment-16108323 ]

Hadoop QA commented on HBASE-18248:
---

| (x) *-1 overall* |

|| Vote || Subsystem || Runtime || Comment ||
| 0 | reexec | 0m 20s | Docker mode activated. |
| +1 | hbaseanti | 0m 0s | Patch does not have any anti-patterns. |
| +1 | @author | 0m 0s | The patch does not contain any @author tags. |
| +1 | test4tests | 0m 0s | The patch appears to include 1 new or modified test files. |
| +1 | mvninstall | 1m 58s | branch-1 passed |
| +1 | compile | 0m 40s | branch-1 passed with JDK v1.8.0_131 |
| +1 | compile | 0m 34s | branch-1 passed with JDK v1.7.0_131 |
| +1 | checkstyle | 0m 56s | branch-1 passed |
| +1 | mvneclipse | 0m 18s | branch-1 passed |
| +1 | findbugs | 2m 2s | branch-1 passed |
| +1 | javadoc | 0m 26s | branch-1 passed with JDK v1.8.0_131 |
| +1 | javadoc | 0m 34s | branch-1 passed with JDK v1.7.0_131 |
| +1 | mvninstall | 0m 42s | the patch passed |
| +1 | compile | 0m 32s | the patch passed with JDK v1.8.0_131 |
| +1 | javac | 0m 32s | the patch passed |
| +1 | compile | 0m 35s | the patch passed with JDK v1.7.0_131 |
| +1 | javac | 0m 35s | the patch passed |
| +1 | checkstyle | 0m 55s | the patch passed |
| +1 | mvneclipse | 0m 17s | the patch passed |
| +1 | whitespace | 0m 0s | The patch has no whitespace issues. |
| +1 | hadoopcheck | 15m 28s | The patch does not cause any errors with Hadoop 2.4.0 2.4.1 2.5.0 2.5.1 2.5.2 2.6.1 2.6.2 2.6.3 2.7.1. |
| +1 | findbugs | 2m 11s | the patch passed |
| +1 | javadoc | 0m 26s | the patch passed with JDK v1.8.0_131 |
| +1 | javadoc | 0m 33s | the patch passed with JDK v1.7.0_131 |
| -1 | unit | 98m 0s | hbase-server in the patch failed. |
| +1 | asflicense | 0m 22s | The patch does not generate ASF License warnings. |
| | | 128m 19s | |

|| Reason || Tests ||
| Failed junit tests | hadoop.hbase.regionserver.TestSCVFWithMiniCluster |
| | hadoop.hbase.client.TestClientScannerRPCTimeout |
| | hadoop.hbase.regionserver.TestRSKilledWhenInitializing |
| | hadoop.hbase.security.visibility.TestVisibilityLabelsWithACL |
| | hadoop.hbase.TestZooKeeper |
| Timed out junit tests | org.apache.hadoop.hbase.security.access.TestCellACLs |
| | org.apache.hadoop.hbase.mapreduce.TestTableSnapshotInputFormat |

|| Subsystem || Report/Notes ||
| Docker | Client=1.12.3 Server=1.12.3 Image:yetus/hbase:6f1cc2c |
| JIRA Issue | HBASE-18248 |
| JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12879757/HBASE-18248-branch-1.patch
[jira] [Commented] (HBASE-18248) Warn if monitored task has been tied up beyond a configurable threshold
[ https://issues.apache.org/jira/browse/HBASE-18248?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16093367#comment-16093367 ]

Andrew Purtell commented on HBASE-18248:

Ok, I will make the following improvements and come back with an updated patch, hopefully this week:
* Add a default of 0 that means never warn
* Make a new subclass for compact+split tasks like there is for RPC tasks and separate configuration options for them

> Warn if monitored task has been tied up beyond a configurable threshold
> ---
>
> Key: HBASE-18248
> URL: https://issues.apache.org/jira/browse/HBASE-18248
> Project: HBase
> Issue Type: Improvement
> Reporter: Andrew Purtell
> Assignee: Andrew Purtell
> Fix For: 2.0.0, 3.0.0, 1.4.0, 1.3.2
>
> Attachments: HBASE-18248-branch-1.3.patch,
> HBASE-18248-branch-1.3.patch, HBASE-18248-branch-1.patch,
> HBASE-18248-branch-1.patch, HBASE-18248-branch-2.patch,
> HBASE-18248-branch-2.patch, HBASE-18248.patch, HBASE-18248.patch
>
>
> Warn if monitored task has been tied up beyond a configurable threshold. We
> especially want to do this for RPC tasks. Use a separate threshold for
> warning about stuck RPC tasks versus other types of tasks.

--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
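
The two changes proposed in the comment above (a threshold of 0 meaning "never warn", and separate thresholds per task type) could look roughly like the sketch below. This is only an illustration, not the code from the attached patches: `TaskKind`, `StuckTaskPolicy`, and the constructor parameters are hypothetical names, and the real patch works with HBase's monitored task classes and its `Configuration` object instead.

```java
/** Hypothetical task categories for illustration; not HBase's actual class hierarchy. */
enum TaskKind { RPC, COMPACT_SPLIT, OTHER }

/** Decides whether a task has been running long enough to warrant a warning. */
final class StuckTaskPolicy {
  private final long rpcWarnMs;
  private final long compactSplitWarnMs;
  private final long otherWarnMs;

  // Each threshold is in milliseconds; 0 (or a negative value) disables warnings
  // for that kind of task.
  StuckTaskPolicy(long rpcWarnMs, long compactSplitWarnMs, long otherWarnMs) {
    this.rpcWarnMs = rpcWarnMs;
    this.compactSplitWarnMs = compactSplitWarnMs;
    this.otherWarnMs = otherWarnMs;
  }

  /** Returns the configured threshold for the given kind of task. */
  long thresholdFor(TaskKind kind) {
    switch (kind) {
      case RPC:
        return rpcWarnMs;
      case COMPACT_SPLIT:
        return compactSplitWarnMs;
      default:
        return otherWarnMs;
    }
  }

  /** True only if warnings are enabled for this kind and the task exceeded its threshold. */
  boolean shouldWarn(TaskKind kind, long startMs, long nowMs) {
    long threshold = thresholdFor(kind);
    if (threshold <= 0) {
      return false; // never warn
    }
    return nowMs - startMs > threshold;
  }
}
```

With this shape, an operator who only cares about stuck RPCs could set a small RPC threshold and leave the others at 0, which keeps long-running compactions from producing noise.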
[jira] [Commented] (HBASE-18248) Warn if monitored task has been tied up beyond a configurable threshold
[ https://issues.apache.org/jira/browse/HBASE-18248?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16075139#comment-16075139 ]

Andrew Purtell commented on HBASE-18248:

HadoopQA isn't running
[jira] [Commented] (HBASE-18248) Warn if monitored task has been tied up beyond a configurable threshold
[ https://issues.apache.org/jira/browse/HBASE-18248?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16073938#comment-16073938 ]

Andrew Purtell commented on HBASE-18248:

There is a separate setting for RPC tasks because those are what we really care about: if they get stuck, we want to dump debug info about them into the logs. Honestly, I don't care about other task types. We could restrict this change to RPC tasks. I can add a default of 0 or -1 that means never warn, so it's completely optional. Another option is to make a new subclass for compaction tasks, like there is for RPC tasks, with separate configuration options for them.
[jira] [Commented] (HBASE-18248) Warn if monitored task has been tied up beyond a configurable threshold
[ https://issues.apache.org/jira/browse/HBASE-18248?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16073183#comment-16073183 ]

Allan Yang commented on HBASE-18248:

I have one question: how do you define the warn time for each monitored task? A big compaction can last several hours, but for a task like opening a region, it is not normal if it lasts more than several minutes.

Another suggestion: can we reuse the toMap() method in MonitoredRPCHandlerImpl when enriching its toString() method?
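
The second suggestion above, building toString() on top of an existing toMap(), might be sketched as follows. `RpcTaskSketch` and its three fields are made up for illustration; the real MonitoredRPCHandlerImpl tracks different state, and this is not the code from the patch.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical stand-in for an RPC task; the real class has different fields. */
final class RpcTaskSketch {
  private final String method;
  private final String client;
  private final long startTimeMs;

  RpcTaskSketch(String method, String client, long startTimeMs) {
    this.method = method;
    this.client = client;
    this.startTimeMs = startTimeMs;
  }

  /** Single source of truth for the task's reportable state. */
  Map<String, Object> toMap() {
    // LinkedHashMap keeps insertion order, so the output is stable.
    Map<String, Object> m = new LinkedHashMap<>();
    m.put("method", method);
    m.put("client", client);
    m.put("starttimems", startTimeMs);
    return m;
  }

  /** Reuses toMap() so the log line and the map view cannot drift apart. */
  @Override
  public String toString() {
    return "RpcTaskSketch" + toMap();
  }
}
```

The appeal of this approach is that any field later added to toMap() shows up in the warning log line without a second edit.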
[jira] [Commented] (HBASE-18248) Warn if monitored task has been tied up beyond a configurable threshold
[ https://issues.apache.org/jira/browse/HBASE-18248?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16072984#comment-16072984 ]

Andrew Purtell commented on HBASE-18248:

Any concerns about the updated patches? Let me kick precommit again.
[jira] [Commented] (HBASE-18248) Warn if monitored task has been tied up beyond a configurable threshold
[ https://issues.apache.org/jira/browse/HBASE-18248?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16059928#comment-16059928 ]

Andrew Purtell commented on HBASE-18248:

Sure, that's fine. I'm off for a few days. Back with an updated patch next week.

> Warn if monitored task has been tied up beyond a configurable threshold
> ---
>
> Key: HBASE-18248
> URL: https://issues.apache.org/jira/browse/HBASE-18248
> Project: HBase
> Issue Type: Improvement
> Reporter: Andrew Purtell
> Assignee: Andrew Purtell
> Fix For: 2.0.0, 3.0.0, 1.4.0, 1.3.2
>
> Attachments: HBASE-18248-branch-1.3.patch,
> HBASE-18248-branch-1.patch, HBASE-18248-branch-2.patch, HBASE-18248.patch
>
>
> Warn if monitored task has been tied up beyond a configurable threshold. We
> especially want to do this for RPC tasks. Use a separate threshold for
> warning about stuck RPC tasks versus other types of tasks.

--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
[jira] [Commented] (HBASE-18248) Warn if monitored task has been tied up beyond a configurable threshold
[ https://issues.apache.org/jira/browse/HBASE-18248?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16058697#comment-16058697 ]

Allan Yang commented on HBASE-18248:

Just a minor suggestion: why do we need to pass a conf every time?
{code}
   /**
    * Get singleton instance.
    * TODO this would be better off scoped to a single daemon
    */
-  public static synchronized TaskMonitor get() {
+  public static synchronized TaskMonitor get(Configuration conf) {
     if (instance == null) {
-      instance = new TaskMonitor();
+      instance = new TaskMonitor(conf);
     }
     return instance;
   }
{code}
TaskMonitor is a singleton. Can we consider creating a Configuration when creating the TaskMonitor, so we don't need to pass a conf every time?
{code}
  TaskMonitor() {
    Configuration conf = HBaseConfiguration.create();
    ..
  }
{code}
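
One consequence of the `get(Configuration conf)` signature in the diff above is that only the first caller's configuration is ever used; later arguments are silently ignored because the instance already exists. The self-contained sketch below shows that behavior with `java.util.Properties` standing in for HBase's `Configuration`. `MonitorSketch` and the key `sketch.warn.ms` are invented for illustration and are not part of the patch.

```java
import java.util.Properties;

/** Hypothetical singleton mirroring the get(conf) shape from the diff. */
final class MonitorSketch {
  private static MonitorSketch instance;
  private final long warnMs;

  private MonitorSketch(Properties conf) {
    // "sketch.warn.ms" is a made-up key; it defaults to 0 when absent.
    this.warnMs = Long.parseLong(conf.getProperty("sketch.warn.ms", "0"));
  }

  /** Creates the instance on first call; conf is ignored on every later call. */
  static synchronized MonitorSketch get(Properties conf) {
    if (instance == null) {
      instance = new MonitorSketch(conf);
    }
    return instance;
  }

  long warnMs() {
    return warnMs;
  }
}
```

Calling `MonitorSketch.get(a)` and then `MonitorSketch.get(b)` returns the same object configured from `a`. That is part of why building the configuration inside the constructor, as suggested, can be the less surprising API, at the cost of making the singleton harder to configure differently in tests.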
[jira] [Commented] (HBASE-18248) Warn if monitored task has been tied up beyond a configurable threshold
[ https://issues.apache.org/jira/browse/HBASE-18248?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16057935#comment-16057935 ]

Andrew Purtell commented on HBASE-18248:

Test failure not related to patch. Findbugs issues not introduced by this patch.
[jira] [Commented] (HBASE-18248) Warn if monitored task has been tied up beyond a configurable threshold
[ https://issues.apache.org/jira/browse/HBASE-18248?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16056962#comment-16056962 ]

Hadoop QA commented on HBASE-18248:
---

| (x) *-1 overall* |

|| Vote || Subsystem || Runtime || Comment ||
| 0 | reexec | 0m 10s | Docker mode activated. |
| +1 | hbaseanti | 0m 0s | Patch does not have any anti-patterns. |
| +1 | @author | 0m 0s | The patch does not contain any @author tags. |
| +1 | test4tests | 0m 0s | The patch appears to include 3 new or modified test files. |
| +1 | mvninstall | 3m 26s | master passed |
| +1 | compile | 0m 37s | master passed |
| +1 | checkstyle | 0m 46s | master passed |
| +1 | mvneclipse | 0m 15s | master passed |
| -1 | findbugs | 2m 47s | hbase-server in master has 12 extant Findbugs warnings. |
| +1 | javadoc | 0m 28s | master passed |
| +1 | mvninstall | 0m 50s | the patch passed |
| +1 | compile | 0m 39s | the patch passed |
| +1 | javac | 0m 39s | the patch passed |
| +1 | checkstyle | 0m 43s | the patch passed |
| +1 | mvneclipse | 0m 13s | the patch passed |
| +1 | whitespace | 0m 0s | The patch has no whitespace issues. |
| +1 | hadoopcheck | 28m 49s | Patch does not cause any errors with Hadoop 2.6.1 2.6.2 2.6.3 2.6.4 2.6.5 2.7.1 2.7.2 2.7.3 or 3.0.0-alpha3. |
| +1 | findbugs | 2m 36s | the patch passed |
| +1 | javadoc | 0m 28s | the patch passed |
| -1 | unit | 121m 49s | hbase-server in the patch failed. |
| +1 | asflicense | 0m 17s | The patch does not generate ASF License warnings. |
| | | 165m 11s | |

|| Reason || Tests ||
| Failed junit tests | hadoop.hbase.master.procedure.TestMasterProcedureWalLease |

|| Subsystem || Report/Notes ||
| Docker | Client=1.12.3 Server=1.12.3 Image:yetus/hbase:757bf37 |
| JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12873770/HBASE-18248.patch |
| JIRA Issue | HBASE-18248 |
| Optional Tests | asflicense javac javadoc unit findbugs hadoopcheck hbaseanti checkstyle compile |
| uname | Linux a3b92bd6f800 4.4.0-43-generic #63-Ubuntu SMP Wed Oct 12 13:48:03 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /home/jenkins/jenkins-slave/workspace/PreCommit-HBASE-Build@2/component/dev-support/hbase-personality.sh |
| git revision | master / 5b485d1 |
| Default Java | 1.8.0_131 |
| findbugs | v3.1.0-RC1 |
| findbugs | https://builds.apache.org/job/PreCommit-HBASE-Build/7264/artifact/patchprocess/branch-findbugs-hbase-server-warnings.html |
| unit | https://builds.apache.org/job/PreCommit-HBASE-Build/7264/artifact/patchprocess/patch-unit-hbase-server.txt |
| unit test logs | https://builds.apache.org/job/PreCommit-HBASE-Build/7264/artifact/patchprocess/patch-unit-hbase-server.txt |
| Test Results | https://builds.apache.org/job/PreCommit-HBASE-Build/7264/testReport/ |
| modules | C: hbase-server U: hbase-server |
| Console output | https://builds.apache.org/job/PreCommit-HBASE-Build/7264/console |
| Powered by | Apache Yetus 0.3.0 http://yetus.apache.org |

This