[jira] [Commented] (HIVE-23443) LLAP speculative task pre-emption seems to be not working
[ https://issues.apache.org/jira/browse/HIVE-23443?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17109402#comment-17109402 ]

Hive QA commented on HIVE-23443:

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/13003103/HIVE-23443.3.patch

{color:green}SUCCESS:{color} +1 due to 1 test(s) being added or modified.

{color:green}SUCCESS:{color} +1 due to 17269 tests passed

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/22408/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/22408/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-22408/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.YetusPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
{noformat}

This message is automatically generated.

ATTACHMENT ID: 13003103 - PreCommit-HIVE-Build

> LLAP speculative task pre-emption seems to be not working
> ---------------------------------------------------------
>
>                 Key: HIVE-23443
>                 URL: https://issues.apache.org/jira/browse/HIVE-23443
>             Project: Hive
>          Issue Type: Bug
>            Reporter: Prasanth Jayachandran
>            Assignee: Prasanth Jayachandran
>            Priority: Major
>              Labels: pull-request-available
>         Attachments: HIVE-23443.1.patch, HIVE-23443.2.patch, HIVE-23443.3.patch
>          Time Spent: 40m
>  Remaining Estimate: 0h
>
> I think after HIVE-23210 we are getting a stable sort order and it is causing
> pre-emption to not work in certain cases.
> {code:java}
> "attempt_1589167813851__119_01_08_0 (hive_20200511055921_89598f09-19f1-4969-ab7a-82e2dd796273-119/Map 1, started at 2020-05-11 05:59:22, in preemption queue, can finish)",
> "attempt_1589167813851_0008_84_01_08_1 (hive_20200511055928_7ae29ca3-e67d-4d1f-b193-05651023b503-84/Map 1, started at 2020-05-11 06:00:23, in preemption queue, can finish)"
> {code}
> The scheduler only peeks at the pre-emption queue and checks whether the
> head task is non-finishable.
> [https://github.com/apache/hive/blob/master/llap-server/src/java/org/apache/hadoop/hive/llap/daemon/impl/TaskExecutorService.java#L420]
> In the above case, all tasks are speculative, but a state change does not
> trigger re-ordering of the pre-emption queue, so peek() always returns a
> canFinish task even though non-finishable tasks are in the queue.

--
This message was sent by Atlassian Jira
(v8.3.4#803005)
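The failure mode described in the issue can be reproduced outside Hive. The sketch below is a minimal illustration, not Hive's code: `DemoTask`, `StaleHeadDemo`, and `VICTIM_ORDER` are hypothetical names, and `java.util.PriorityQueue` stands in for whatever queue `TaskExecutorService` actually uses. It shows that a heap-ordered queue does not re-order itself when a queued element's sort key changes, so `peek()` keeps returning a task that can finish. Removing and re-adding the element is one common remedy; whether the attached patches take exactly this approach is not stated in the thread.

```java
import java.util.Comparator;
import java.util.PriorityQueue;

/** Hypothetical stand-in for a running task; this is not Hive's task wrapper. */
class DemoTask {
    final String name;
    boolean canFinish;

    DemoTask(String name, boolean canFinish) {
        this.name = name;
        this.canFinish = canFinish;
    }
}

class StaleHeadDemo {
    // Assumed ordering: non-finishable tasks (canFinish == false) sort first,
    // so they are the preferred pre-emption victims. Name breaks ties.
    static final Comparator<DemoTask> VICTIM_ORDER =
        Comparator.comparing((DemoTask t) -> t.canFinish).thenComparing(t -> t.name);

    /** Changes a queued task in place; the heap is never told, so the head is stale. */
    static DemoTask headAfterInPlaceUpdate() {
        PriorityQueue<DemoTask> queue = new PriorityQueue<>(VICTIM_ORDER);
        DemoTask a = new DemoTask("a", true);
        DemoTask b = new DemoTask("b", true);
        queue.add(a);
        queue.add(b);
        b.canFinish = false;   // b should now be the preferred victim...
        return queue.peek();   // ...but the heap still has a at the head
    }

    /** Removes the task, applies the state change, then re-adds it so it is re-sorted. */
    static DemoTask headAfterReinsert() {
        PriorityQueue<DemoTask> queue = new PriorityQueue<>(VICTIM_ORDER);
        DemoTask a = new DemoTask("a", true);
        DemoTask b = new DemoTask("b", true);
        queue.add(a);
        queue.add(b);
        queue.remove(b);
        b.canFinish = false;
        queue.add(b);          // insertion compares b against a using the new state
        return queue.peek();
    }
}
```

With the in-place update the head is still the finishable task `a`; with remove-and-re-add the head becomes the non-finishable task `b`.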
[jira] [Commented] (HIVE-23443) LLAP speculative task pre-emption seems to be not working
[ https://issues.apache.org/jira/browse/HIVE-23443?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17109387#comment-17109387 ]

Hive QA commented on HIVE-23443:

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} |
|| || || || {color:brown} master Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 11m 34s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 26s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 16s{color} | {color:green} master passed {color} |
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue} 0m 49s{color} | {color:blue} llap-server in master has 88 extant Findbugs warnings. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 17s{color} | {color:green} master passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 0m 28s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 27s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 27s{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} checkstyle {color} | {color:red} 0m 16s{color} | {color:red} llap-server: The patch generated 3 new + 82 unchanged - 0 fixed = 85 total (was 82) {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 1m 0s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 16s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 18s{color} | {color:green} The patch does not generate ASF License warnings. {color} |
| {color:black}{color} | {color:black} {color} | {color:black} 16m 43s{color} | {color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Optional Tests | asflicense javac javadoc findbugs checkstyle compile |
| uname | Linux hiveptest-server-upstream 3.16.0-4-amd64 #1 SMP Debian 3.16.43-2+deb8u5 (2017-09-19) x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /data/hiveptest/working/yetus_PreCommit-HIVE-Build-22408/dev-support/hive-personality.sh |
| git revision | master / 5c9fa2a |
| Default Java | 1.8.0_111 |
| findbugs | v3.0.1 |
| checkstyle | http://104.198.109.242/logs//PreCommit-HIVE-Build-22408/yetus/diff-checkstyle-llap-server.txt |
| modules | C: llap-server U: llap-server |
| Console output | http://104.198.109.242/logs//PreCommit-HIVE-Build-22408/yetus.txt |
| Powered by | Apache Yetus http://yetus.apache.org |

This message was automatically generated.
[jira] [Commented] (HIVE-23443) LLAP speculative task pre-emption seems to be not working
[ https://issues.apache.org/jira/browse/HIVE-23443?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17108798#comment-17108798 ]

Prasanth Jayachandran commented on HIVE-23443:

[~pgaref] The non-finishable to finishable transition is not a problem. However, the line you flagged in the PR does raise a concern: a task could be added to the pre-emption queue twice (or more), and I was able to reproduce that in a unit test. Could you look at the diff in the PR again?
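The double-addition concern can be illustrated in isolation. This is a hedged sketch with hypothetical names (`DuplicateAddDemo`, `addOnce`), using `java.util.PriorityQueue` as a stand-in rather than Hive's actual pre-emption queue: the JDK queue accepts the same element twice, so an update path that re-adds a task without first removing it leaves two entries behind.

```java
import java.util.PriorityQueue;

class DuplicateAddDemo {
    /** Adds the same attempt twice with no guard; the queue keeps both entries. */
    static int sizeAfterNaiveAdds() {
        PriorityQueue<String> queue = new PriorityQueue<>();
        String attempt = "attempt_x"; // hypothetical attempt id
        queue.add(attempt);
        queue.add(attempt);
        return queue.size();
    }

    /** Adds the same attempt twice through the guard; only one entry remains. */
    static int sizeAfterGuardedAdds() {
        PriorityQueue<String> queue = new PriorityQueue<>();
        String attempt = "attempt_x";
        addOnce(queue, attempt);
        addOnce(queue, attempt);
        return queue.size();
    }

    /** Removes any existing entry before adding, so the item appears at most once. */
    static <T> void addOnce(PriorityQueue<T> queue, T item) {
        queue.remove(item); // returns false and changes nothing if the item is absent
        queue.add(item);
    }
}
```

The guard costs a linear scan per call, which is usually acceptable for a queue bounded by the number of executor slots; that sizing is an assumption here, not something the thread states.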
[jira] [Commented] (HIVE-23443) LLAP speculative task pre-emption seems to be not working
[ https://issues.apache.org/jira/browse/HIVE-23443?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17108523#comment-17108523 ]

Panagiotis Garefalakis commented on HIVE-23443:

Hey [~prasanth_j], the latest changes LGTM. My only concern is whether there can be a case where a Guaranteed task changes from non-finishable to finishable while it is only in the preemptionQueue; in that case the task would be left hanging. It seems that neither the older nor the latest changes handle that scenario.
[jira] [Commented] (HIVE-23443) LLAP speculative task pre-emption seems to be not working
[ https://issues.apache.org/jira/browse/HIVE-23443?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17108150#comment-17108150 ]

Hive QA commented on HIVE-23443:

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/13002973/HIVE-23443.2.patch

{color:green}SUCCESS:{color} +1 due to 1 test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 1 failed/errored test(s), 17270 tests executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestMiniDruidCliDriver.testCliDriver[druid_materialized_view_rewrite_ssb] (batchId=128)
{noformat}

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/22352/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/22352/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-22352/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.YetusPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 1 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 13002973 - PreCommit-HIVE-Build
[jira] [Commented] (HIVE-23443) LLAP speculative task pre-emption seems to be not working
[ https://issues.apache.org/jira/browse/HIVE-23443?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17108124#comment-17108124 ]

Hive QA commented on HIVE-23443:

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} |
|| || || || {color:brown} master Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 11m 22s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 30s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 15s{color} | {color:green} master passed {color} |
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue} 0m 52s{color} | {color:blue} llap-server in master has 87 extant Findbugs warnings. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 18s{color} | {color:green} master passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 0m 29s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 29s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 29s{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} checkstyle {color} | {color:red} 0m 17s{color} | {color:red} llap-server: The patch generated 1 new + 82 unchanged - 0 fixed = 83 total (was 82) {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 0m 59s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 18s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 16s{color} | {color:green} The patch does not generate ASF License warnings. {color} |
| {color:black}{color} | {color:black} {color} | {color:black} 16m 34s{color} | {color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Optional Tests | asflicense javac javadoc findbugs checkstyle compile |
| uname | Linux hiveptest-server-upstream 3.16.0-4-amd64 #1 SMP Debian 3.16.43-2+deb8u5 (2017-09-19) x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /data/hiveptest/working/yetus_PreCommit-HIVE-Build-22352/dev-support/hive-personality.sh |
| git revision | master / 390ad7d |
| Default Java | 1.8.0_111 |
| findbugs | v3.0.1 |
| checkstyle | http://104.198.109.242/logs//PreCommit-HIVE-Build-22352/yetus/diff-checkstyle-llap-server.txt |
| modules | C: llap-server U: llap-server |
| Console output | http://104.198.109.242/logs//PreCommit-HIVE-Build-22352/yetus.txt |
| Powered by | Apache Yetus http://yetus.apache.org |

This message was automatically generated.
[jira] [Commented] (HIVE-23443) LLAP speculative task pre-emption seems to be not working
[ https://issues.apache.org/jira/browse/HIVE-23443?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17107565#comment-17107565 ]

Prasanth Jayachandran commented on HIVE-23443:

I was able to reproduce the issue with a unit test and included it in the .2 patch.

[~pgaref] The guaranteed-state update is a hairy piece to touch for now, so I am not doing it in this ticket. The .2 patch is the same as .1 with added JUnit tests. Could you please take a look?
[jira] [Commented] (HIVE-23443) LLAP speculative task pre-emption seems to be not working
[ https://issues.apache.org/jira/browse/HIVE-23443?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17107490#comment-17107490 ]

Prasanth Jayachandran commented on HIVE-23443:

Created HIVE-23472 to handle the guaranteed state update, which is tied to WLM. For now I am keeping the WLM issue separate; it will be handled in HIVE-23472. In this ticket I will handle only the finishable state updates. I will add more unit tests to the .1 patch.
[jira] [Commented] (HIVE-23443) LLAP speculative task pre-emption seems to be not working
[ https://issues.apache.org/jira/browse/HIVE-23443?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17105809#comment-17105809 ]

Gopal Vijayaraghavan commented on HIVE-23443:

bq. 1) If guaranteed or finishable, the task should not be in pre-emption queue

This looks suspiciously like the pre-WLM model (which we know works without deadlocks), but there is no way to fix pool-based preemption if a speculative finishable task takes over a whole cluster.
[jira] [Commented] (HIVE-23443) LLAP speculative task pre-emption seems to be not working
[ https://issues.apache.org/jira/browse/HIVE-23443?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17105308#comment-17105308 ]

Panagiotis Garefalakis commented on HIVE-23443:

Hey [~prasanth_j], the described logic makes sense to me. We should probably write it down in capital letters somewhere in the TaskExecutorService class :)

Anyway, I left some comments in the PR; let me know what you think.
[jira] [Commented] (HIVE-23443) LLAP speculative task pre-emption seems to be not working
[ https://issues.apache.org/jira/browse/HIVE-23443?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17105031#comment-17105031 ]

Prasanth Jayachandran commented on HIVE-23443:

[~gopalv]/[~pgaref] I simplified the pre-emption queue handling to the following two conditions:

1) If guaranteed or finishable, the task should not be in the pre-emption queue.
2) If speculative or non-finishable, the task should be in the pre-emption queue.

I hope I am not missing any other conditions. Could you please take another look?

[~pgaref] I changed the test cases based on the above conditions. Let me know if I missed any case.
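The two conditions above can be written down as predicates to see which task states each one covers. This is only a literal reading of the comment, sketched with hypothetical names (`QueueRuleDemo`, `rule1KeepsOut`, `rule2PutsIn`, `verdict`), and it assumes "speculative" simply means "not guaranteed"; it says nothing about how the patch resolves the combinations where both conditions match.

```java
class QueueRuleDemo {
    /** Condition 1 as worded: guaranteed or finishable means not in the pre-emption queue. */
    static boolean rule1KeepsOut(boolean guaranteed, boolean canFinish) {
        return guaranteed || canFinish;
    }

    /** Condition 2 as worded: speculative (assumed: not guaranteed) or non-finishable means in the queue. */
    static boolean rule2PutsIn(boolean guaranteed, boolean canFinish) {
        return !guaranteed || !canFinish;
    }

    /** Reports which condition(s) match a given task state. */
    static String verdict(boolean guaranteed, boolean canFinish) {
        boolean out = rule1KeepsOut(guaranteed, canFinish);
        boolean in = rule2PutsIn(guaranteed, canFinish);
        if (out && in) {
            return "both conditions match";
        }
        return in ? "in queue" : "not in queue";
    }
}
```

Under this reading, guaranteed and finishable tasks stay out, speculative and non-finishable tasks go in, and the two mixed states (guaranteed but non-finishable, speculative but finishable) match both conditions, so an implementation has to pick a precedence for them.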
[jira] [Commented] (HIVE-23443) LLAP speculative task pre-emption seems to be not working
[ https://issues.apache.org/jira/browse/HIVE-23443?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17104978#comment-17104978 ]

Hive QA commented on HIVE-23443:

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/13002654/HIVE-23443.1.patch

{color:red}ERROR:{color} -1 due to no test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 1 failed/errored test(s), 17253 tests executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[metadata_only_queries_with_filters] (batchId=101)
{noformat}

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/22275/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/22275/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-22275/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.YetusPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 1 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 13002654 - PreCommit-HIVE-Build
[jira] [Commented] (HIVE-23443) LLAP speculative task pre-emption seems to be not working
[ https://issues.apache.org/jira/browse/HIVE-23443?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17104958#comment-17104958 ]

Hive QA commented on HIVE-23443:

| (/) *+1 overall* |

|| Vote || Subsystem || Runtime || Comment ||
|| || || || Prechecks ||
| +1 | @author | 0m 0s | The patch does not contain any @author tags. |
|| || || || master Compile Tests ||
| +1 | mvninstall | 9m 48s | master passed |
| +1 | compile | 0m 23s | master passed |
| +1 | checkstyle | 0m 13s | master passed |
| 0 | findbugs | 0m 41s | llap-server in master has 87 extant Findbugs warnings. |
| +1 | javadoc | 0m 15s | master passed |
|| || || || Patch Compile Tests ||
| +1 | mvninstall | 0m 24s | the patch passed |
| +1 | compile | 0m 22s | the patch passed |
| +1 | javac | 0m 22s | the patch passed |
| +1 | checkstyle | 0m 13s | the patch passed |
| +1 | whitespace | 0m 0s | The patch has no whitespace issues. |
| +1 | findbugs | 0m 50s | the patch passed |
| +1 | javadoc | 0m 16s | the patch passed |
|| || || || Other Tests ||
| +1 | asflicense | 0m 14s | The patch does not generate ASF License warnings. |
| | | 14m 3s | |

|| Subsystem || Report/Notes ||
| Optional Tests | asflicense javac javadoc findbugs checkstyle compile |
| uname | Linux hiveptest-server-upstream 3.16.0-4-amd64 #1 SMP Debian 3.16.43-2+deb8u5 (2017-09-19) x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /data/hiveptest/working/yetus_PreCommit-HIVE-Build-22275/dev-support/hive-personality.sh |
| git revision | master / ee4daec |
| Default Java | 1.8.0_111 |
| findbugs | v3.0.1 |
| modules | C: llap-server U: llap-server |
| Console output | http://104.198.109.242/logs//PreCommit-HIVE-Build-22275/yetus.txt |
| Powered by | Apache Yetus http://yetus.apache.org |

This message was automatically generated.

> LLAP speculative task pre-emption seems to be not working
> ---------------------------------------------------------
>
> Key: HIVE-23443
> URL: https://issues.apache.org/jira/browse/HIVE-23443
> Project: Hive
> Issue Type: Bug
> Reporter: Prasanth Jayachandran
> Assignee: Prasanth Jayachandran
> Priority: Major
> Attachments: HIVE-23443.1.patch
>
>
> I think after HIVE-23210 we are getting a stable sort order and it is causing
> pre-emption to not work in certain cases.
> {code:java}
> "attempt_1589167813851__119_01_08_0
> (hive_20200511055921_89598f09-19f1-4969-ab7a-82e2dd796273-119/Map 1, started
> at 2020-05-11 05:59:22, in preemption queue, can finish)",
> "attempt_1589167813851_0008_84_01_08_1
> (hive_20200511055928_7ae29ca3-e67d-4d1f-b193-05651023b503-84/Map 1, started
> at 2020-05-11 06:00:23, in preemption queue, can finish)" {code}
> Scheduler only peeks at the pre-emption queue and looks at whether it is
> non-finishable.
> [https://github.com/apache/hive/blob/master/llap-server/src/java/org/apache/hadoop/hive/llap/daemon/impl/TaskExecutorService.java#L420]
> In the above case, all tasks are speculative but state change is not
> triggering pre-emption queue re-ordering so peek() always returns canFinish
> task even though non-finishable tasks are in the queue.

--
This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (HIVE-23443) LLAP speculative task pre-emption seems to be not working
[ https://issues.apache.org/jira/browse/HIVE-23443?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17104926#comment-17104926 ]

Prasanth Jayachandran commented on HIVE-23443:

Good catch. I will update the PR and pull in the test case. Thanks!
[jira] [Commented] (HIVE-23443) LLAP speculative task pre-emption seems to be not working
[ https://issues.apache.org/jira/browse/HIVE-23443?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17104921#comment-17104921 ]

Panagiotis Garefalakis commented on HIVE-23443:

Btw, I tried to capture the scenario in a test case here: [https://github.com/apache/hive/pull/1013/files]. Feel free to use it.
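For readers following the thread, the scenario from the issue description can be reproduced in miniature with a plain `java.util.PriorityQueue`. This is only a sketch under stated assumptions, not the Hive code or the linked test case: the `Task` class, the `PREEMPTION_ORDER` comparator and the task names are hypothetical stand-ins, and the ordering (non-finishable first) is assumed from the description.

```java
import java.util.Comparator;
import java.util.PriorityQueue;

class StalePeekDemo {
    // Hypothetical stand-in for a queued task; not the Hive TaskWrapper class.
    static final class Task {
        final String name;
        boolean canFinish; // mutable state that the ordering depends on

        Task(String name, boolean canFinish) {
            this.name = name;
            this.canFinish = canFinish;
        }
    }

    // Assumed ordering: non-finishable tasks first (false sorts before true),
    // ties broken by name so the order is deterministic.
    static final Comparator<Task> PREEMPTION_ORDER = (x, y) -> {
        int byState = Boolean.compare(x.canFinish, y.canFinish);
        return byState != 0 ? byState : x.name.compareTo(y.name);
    };

    // Returns the name of the head task after one queued task changes state.
    static String headAfterStateChange() {
        PriorityQueue<Task> queue = new PriorityQueue<>(PREEMPTION_ORDER);
        Task a = new Task("a", true);
        Task b = new Task("b", true);
        queue.add(a);
        queue.add(b);

        // b becomes non-finishable, but the queue is never told about it,
        // so the heap keeps the order it had at insertion time.
        b.canFinish = false;

        return queue.peek().name;
    }

    public static void main(String[] args) {
        // Prints "a": the head is still a finishable task although b should sort first.
        System.out.println(headAfterStateChange());
    }
}
```

A scheduler that only inspects `peek()` would conclude here that nothing is pre-emptable, which matches the symptom reported in the issue.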
[jira] [Commented] (HIVE-23443) LLAP speculative task pre-emption seems to be not working
[ https://issues.apache.org/jira/browse/HIVE-23443?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17104915#comment-17104915 ]

Panagiotis Garefalakis commented on HIVE-23443:

Hey [~prasanth_j], I believe the updateFragment method will also need similar changes.
[https://github.com/apache/hive/blob/744e80cd9d52ac3808bde2d8181adddbeb776ed2/llap-server/src/java/org/apache/hadoop/hive/llap/daemon/impl/TaskExecutorService.java#L628]
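The point about updateFragment generalises: any code path that changes a queued task's finishable state has to re-order the queue as well. One common way to do that with a heap-backed queue is to remove the element, change it, and add it back. The sketch below shows that pattern only; it is not taken from the attached patch, and `Task`, `ORDER` and `updateCanFinish` are hypothetical names introduced for illustration.

```java
import java.util.Comparator;
import java.util.PriorityQueue;

class ReorderOnUpdateDemo {
    // Hypothetical stand-in for a queued task; not the Hive TaskWrapper class.
    static final class Task {
        final String name;
        boolean canFinish;

        Task(String name, boolean canFinish) {
            this.name = name;
            this.canFinish = canFinish;
        }
    }

    // Assumed ordering: non-finishable tasks first, ties broken by name.
    static final Comparator<Task> ORDER = (x, y) -> {
        int byState = Boolean.compare(x.canFinish, y.canFinish);
        return byState != 0 ? byState : x.name.compareTo(y.name);
    };

    // Hypothetical helper: take the task out of the queue, change its state,
    // then put it back so the heap places it according to the new state.
    // PriorityQueue is not thread-safe, so concurrent callers would need a lock.
    static void updateCanFinish(PriorityQueue<Task> queue, Task task, boolean canFinish) {
        boolean wasQueued = queue.remove(task); // identity match: Task does not override equals
        task.canFinish = canFinish;
        if (wasQueued) {
            queue.add(task);
        }
    }

    // Returns the name of the head task after a state change that goes through the helper.
    static String headAfterUpdate() {
        PriorityQueue<Task> queue = new PriorityQueue<>(ORDER);
        Task a = new Task("a", true);
        Task b = new Task("b", true);
        queue.add(a);
        queue.add(b);
        updateCanFinish(queue, b, false);
        return queue.peek().name;
    }

    public static void main(String[] args) {
        // Prints "b": the non-finishable task is now at the head of the queue.
        System.out.println(headAfterUpdate());
    }
}
```

Note that `remove(Object)` is a linear scan, so this trades some update cost for a correct `peek()`.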
[jira] [Commented] (HIVE-23443) LLAP speculative task pre-emption seems to be not working
[ https://issues.apache.org/jira/browse/HIVE-23443?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17104908#comment-17104908 ]

Gopal Vijayaraghavan commented on HIVE-23443:

LGTM - +1

Will wait for the scale tests, not just the qtests.
[jira] [Commented] (HIVE-23443) LLAP speculative task pre-emption seems to be not working
[ https://issues.apache.org/jira/browse/HIVE-23443?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17104868#comment-17104868 ]

Prasanth Jayachandran commented on HIVE-23443:

The patch is still pending testing with some workloads where the issue is reproducible. I will update here once it is done. The patch is ready for review though.

cc/ [~gopalv] [~rbalamohan] [~pgaref]