[
https://issues.apache.org/jira/browse/HIVE-24467?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Peter Vary reassigned HIVE-24467:
---------------------------------
Assignee: Xi Chen (was: guojh)
> ConditionalTask remove tasks that not selected exists thread safety problem
> ---------------------------------------------------------------------------
>
> Key: HIVE-24467
> URL: https://issues.apache.org/jira/browse/HIVE-24467
> Project: Hive
> Issue Type: Bug
> Components: Hive
> Affects Versions: 1.1.0, 2.3.4, 3.1.2
> Reporter: guojh
> Assignee: Xi Chen
> Priority: Major
> Labels: pull-request-available
> Fix For: 4.0.0
>
> Time Spent: 1h 20m
> Remaining Estimate: 0h
>
> When Hive executes jobs in parallel (controlled by the “hive.exec.parallel”
> parameter), ConditionalTasks remove their unselected tasks in parallel.
> Because this removal is not thread-safe, some tasks may not be removed from
> the dependent task tree. This is a serious bug: it prevents some stage tasks
> from ever being triggered.
> In our production cluster, a query ran three conditional tasks in parallel.
> After applying the patch from HIVE-21638, we found that Stage-3 was missed and
> never submitted to the runnable list, because its parent Stage-31 was not
> done. But Stage-31 should have been removed, because it was not selected.
> The stage dependencies are below:
> {code:java}
> STAGE DEPENDENCIES:
> Stage-41 is a root stage
> Stage-26 depends on stages: Stage-41
> Stage-25 depends on stages: Stage-26 , consists of Stage-39, Stage-40,
> Stage-2
> Stage-39 has a backup stage: Stage-2
> Stage-23 depends on stages: Stage-39
> Stage-3 depends on stages: Stage-2, Stage-12, Stage-16, Stage-20, Stage-23,
> Stage-24, Stage-27, Stage-28, Stage-31, Stage-32, Stage-35, Stage-36
> Stage-8 depends on stages: Stage-3 , consists of Stage-5, Stage-4, Stage-6
> Stage-5
> Stage-0 depends on stages: Stage-5, Stage-4, Stage-7
> Stage-51 depends on stages: Stage-0
> Stage-4
> Stage-6
> Stage-7 depends on stages: Stage-6
> Stage-40 has a backup stage: Stage-2
> Stage-24 depends on stages: Stage-40
> Stage-2
> Stage-44 is a root stage
> Stage-30 depends on stages: Stage-44
> Stage-29 depends on stages: Stage-30 , consists of Stage-42, Stage-43,
> Stage-12
> Stage-42 has a backup stage: Stage-12
> Stage-27 depends on stages: Stage-42
> Stage-43 has a backup stage: Stage-12
> Stage-28 depends on stages: Stage-43
> Stage-12
> Stage-47 is a root stage
> Stage-34 depends on stages: Stage-47
> Stage-33 depends on stages: Stage-34 , consists of Stage-45, Stage-46,
> Stage-16
> Stage-45 has a backup stage: Stage-16
> Stage-31 depends on stages: Stage-45
> Stage-46 has a backup stage: Stage-16
> Stage-32 depends on stages: Stage-46
> Stage-16
> Stage-50 is a root stage
> Stage-38 depends on stages: Stage-50
> Stage-37 depends on stages: Stage-38 , consists of Stage-48, Stage-49,
> Stage-20
> Stage-48 has a backup stage: Stage-20
> Stage-35 depends on stages: Stage-48
> Stage-49 has a backup stage: Stage-20
> Stage-36 depends on stages: Stage-49
> Stage-20
> {code}
> The stage task execution log is below. Stage-33 is a conditional task that
> consists of Stage-45, Stage-46, and Stage-16. Stage-16 was launched, so
> Stage-45 and Stage-46 should have been removed from the dependency tree.
> Stage-31 is a child of Stage-45 and a parent of Stage-3, so Stage-31 should
> have been removed too. As the log below shows, Stage-31 is still in the
> parent list of Stage-3, which should not happen.
> {code:java}
> 2020-12-03T01:09:50,939 INFO [HiveServer2-Background-Pool: Thread-87372]
> ql.Driver: Launching Job 1 out of 17
> 2020-12-03T01:09:50,940 INFO [HiveServer2-Background-Pool: Thread-87372]
> ql.Driver: Starting task [Stage-26:MAPRED] in parallel
> 2020-12-03T01:09:50,941 INFO [HiveServer2-Background-Pool: Thread-87372]
> ql.Driver: Launching Job 2 out of 17
> 2020-12-03T01:09:50,943 INFO [HiveServer2-Background-Pool: Thread-87372]
> ql.Driver: Starting task [Stage-30:MAPRED] in parallel
> 2020-12-03T01:09:50,943 INFO [HiveServer2-Background-Pool: Thread-87372]
> ql.Driver: Launching Job 3 out of 17
> 2020-12-03T01:09:50,943 INFO [HiveServer2-Background-Pool: Thread-87372]
> ql.Driver: Starting task [Stage-34:MAPRED] in parallel
> 2020-12-03T01:09:50,944 INFO [HiveServer2-Background-Pool: Thread-87372]
> ql.Driver: Launching Job 4 out of 17
> 2020-12-03T01:09:50,944 INFO [HiveServer2-Background-Pool: Thread-87372]
> ql.Driver: Starting task [Stage-38:MAPRED] in parallel
> 2020-12-03T01:10:32,946 INFO [HiveServer2-Background-Pool: Thread-87372]
> ql.Driver: Starting task [Stage-29:CONDITIONAL] in parallel
> 2020-12-03T01:10:32,946 INFO [HiveServer2-Background-Pool: Thread-87372]
> ql.Driver: Starting task [Stage-33:CONDITIONAL] in parallel
> 2020-12-03T01:10:32,946 INFO [HiveServer2-Background-Pool: Thread-87372]
> ql.Driver: Starting task [Stage-37:CONDITIONAL] in parallel
> 2020-12-03T01:10:34,946 INFO [HiveServer2-Background-Pool: Thread-87372]
> ql.Driver: Launching Job 5 out of 17
> 2020-12-03T01:10:34,947 INFO [HiveServer2-Background-Pool: Thread-87372]
> ql.Driver: Starting task [Stage-16:MAPRED] in parallel
> 2020-12-03T01:10:34,948 INFO [HiveServer2-Background-Pool: Thread-87372]
> ql.Driver: Launching Job 6 out of 17
> 2020-12-03T01:10:34,948 INFO [HiveServer2-Background-Pool: Thread-87372]
> ql.Driver: Starting task [Stage-12:MAPRED] in parallel
> 2020-12-03T01:10:34,949 INFO [HiveServer2-Background-Pool: Thread-87372]
> ql.Driver: Launching Job 7 out of 17
> 2020-12-03T01:10:34,950 INFO [HiveServer2-Background-Pool: Thread-87372]
> ql.Driver: Starting task [Stage-20:MAPRED] in parallel
> 2020-12-03T01:10:34,950 INFO [HiveServer2-Background-Pool: Thread-87372]
> ql.Driver: Starting task [Stage-25:CONDITIONAL] in parallel
> 2020-12-03T01:10:36,950 INFO [HiveServer2-Background-Pool: Thread-87372]
> ql.Driver: Launching Job 8 out of 17
> 2020-12-03T01:10:36,951 INFO [HiveServer2-Background-Pool: Thread-87372]
> ql.Driver: Starting task [Stage-2:MAPRED] in parallel
> 2020-12-01T22:20:17,774 INFO [HiveServer2-Background-Pool: Thread-233156]
> ql.Driver: Task:Stage-3:MAPRED Parent:Stage-31:MAPRED isDone:false
> 2020-12-01T22:20:17,774 ERROR [HiveServer2-Background-Pool: Thread-233156]
> ql.Driver: Miss stage: Stage-3for queryid
> hive_20201201221740_805852c0-60a7-4141-96e9-196f83b2705e
> 2020-12-01T22:20:17,774 ERROR [HiveServer2-Background-Pool: Thread-233156]
> ql.Driver: Miss stage for queryid
> hive_20201201221740_805852c0-60a7-4141-96e9-196f83b2705e : FAILED: Some
> Execute Stage miss error
> {code}
>
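For illustration only, here is a minimal sketch of the pruning described above and one way to make it safe when several conditional tasks finish at the same time. It is not Hive's code and not the patch attached to this issue: the `Node` class, `pruneUnselected`, `GRAPH_LOCK`, and `demo` are hypothetical names, and the single global lock is an assumed remedy chosen for simplicity. The graph is a reduced version of the plan in the report (only the three conditional tasks that ran in parallel, Stage-29, Stage-33, and Stage-37, are modeled).

```java
import java.util.ArrayList;
import java.util.List;

public class ConditionalPruneSketch {

    /** Hypothetical stand-in for a task in the stage graph (not Hive's Task class). */
    static final class Node {
        final String id;
        final List<Node> parents = new ArrayList<>();
        final List<Node> children = new ArrayList<>();

        Node(String id) {
            this.id = id;
        }

        void addChild(Node child) {
            children.add(child);
            child.parents.add(this);
        }
    }

    // Assumption: one lock guards every structural edit of the graph, so two
    // conditional tasks can never modify the same parent list concurrently.
    private static final Object GRAPH_LOCK = new Object();

    /** Removes an unselected branch, and any descendants left without parents. */
    static void pruneUnselected(Node unselected) {
        synchronized (GRAPH_LOCK) {
            detachFromChildren(unselected);
        }
    }

    private static void detachFromChildren(Node node) {
        for (Node child : node.children) {
            child.parents.remove(node);
            // A child with no remaining parents can never run; prune it as well.
            if (child.parents.isEmpty()) {
                detachFromChildren(child);
            }
        }
    }

    /** Prunes six unselected branches concurrently; returns Stage-3's remaining parents. */
    static List<String> demo() {
        Node stage3 = new Node("Stage-3");
        List<Node> unselected = new ArrayList<>();
        // {unselected branch, its child}; each child is also a parent of Stage-3.
        int[][] pairs = {{42, 27}, {43, 28}, {45, 31}, {46, 32}, {48, 35}, {49, 36}};
        for (int[] pair : pairs) {
            Node branch = new Node("Stage-" + pair[0]);
            Node middle = new Node("Stage-" + pair[1]);
            branch.addChild(middle);
            middle.addChild(stage3);
            unselected.add(branch);
        }
        // The selected backup stages stay in the graph as parents of Stage-3.
        for (int id : new int[] {12, 16, 20}) {
            new Node("Stage-" + id).addChild(stage3);
        }

        List<Thread> threads = new ArrayList<>();
        for (Node branch : unselected) {
            Thread t = new Thread(() -> pruneUnselected(branch));
            threads.add(t);
            t.start();
        }
        for (Thread t : threads) {
            try {
                t.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
        }

        List<String> remaining = new ArrayList<>();
        for (Node parent : stage3.parents) {
            remaining.add(parent.id);
        }
        return remaining;
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```

With the lock in place, `demo()` leaves only Stage-12, Stage-16, and Stage-20 as parents of Stage-3. Without the `synchronized` block, the concurrent `remove` calls on the shared `ArrayList` are unsafe, and a pruned parent such as Stage-31 could remain in the list, which is the symptom reported in the log above. A coarse lock is the simplest option here because pruning is rare and short; finer-grained synchronization would also work but is harder to reason about.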