[
https://issues.apache.org/jira/browse/HIVE-25049?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Work on HIVE-25049 started by Panagiotis Garefalakis.
-----------------------------------------------------
> LlapDaemon preemption should not be triggered for same Vertex tasks
> -------------------------------------------------------------------
>
> Key: HIVE-25049
> URL: https://issues.apache.org/jira/browse/HIVE-25049
> Project: Hive
> Issue Type: Sub-task
> Reporter: Panagiotis Garefalakis
> Assignee: Panagiotis Garefalakis
> Priority: Major
> Attachments: lipwig-output3799058789917393195.svg
>
>
> Due to the asynchronous nature of QueryInfo$FinishableState notification, we
> usually end up receiving finishable state updates across tasks/queryfragments
> with some time difference.
> Imagine vertex Map8 with dependency on Map7.
> If Map8 vertex is already running with some tasks still pending, we can
> end-up in a situation where on Map7 completion, some of the pending Map8
> tasks are getting the finishable state update BEFORE the already running Map8
> tasks, ending up preempting tasks for no reason!
> {code:java}
> 2021-04-22T15:30:45.124Z source:Map 7 updated, notifying: [Map 8]
> 2021-04-22T15:30:45.125Z query-executor-0 class="impl.TaskExecutorService"
> level="INFO" thread="IPC Server handler 3 on 25000"] vertex: Map 8 Received
> finishable state update for attempt_1619105382691_0001_1_05_000014_0,
> state=true
> 2021-04-22T15:30:45.125Z query-executor-0
> class="impl.QueryInfo$FinishableStateTracker" level="INFO" thread="IPC Server
> handler 3 on 25000"] Now notifying: Map 8
> 2021-04-22T15:30:45.125Z query-executor-0 class="impl.TaskExecutorService"
> level="INFO" thread="Wait-Queue-Scheduler-0"] Attempting to execute
> TaskWrapper{task=attempt_1619105382691_0001_1_05_000014_0, Vertex=Map 8,
> inWaitQueue=true, inPreemptionQueue=false, registeredForNotifications=true,
> canFinish=true, canFinish(in
> 2021-04-22T15:30:45.126Z query-executor-0 class="impl.TaskExecutorService"
> level="INFO" thread="Wait-Queue-Scheduler-0"] Task
> TaskWrapper{task=attempt_1619105382691_0001_1_05_000014_0, Vertex=Map 8,
> inWaitQueue=true, inPreemptionQueue=false, registeredForNotifications=true,
> canFinish=true, canFinish(in queue)=true, isGuaranteed=false,
> firstAttemptStartTime=1619105437749, dagStartTime=1619105422608,
> withinDagPriority=74, vertexParallelism= 232, selfAndUpstreamParallelism=
> 256, selfAndUpstreamComplete= 17} managed to preempt task
> TaskWrapper{task=attempt_1619105382691_0001_1_05_000006_0, Vertex=Map 8,
> inWaitQueue=false, inPreemptionQueue=true, registeredForNotifications=true,
> canFinish=true, canFinish(in queue)=false, isGuaranteed=false,
> firstAttemptStartTime=1619105437737, dagStartTime=1619105422608,
> withinDagPriority=74, vertexParallelism= 232, selfAndUpstreamParallelism=
> 256, selfAndUpstreamComplete= 15}
> 2021-04-22T15:30:45.126Z query-executor-0 class="impl.TaskExecutorService"
> level="INFO" thread="Wait-Queue-Scheduler-0"] Invoking kill task for
> attempt_1619105382691_0001_1_05_000006_0 due to pre-emption to run
> attempt_1619105382691_0001_1_05_000014_0
> 2021-04-22T15:30:45.126Z query-executor-0
> class="impl.QueryInfo$FinishableStateTracker" level="INFO" thread="IPC Server
> handler 3 on 25000"] Now notifying: Map 8
> 2021-04-22T15:30:45.126Z query-executor-0 class="impl.TaskExecutorService"
> level="INFO" thread="IPC Server handler 3 on 25000"] vertex: Map 8 Received
> finishable state update for attempt_1619105382691_0001_1_05_000011_0,
> state=true
> 2021-04-22T15:30:45.127Z query-executor-0 class="impl.TaskRunnerCallable"
> level="INFO" thread="Wait-Queue-Scheduler-0"] Kill task requested for
> id=attempt_1619105382691_0001_1_05_000006_0, taskRunnerSetup=true
> 2021-04-22T15:30:45.127Z query-executor-0 class="impl.TaskRunnerCallable"
> level="INFO" thread="Wait-Queue-Scheduler-0"] Issuing kill to task
> attempt_1619105382691_0001_1_05_000006_0
> 2021-04-22T15:30:45.127Z query-executor-0
> class="impl.QueryInfo$FinishableStateTracker" level="INFO" thread="IPC Server
> handler 3 on 25000"] Now notifying: Map 8
> 2021-04-22T15:30:45.127Z query-executor-0 class="task.TezTaskRunner2"
> level="INFO" thread="Wait-Queue-Scheduler-0"] Attempting to abort
> attempt_1619105382691_0001_1_05_000006_0 due to an invocation of killTask
> 2021-04-22T15:30:45.128Z query-executor-0 class="tez.TezProcessor"
> level="INFO" thread="Wait-Queue-Scheduler-0"] Received abort
> 2021-04-22T15:30:45.128Z query-executor-0 class="tez.TezProcessor"
> level="INFO" thread="Wait-Queue-Scheduler-0"] Forwarding abort to
> RecordProcessor
> 2021-04-22T15:30:45.128Z query-executor-0
> class="runtime.LogicalIOProcessorRuntimeTask"
> dagId="dag_1619105382691_0001_1"
> fragmentId="1619105382691_0001_1_05_000011_0" level="INFO"
> queryId="hive_20210422153013_397b96bf-d5a6-493a-9c51-9446f64eeed4"
> thread="TezTR-382691_1_1_5_11_0"] Waiting for 1 initializers to finish
> 2021-04-22T15:30:45.128Z query-executor-0 class="tez.MapRecordProcessor"
> level="INFO" thread="Wait-Queue-Scheduler-0"] Forwarding abort to mapOp: {}
> MAP
> 2021-04-22T15:30:45.128Z query-executor-0 class="vector.VectorMapOperator"
> level="INFO" thread="Wait-Queue-Scheduler-0"] Received abort in operator: MAP
> 2021-04-22T15:30:45.128Z query-executor-0 class="impl.TaskRunnerCallable"
> level="INFO" thread="Wait-Queue-Scheduler-0"] Kill request for task
> attempt_1619105382691_0001_1_05_000006_0 completed. Informing AM
> 2021-04-22T15:30:45.128Z query-executor-0 class="impl.TaskExecutorService"
> level="INFO" thread="IPC Server handler 3 on 25000"] vertex: Map 8 Received
> finishable state update for attempt_1619105382691_0001_1_05_000006_0,
> state=true
> {code}
--
This message was sent by Atlassian Jira
(v8.3.4#803005)