[ 
https://issues.apache.org/jira/browse/HIVE-28112?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

László Bodor reassigned HIVE-28112:
-----------------------------------

    Assignee: László Bodor

> Clear dagId from MDC/NDC when re-executing the query with new dagId
> -------------------------------------------------------------------
>
>                 Key: HIVE-28112
>                 URL: https://issues.apache.org/jira/browse/HIVE-28112
>             Project: Hive
>          Issue Type: Bug
>            Reporter: László Bodor
>            Assignee: László Bodor
>            Priority: Major
>
> 1. The DAG fails (dagId dag_1709708735265_0007_94, 
> queryId hive_20240307162529_5ae040ec-7d46-4d79-9730-f4cd3f184ced):
> {code}
> <14>1 2024-03-07T16:29:54.292Z hiveserver2-0 hiveserver2 1 
> 87c9f023-150f-4923-b044-3b4207506951 [mdc@18060 class="client.DAGClientImpl" 
> dagId="dag_1709708735265_0007_94" level="INFO" operationLogLevel="EXECUTION" 
> queryId="hive_20240307162529_5ae040ec-7d46-4d79-9730-f4cd3f184ced" 
> sessionId="0ac1f6bf-98ea-4442-bb38-f5da5a36aeab" 
> thread="HiveServer2-Background-Pool: Thread-1432633"] DAG completed. 
> FinalState=FAILED
> {code}
> 2. The lost-AM plugin (ReExecuteLostAMQueryPlugin) decides to re-execute:
> {code}
> <14>1 2024-03-07T16:29:54.301Z hiveserver2-0 hiveserver2 1 
> 87c9f023-150f-4923-b044-3b4207506951 [mdc@18060 
> class="reexec.ReExecuteLostAMQueryPlugin" dagId="dag_1709708735265_0007_94" 
> level="INFO" operationLogLevel="EXECUTION" 
> queryId="hive_20240307162529_5ae040ec-7d46-4d79-9730-f4cd3f184ced" 
> sessionId="0ac1f6bf-98ea-4442-bb38-f5da5a36aeab" 
> thread="HiveServer2-Background-Pool: Thread-1432633"] Got exception message: 
> AM record not found (likely died) in zookeeper for application id: 
> application_1709708735265_0007 retryPossible: true
> {code}
> 3. Messages that belong to the new execution (when there is no DAG at all 
> yet) still show the last dagId, which is confusing, e.g.:
> {code}
> <14>1 2024-03-07T16:29:54.348Z hiveserver2-0 hiveserver2 1 
> 87c9f023-150f-4923-b044-3b4207506951 [mdc@18060 class="ql.Driver" 
> dagId="dag_1709708735265_0007_94" level="INFO" operationLogLevel="EXECUTION" 
> queryId="hive_20240307162529_5ae040ec-7d46-4d79-9730-f4cd3f184ced" 
> sessionId="0ac1f6bf-98ea-4442-bb38-f5da5a36aeab" 
> thread="HiveServer2-Background-Pool: Thread-1432633"] Compiling command(...
> {code}
> While a query is compiling, no dag id is supposed to be present at all: the 
> dag id is obtained from the AM only upon DAG submission.
> The new dag id will correspond to the same Hive query id, so the Hive query 
> id can be used to keep the connection between query attempts.
> The last message for which the failed (previous) dag id still makes sense is:
> {code}
> <14>1 2024-03-07T16:29:54.309Z hiveserver2-0 hiveserver2 1 
> 87c9f023-150f-4923-b044-3b4207506951 [mdc@18060 class="reexec.ReExecDriver" 
> dagId="dag_1709708735265_0007_94" level="INFO" operationLogLevel="EXECUTION" 
> queryId="hive_20240307162529_5ae040ec-7d46-4d79-9730-f4cd3f184ced" 
> sessionId="0ac1f6bf-98ea-4442-bb38-f5da5a36aeab" 
> thread="HiveServer2-Background-Pool: Thread-1432633"] Preparing to re-execute 
> query
> {code} 
> So we might want to remove the dagId from MDC/NDC around this point: 
> "Preparing to re-execute query".
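
A minimal sketch of the cleanup proposed in the description above, not the actual Hive change. To stay self-contained it uses a hypothetical per-thread map (DagIdMdcSketch) as a stand-in for the logging MDC, and prepareForReExecution() is an assumed hook name for the point where "Preparing to re-execute query" is logged. In a real deployment the equivalent call would presumably be org.slf4j.MDC.remove("dagId") or Log4j2's ThreadContext.remove("dagId"), depending on which context the log layout reads.

```java
import java.util.HashMap;
import java.util.Map;

public class DagIdMdcSketch {
    // Hypothetical stand-in for a logging MDC: one key/value map per thread.
    private static final ThreadLocal<Map<String, String>> CONTEXT =
            ThreadLocal.withInitial(HashMap::new);

    public static void put(String key, String value) {
        CONTEXT.get().put(key, value);
    }

    public static void remove(String key) {
        CONTEXT.get().remove(key);
    }

    public static String get(String key) {
        return CONTEXT.get().get(key);
    }

    // Assumed hook, called around "Preparing to re-execute query".
    // Only dagId is dropped; queryId stays so that attempts remain linked.
    public static void prepareForReExecution() {
        remove("dagId");
    }

    public static void main(String[] args) {
        // State of the context after the first (failed) attempt.
        put("queryId", "hive_20240307162529_5ae040ec-7d46-4d79-9730-f4cd3f184ced");
        put("dagId", "dag_1709708735265_0007_94");

        prepareForReExecution();

        // Messages logged during recompilation would no longer carry the stale dagId.
        System.out.println("dagId=" + get("dagId"));
        System.out.println("queryId=" + get("queryId"));
    }
}
```

With this ordering, the "Preparing to re-execute query" message can still be logged with the failed dagId if the removal happens right after it, and a new dagId would be put into the context only once the AM returns one on submission.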



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
