[ 
https://issues.apache.org/jira/browse/HIVE-28617?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

László Bodor reassigned HIVE-28617:
-----------------------------------

    Assignee: László Bodor

> Display MoveTask's duration on the query summary
> ------------------------------------------------
>
>                 Key: HIVE-28617
>                 URL: https://issues.apache.org/jira/browse/HIVE-28617
>             Project: Hive
>          Issue Type: Bug
>      Security Level: Public(Viewable by anyone) 
>            Reporter: László Bodor
>            Assignee: László Bodor
>            Priority: Major
>
> in case of an IOW sometimes the MoveTask takes as much as the original DAG, 
> like:
> {code}
> INFO  : Run DAG                              2125.73s
> ...
> VERTICES: 04/04  [==========================>>] 100%  ELAPSED TIME: 3193.96 s
> {code} 
> I can see this thread in HS2 after the end of DAG:
> {code}
> "HiveServer2-Background-Pool: Thread-33224" #33224 prio=5 os_prio=0 
> cpu=1956.78ms elapsed=3190.58s tid=0x00007f2b92e4e800 nid=0x1a6d9 waiting on 
> condition  [0x00007f2b7c840000]
>    java.lang.Thread.State: WAITING (parking)
>       at jdk.internal.misc.Unsafe.park([email protected]/Native Method)
>       - parking to wait for  <0x00000006015666f0> (a 
> java.util.concurrent.FutureTask)
>       at 
> java.util.concurrent.locks.LockSupport.park([email protected]/LockSupport.java:194)
>       at 
> java.util.concurrent.FutureTask.awaitDone([email protected]/FutureTask.java:447)
>       at 
> java.util.concurrent.FutureTask.get([email protected]/FutureTask.java:190)
>       at org.apache.hadoop.hive.ql.metadata.Hive.moveFile(Hive.java:5214)
>       at 
> org.apache.hadoop.hive.ql.exec.MoveTask.moveFileInDfs(MoveTask.java:237)
>       at org.apache.hadoop.hive.ql.exec.MoveTask.moveFile(MoveTask.java:192)
>       at org.apache.hadoop.hive.ql.exec.MoveTask.execute(MoveTask.java:438)
>       at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:213)
>       at 
> org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:105)
>       at org.apache.hadoop.hive.ql.Executor.launchTask(Executor.java:356)
>       at org.apache.hadoop.hive.ql.Executor.launchTasks(Executor.java:329)
>       at org.apache.hadoop.hive.ql.Executor.runTasks(Executor.java:246)
>       at org.apache.hadoop.hive.ql.Executor.execute(Executor.java:107)
>       at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:809)
>       at org.apache.hadoop.hive.ql.Driver.run(Driver.java:546)
>       at org.apache.hadoop.hive.ql.Driver.run(Driver.java:540)
>       at 
> org.apache.hadoop.hive.ql.reexec.ReExecDriver.run(ReExecDriver.java:190)
>       at 
> org.apache.hive.service.cli.operation.SQLOperation.runQuery(SQLOperation.java:235)
>       at 
> org.apache.hive.service.cli.operation.SQLOperation.access$700(SQLOperation.java:92)
>       at 
> org.apache.hive.service.cli.operation.SQLOperation$BackgroundWork$1.run(SQLOperation.java:340)
>       at java.security.AccessController.doPrivileged([email protected]/Native 
> Method)
>       at javax.security.auth.Subject.doAs([email protected]/Subject.java:423)
>       at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1899)
>       at 
> org.apache.hive.service.cli.operation.SQLOperation$BackgroundWork.run(SQLOperation.java:360)
>       at 
> java.util.concurrent.Executors$RunnableAdapter.call([email protected]/Executors.java:515)
>       at 
> java.util.concurrent.FutureTask.run([email protected]/FutureTask.java:264)
>       at 
> java.util.concurrent.Executors$RunnableAdapter.call([email protected]/Executors.java:515)
>       at 
> java.util.concurrent.FutureTask.run([email protected]/FutureTask.java:264)
>       at 
> java.util.concurrent.ThreadPoolExecutor.runWorker([email protected]/ThreadPoolExecutor.java:1128)
>       at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run([email protected]/ThreadPoolExecutor.java:628)
>       at java.lang.Thread.run([email protected]/Thread.java:829)
> {code}
> in the HS2 log it's displayed like:
> {code}
> hiveserver2 <15>1 2024-11-07T16:09:03.534Z hiveserver2-0 hiveserver2 1 
> 7e9f692b-2bff-487f-92a6-a41312e15850 [mdc@38374 class="log.PerfLogger" 
> dagId="dag_1730992536558_0006_1" level="DEBUG" operationLogLevel="EXECUTION" 
> queryId="hive_20241107151538_c450a4aa-0e0f-445a-a080-45fb12b84e93" 
> sessionId="9c37f653-3e22-4276-9189-cfec9382abc5" 
> thread="HiveServer2-Background-Pool: Thread-33224"] </PERFLOG 
> method=FileMoves start=1730995520789 end=1730995743534 duration=222745 
> from=MoveTask>
> {code}
> I think the summary should be extended to:
> {code}
> INFO  : Query Execution Summary
> INFO  : 
> ----------------------------------------------------------------------------------------------
> INFO  : OPERATION                            DURATION
> INFO  : 
> ----------------------------------------------------------------------------------------------
> INFO  : Compile Query                           0.47s
> INFO  : Prepare Plan                            0.15s
> INFO  : Get Query Coordinator (AM)              0.07s
> INFO  : Submit Plan                             0.01s
> INFO  : Start DAG                               0.01s
> INFO  : Run DAG                              1616.78s
> INFO  : Moving Files                        1000s
> INFO  : 
> ----------------------------------------------------------------------------------------------
> {code}



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

Reply via email to