Mustafa Iman created HIVE-23744:
-----------------------------------

             Summary: Reduce query startup latency
                 Key: HIVE-23744
                 URL: https://issues.apache.org/jira/browse/HIVE-23744
             Project: Hive
          Issue Type: Task
          Components: llap
    Affects Versions: 4.0.0
            Reporter: Mustafa Iman
            Assignee: Mustafa Iman
         Attachments: am_schedule_and_transmit.png, task_start.png

When I run queries with a large number of tasks for a single vertex, I see a 
significant delay before all tasks start executing in the llap daemons. 

Although the llap daemons have free capacity to run the tasks, it takes a 
significant amount of time for the AM to schedule all the tasks and actually 
transmit them to the executors.

"am_schedule_and_transmit.png" shows the scheduling of tasks for TPC-DS query 
55. It shows only the tasks scheduled for one of the 10 llap daemons. The 
scheduler works in a single thread, scheduling tasks one by one, so a delay in 
scheduling one task delays all the tasks queued behind it.

!am_schedule_and_transmit.png|width=831,height=573!
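
The head-of-line effect described above can be illustrated with a small model. 
This is only a sketch and not Hive code: the class SerialSchedulerModel, the 
method scheduledAt, and the per-task costs are all made up for illustration. 
It assumes one thread that handles tasks strictly in order.

```java
import java.util.Arrays;

// Hypothetical model, not Hive code: one thread schedules tasks strictly in order.
final class SerialSchedulerModel {

    /**
     * Returns, for each task, the time (in ms) at which the single scheduler
     * thread finishes scheduling it, given an assumed per-task cost in ms.
     */
    static long[] scheduledAt(long[] costMs) {
        long[] done = new long[costMs.length];
        long clock = 0;
        for (int i = 0; i < costMs.length; i++) {
            clock += costMs[i]; // the thread is busy for the whole cost of task i
            done[i] = clock;    // task i is handed off only after all earlier tasks
        }
        return done;
    }

    public static void main(String[] args) {
        long[] uniform = {1, 1, 1, 1, 1};  // assumed costs, every task takes 1 ms
        long[] oneSlow = {1, 20, 1, 1, 1}; // same, but the second task takes 20 ms
        System.out.println(Arrays.toString(scheduledAt(uniform))); // [1, 2, 3, 4, 5]
        System.out.println(Arrays.toString(scheduledAt(oneSlow))); // [1, 21, 22, 23, 24]
    }
}
```

In this model the single slow task pushes every later task back by the same 
19 ms, which is the pattern a single-threaded scheduler would be expected to 
produce.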

 

Another issue is that it takes a long time to fill all the execution slots in 
the llap daemons, even though they are all empty initially. This is caused by 
LlapTaskCommunicator using a fixed number of threads (10 by default) to send 
the tasks to the daemons. This communication is also synchronized, so these 
threads block on it and stay idle while they wait. "task_start.png" shows 
running tasks on an llap daemon that has 12 execution slots. By the time the 
12th task starts running, more than 100 ms have already passed. That slot 
stays idle all this time. 

!task_start.png|width=1166,height=635!
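
A rough back-of-the-envelope model shows why a fixed sender pool can leave 
slots empty. This is a sketch, not Hive code: SenderPoolModel and its methods 
are hypothetical, and the 5 ms per-send time is an invented figure, not a 
measurement. It assumes each sender thread is occupied for a fixed time per 
task and that tasks are sent in queue order.

```java
// Hypothetical model, not Hive code: a fixed pool of blocking sender threads.
final class SenderPoolModel {

    /**
     * Time (ms) at which the task at zero-based position `index` in the send
     * queue is delivered, if `threads` senders each block for `sendMs` per task.
     */
    static long deliveredAt(int index, int threads, long sendMs) {
        long wave = index / threads; // full waves of sends that must finish first
        return (wave + 1) * sendMs;
    }

    /** Time (ms) until the last execution slot across all daemons gets a task. */
    static long timeToFill(int daemons, int slotsPerDaemon, int threads, long sendMs) {
        int totalSlots = daemons * slotsPerDaemon;
        return deliveredAt(totalSlots - 1, threads, sendMs);
    }

    public static void main(String[] args) {
        long sendMs = 5; // assumed cost of one blocking send, for illustration only
        // 10 daemons with 12 slots each, 10 sender threads (the default noted above)
        System.out.println(timeToFill(10, 12, 10, sendMs));  // 60
        // Same cluster if every slot could be filled concurrently
        System.out.println(timeToFill(10, 12, 120, sendMs)); // 5
    }
}
```

Under these assumptions the time to fill the cluster grows with the number of 
slots divided by the number of sender threads, so more sender threads or 
non-blocking sends would shorten it.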



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
