[ 
https://issues.apache.org/jira/browse/TEZ-4445?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17581623#comment-17581623
 ] 

zhangbutao commented on TEZ-4445:
---------------------------------

PR: https://github.com/apache/tez/pull/238

> Tez task can get stuck when waiting for all initializers on 
> LogicalIOProcessorRuntimeTask:initialize
> ----------------------------------------------------------------------------------------------------
>
>                 Key: TEZ-4445
>                 URL: https://issues.apache.org/jira/browse/TEZ-4445
>             Project: Apache Tez
>          Issue Type: Improvement
>    Affects Versions: 0.10.2
>            Reporter: zhangbutao
>            Assignee: zhangbutao
>            Priority: Major
>         Attachments: 
> Tez-task-stuck-LogicalIOProcessorRuntimeTask-initialize.jpg
>
>
> Cluster environment: Hadoop 3.1.0, Hive 3.1.0, Tez 0.9.2
> In a busy cluster, I find that some Tez tasks can get stuck in 
> LogicalIOProcessorRuntimeTask:initialize, waiting for all initializers to 
> finish. Such a stuck task can cause the entire Tez job to run forever. If I 
> kill the Tez job and resubmit it, the job often runs successfully. Please 
> see the task jstack attachment for more information: 
> _*Tez-task-stuck-LogicalIOProcessorRuntimeTask-initialize.jpg*_
> I have not found the root cause of the task getting stuck, but I 
> think it would be good to add a timeout when waiting for initializers. 
> That way, a stuck task can be interrupted after a certain time, and a new 
> task attempt can be launched immediately.
>  
>  
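
The timeout idea proposed above could look roughly like the following. This
is a minimal sketch, not the change in the linked PR: the class
InitializerTimeoutSketch, the helper awaitInitializers, and the timeoutMs
parameter are hypothetical names, and it assumes the initializers are
submitted through a standard java.util.concurrent.CompletionService. It
waits with poll(timeout, unit) against one overall deadline instead of
blocking indefinitely.

```java
import java.util.concurrent.CompletionService;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorCompletionService;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

// Hypothetical illustration only; not Tez code.
class InitializerTimeoutSketch {

    // Waits for `count` initializer tasks to complete, sharing one overall
    // deadline of `timeoutMs` milliseconds across all of them.
    static void awaitInitializers(CompletionService<Void> cs, int count, long timeoutMs)
            throws InterruptedException, ExecutionException, TimeoutException {
        long deadline = System.nanoTime() + TimeUnit.MILLISECONDS.toNanos(timeoutMs);
        for (int completed = 0; completed < count; completed++) {
            long remaining = Math.max(deadline - System.nanoTime(), 0L);
            // poll() returns null if no task finished within the remaining time.
            Future<Void> done = cs.poll(remaining, TimeUnit.NANOSECONDS);
            if (done == null) {
                throw new TimeoutException("Only " + completed + " of " + count
                        + " initializers finished within " + timeoutMs + " ms");
            }
            // Rethrows (as ExecutionException) any failure from the initializer.
            done.get();
        }
    }

    public static void main(String[] args) throws Exception {
        ExecutorService pool = Executors.newFixedThreadPool(2);
        CompletionService<Void> cs = new ExecutorCompletionService<>(pool);
        cs.submit(() -> null);                                   // healthy initializer
        cs.submit(() -> { Thread.sleep(60_000); return null; }); // simulated stuck initializer
        try {
            awaitInitializers(cs, 2, 200);
            System.out.println("all initializers finished");
        } catch (TimeoutException e) {
            System.out.println("timed out: " + e.getMessage());
        } finally {
            // Interrupts the stuck initializer so the task can fail fast.
            pool.shutdownNow();
        }
    }
}
```

With a timeout like this, the caller can fail the task attempt so that a new
attempt is scheduled, rather than leaving the job hanging. A suitable
timeout value would depend on the workload and is not specified here.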



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
