[ 
https://issues.apache.org/jira/browse/TEZ-3820?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17295237#comment-17295237
 ] 

Prabhu Joseph edited comment on TEZ-3820 at 3/4/21, 12:08 PM:
--------------------------------------------------------------

Thanks [~rohithsharma] for the patch. I have rebased it and tested it on 
tez-0.10 + hadoop-3.2.1. The job hangs with task containers stuck in the ACQUIRED 
state because the CallbackHandler is null in the Tez DAGAppMaster (AM). 
ATSV2HistoryLoggingService creates the TezAMRMClientAsync instance before 
YarnTaskSchedulerService does, with the CallbackHandler set to null.

The following assumption in the patch does not hold:

{code}
    // assumption is at this point AMRMClient is created! So callback handler is set to null
    TezAMRMClientAsyncProvider.createAMRMClientAsync(1000, null)
{code}

Some ways to handle this:

1. YarnTaskSchedulerService creates the instance, and 
ATSV2HistoryLoggingService obtains it once it is available.

2. Move the CallbackHandler code out of YarnTaskSchedulerService into a separate 
class, so that whichever of YarnTaskSchedulerService or ATSV2HistoryLoggingService 
runs first can create a properly initialized instance, instead of 
ATSV2HistoryLoggingService creating one with a null CallbackHandler.
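
To make option 1 concrete, here is a minimal sketch of the idea, not code from the patch. All types in it (CallbackHandler, AmRmClient, AmRmClientProvider, ProviderDemo) are hypothetical stand-ins for the Tez/YARN classes, so it compiles without Hadoop on the classpath. The scheduler side is the only one allowed to create the client, and it must pass a real handler; the history logging side only waits for the instance to become available.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

/** Hypothetical stand-in for the YARN AM-RM callback handler. */
interface CallbackHandler {
    void onContainersAllocated(int count);
}

/** Hypothetical stand-in for TezAMRMClientAsync; rejects a null handler. */
final class AmRmClient {
    private final CallbackHandler handler;

    AmRmClient(CallbackHandler handler) {
        if (handler == null) {
            throw new IllegalArgumentException("CallbackHandler must not be null");
        }
        this.handler = handler;
    }

    CallbackHandler handler() {
        return handler;
    }
}

/** Hypothetical shared provider: one side creates the client, the other waits for it. */
final class AmRmClientProvider {
    private final CompletableFuture<AmRmClient> client = new CompletableFuture<>();

    /** Called by the scheduler service, which owns the real callback handler. */
    AmRmClient create(CallbackHandler handler) {
        AmRmClient created = new AmRmClient(handler);
        // If an instance was already published, keep it and discard the new one.
        return client.complete(created) ? created : client.join();
    }

    /** True once the scheduler service has published the client. */
    boolean isAvailable() {
        return client.isDone();
    }

    /** Called by the history logging service; never creates a client itself. */
    AmRmClient awaitClient(long timeoutMillis) {
        try {
            return client.get(timeoutMillis, TimeUnit.MILLISECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("Interrupted while waiting for the AM-RM client", e);
        } catch (ExecutionException | TimeoutException e) {
            throw new IllegalStateException("AM-RM client was not created in time", e);
        }
    }
}

final class ProviderDemo {
    public static void main(String[] args) {
        AmRmClientProvider provider = new AmRmClientProvider();

        // The history logging service may start first; it finds no client yet.
        System.out.println("available before scheduler start: " + provider.isAvailable());

        // The scheduler service starts later and creates the client with a real handler.
        provider.create(count -> System.out.println("allocated " + count + " containers"));

        // The history logging service now obtains the shared, fully initialized instance.
        AmRmClient shared = provider.awaitClient(1000);
        shared.handler().onContainersAllocated(2);
    }
}
```

In the real AM, the wait would have to fit the service lifecycle (for example, deferring the lookup until the history service first needs the client) so that service start-up order does not cause a deadlock; the sketch leaves that decision open.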






> Plugin to write history events to ATSv2
> ---------------------------------------
>
>                 Key: TEZ-3820
>                 URL: https://issues.apache.org/jira/browse/TEZ-3820
>             Project: Apache Tez
>          Issue Type: Improvement
>            Reporter: Rohith Sharma K S
>            Assignee: Rohith Sharma K S
>            Priority: Major
>         Attachments: TEZ-3884.001.patch
>
>
> YARN Timeline Service v.2 (ATSv2) is the next major iteration of Timeline Server, 
> following v.1 and v.1.5. ATSv2 was created to address two major challenges of 
> v.1, namely scalability and usability. Refer to 
> [ATSv2-doc|https://hadoop.apache.org/docs/current/hadoop-yarn/hadoop-yarn-site/TimelineServiceV2.html#Timeline_Service_v.2_REST_API].
> It would be nice to use ATSv2 for Tez, since it solves the scalability problems. 



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
