[ https://issues.apache.org/jira/browse/SAMZA-1181?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Prateek Maheshwari updated SAMZA-1181:
--------------------------------------
Fix Version/s: 0.13.0
> Fix AppMaster hang after submitting jobs to Yarn
> ------------------------------------------------
>
> Key: SAMZA-1181
> URL: https://issues.apache.org/jira/browse/SAMZA-1181
> Project: Samza
> Issue Type: Bug
> Affects Versions: 0.13.0
> Reporter: Xinyu Liu
> Assignee: Shanthoosh Venkataraman
> Priority: Blocker
> Fix For: 0.13.0
>
>
> Currently, when a job is submitted to Yarn, it hangs after the AppMaster is
> created. The log shows that it hangs while bootstrapping from the
> coordinator stream. Further debugging shows that the job hangs during the
> second bootstrap, while reading locality data from LocalityManager. The
> sequence is as follows:
> 1. JobModelManager creates a CoordinatorStreamConsumer and bootstraps it.
> 2. LocalityManager writes locality info into the coordinator stream.
> 3. JobModelManager closes the CoordinatorStreamConsumer (*).
> 4. Later, LocalityManager bootstraps the CoordinatorStreamConsumer again.
> Step 3 is the problem. Since the CoordinatorStreamConsumer is still held by
> LocalityManager, it must not be closed prematurely. Step 3 was introduced in
> SAMZA-1154 as part of refactoring JobModelManager for the task REST endpoint.
> To fix this issue, we will revert the change that introduced step 3.
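
For illustration only, the four-step sequence can be sketched in plain Java. This is not Samza code: FakeConsumer is a hypothetical stand-in for CoordinatorStreamConsumer, and secondBootstrapSucceeds is an invented helper. The stand-in throws on use after close so that the sketch terminates, whereas the real job was reported to hang.

```java
import java.util.ArrayList;
import java.util.List;

public class SharedConsumerSketch {

    /** Hypothetical stand-in for a shared consumer; NOT the Samza class. */
    static class FakeConsumer {
        private final List<String> messages = new ArrayList<>();
        private boolean closed = false;

        void append(String message) {
            messages.add(message);
        }

        /** Reads everything written so far; fails fast if already closed. */
        List<String> bootstrap() {
            if (closed) {
                // Assumption: fail fast here instead of hanging like the real job.
                throw new IllegalStateException("consumer already closed");
            }
            return new ArrayList<>(messages);
        }

        void close() {
            closed = true;
        }
    }

    /**
     * Replays steps 1-4 against one shared consumer instance.
     * Returns true if the second bootstrap (step 4) completes.
     */
    static boolean secondBootstrapSucceeds(boolean closeAfterFirstBootstrap) {
        FakeConsumer consumer = new FakeConsumer();
        consumer.bootstrap();                 // step 1: first owner bootstraps
        consumer.append("locality-info");     // step 2: locality info is written
        if (closeAfterFirstBootstrap) {
            consumer.close();                 // step 3: premature close by first owner
        }
        try {
            consumer.bootstrap();             // step 4: second owner bootstraps again
            return true;
        } catch (IllegalStateException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println("with step 3:    " + secondBootstrapSucceeds(true));
        System.out.println("without step 3: " + secondBootstrapSucceeds(false));
    }
}
```

In this sketch the second bootstrap only completes when the close in step 3 is skipped, which mirrors the proposed revert: the component that no longer needs the consumer leaves it open for the other holder.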