[ https://issues.apache.org/jira/browse/SAMZA-58?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13958437#comment-13958437 ]
Zhijie Shen commented on SAMZA-58:
----------------------------------
bq. Yeah, I wasn't sure what to do there. YARN doesn't really handle infinitely
running jobs well (or at least it didn't at the time I wrote this). Do you have
any idea what the recommended approach is now?
A MapReduce job computes its progress as follows: 1. the job is divided into
several phases, and each phase has a weight; 2. within the Map or Reduce phase,
the sub-progress = # completed tasks / # total tasks. I'm not sure whether it
is a good idea to let progress = completedTasks / taskCount in
SamzaAppMasterState.
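
To make the weighted scheme above concrete, here is a minimal sketch for illustration only. It is not taken from the MapReduce or Samza code base: the class ProgressSketch, the Phase type, and the weights are all hypothetical names and values chosen for the example.

```java
/** Hypothetical illustration of phase-weighted progress; not MapReduce or Samza code. */
final class ProgressSketch {

    /** One phase of a job: an assumed relative weight plus task counts. */
    static final class Phase {
        final double weight;
        final int completedTasks;
        final int totalTasks;

        Phase(double weight, int completedTasks, int totalTasks) {
            this.weight = weight;
            this.completedTasks = completedTasks;
            this.totalTasks = totalTasks;
        }

        /** Sub-progress = # completed tasks / # total tasks (0 if the phase has no tasks). */
        double subProgress() {
            return totalTasks == 0 ? 0.0 : (double) completedTasks / totalTasks;
        }
    }

    /** Sum of weight * sub-progress over all phases, normalized by the total weight. */
    static double progress(Phase... phases) {
        double totalWeight = 0.0;
        double weighted = 0.0;
        for (Phase p : phases) {
            totalWeight += p.weight;
            weighted += p.weight * p.subProgress();
        }
        return totalWeight == 0.0 ? 0.0 : weighted / totalWeight;
    }

    private ProgressSketch() {}
}
```

With a single phase, this reduces to completedTasks / taskCount. Under that formula, a job whose tasks are never counted as completed would report 0 for its whole lifetime, which is the concern with applying it to an infinitely running job.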
> Use YARN's AMRMClientAsync client library
> -----------------------------------------
>
> Key: SAMZA-58
> URL: https://issues.apache.org/jira/browse/SAMZA-58
> Project: Samza
> Issue Type: Bug
> Components: yarn
> Affects Versions: 0.6.0
> Reporter: Chris Riccomini
> Assignee: Zhijie Shen
> Attachments: SAMZA-58.1.patch
>
>
> YARN 2.2.0 has a nice Async API for clients and AMs. This API didn't exist
> when we did the initial YARN integration for Samza. We should upgrade Samza
> to use these new APIs.
> The API is loosely based off Samza's own AM code, so we can probably strip
> out a lot of it (YarnAppMaster, mainly), and switch everything over to the
> call-back based API.
> For details, see:
> https://issues.apache.org/jira/browse/YARN-417
> This new API is used in DistributedShell now, so we can use that for testing.
--
This message was sent by Atlassian JIRA
(v6.2#6252)