[ 
https://issues.apache.org/jira/browse/SAMZA-58?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13958437#comment-13958437
 ] 

Zhijie Shen commented on SAMZA-58:
----------------------------------

bq. Yeah, I wasn't sure what to do there. YARN doesn't really handle infinitely 
running jobs well (or at least it didn't at the time I wrote this). Do you have 
any idea what the recommended approach is now?

A MapReduce job computes its progress as follows: 1. the job is divided into 
several phases, each with its own weight; 2. within the Map/Reduce phase, the 
sub-progress = # completed tasks / # total tasks. I'm not sure whether it is a 
good idea to let progress = completedTasks / taskCount in SamzaAppMasterState.
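
To make the weighted scheme above concrete, here is a minimal sketch in Java. 
It is not MapReduce's or Samza's actual code: the class ProgressSketch, the 
Phase type, and weightedProgress are hypothetical names, and the weights in 
main are made up for illustration.

```java
import java.util.Arrays;
import java.util.List;

public class ProgressSketch {
    /** One phase of a job (hypothetical type): a weight plus task counts. */
    public static final class Phase {
        final double weight;
        final int completedTasks;
        final int totalTasks;

        public Phase(double weight, int completedTasks, int totalTasks) {
            this.weight = weight;
            this.completedTasks = completedTasks;
            this.totalTasks = totalTasks;
        }

        /** Sub-progress = # completed tasks / # total tasks (0 if no tasks). */
        double subProgress() {
            return totalTasks == 0 ? 0.0 : (double) completedTasks / totalTasks;
        }
    }

    /** Weighted average of the per-phase sub-progress, clamped to [0, 1]. */
    public static float weightedProgress(List<Phase> phases) {
        double totalWeight = 0.0;
        double weighted = 0.0;
        for (Phase p : phases) {
            totalWeight += p.weight;
            weighted += p.weight * p.subProgress();
        }
        if (totalWeight == 0.0) {
            return 0f;
        }
        return (float) Math.max(0.0, Math.min(1.0, weighted / totalWeight));
    }

    public static void main(String[] args) {
        // Assumed weights: two equally weighted phases, the first half done.
        List<Phase> phases = Arrays.asList(
                new Phase(0.5, 4, 8),
                new Phase(0.5, 0, 2));
        System.out.println(weightedProgress(phases));
    }
}
```

With a single phase this reduces to completedTasks / taskCount. For a job 
whose tasks are never expected to complete, that ratio would presumably stay 
at 0 for the lifetime of the job, which is part of why I'm unsure about it.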

> Use YARN's AMRMClientAsync client library
> -----------------------------------------
>
>                 Key: SAMZA-58
>                 URL: https://issues.apache.org/jira/browse/SAMZA-58
>             Project: Samza
>          Issue Type: Bug
>          Components: yarn
>    Affects Versions: 0.6.0
>            Reporter: Chris Riccomini
>            Assignee: Zhijie Shen
>         Attachments: SAMZA-58.1.patch
>
>
> YARN 2.2.0 has a nice Async API for clients and AMs. This API didn't exist 
> when we did the initial YARN integration for Samza. We should upgrade Samza 
> to use these new APIs.
> The API is loosely based off Samza's own AM code, so we can probably strip 
> out a lot of it (YarnAppMaster, mainly), and switch everything over to the 
> call-back based API.
> For details, see:
> https://issues.apache.org/jira/browse/YARN-417
> This new API is used in DistributedShell now, so we can use that for testing.



--
This message was sent by Atlassian JIRA
(v6.2#6252)
