vjagadish1989 opened a new pull request #995: SAMZA-2165: Account for 
coordinator restarts in calls to status
URL: https://github.com/apache/samza/pull/995
 
 
   Currently, the status of a Samza job is determined by a combination of:
   1. Obtain YARN's status for the job by querying the RM
   2. Obtain the AM/coordinator URL for the job
   3. If (1) is "Running", query the coordinator URL to check whether all containers have started
   
   YARN may restart the coordinator between (2) and (3), in which case the old coordinator process may no longer be alive and the query in (3) throws a ConnectException. This causes the status call to fail.
   
   A better way to handle these retriable errors is to return a "New" status from the API, so that applications can continue polling for status.
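
   The proposed behavior can be sketched roughly as follows. This is a minimal illustration and not the code in this pull request: `StatusSketch`, `JobStatus`, `CoordinatorClient`, and `resolveStatus` are hypothetical names, and returning "New" while containers are still starting is an assumption made for the sketch.

```java
import java.net.ConnectException;

public class StatusSketch {
    /** Hypothetical status values; the real Samza type may differ. */
    public enum JobStatus { NEW, RUNNING, SUCCESSFUL_FINISH, UNSUCCESSFUL_FINISH }

    /** Hypothetical stand-in for the HTTP query against the coordinator URL (step 3). */
    public interface CoordinatorClient {
        boolean allContainersStarted(String coordinatorUrl) throws ConnectException;
    }

    /**
     * Combines the YARN status (step 1) with the coordinator query (step 3).
     * A ConnectException is treated as retriable: the coordinator may have been
     * restarted after its URL was obtained (step 2), so "New" is returned and
     * the caller can keep polling.
     */
    public static JobStatus resolveStatus(JobStatus yarnStatus,
                                          String coordinatorUrl,
                                          CoordinatorClient client) {
        if (yarnStatus != JobStatus.RUNNING) {
            // YARN does not report the job as running; no coordinator query needed.
            return yarnStatus;
        }
        try {
            // Assumption: "not all containers started yet" is also reported as NEW.
            return client.allContainersStarted(coordinatorUrl)
                    ? JobStatus.RUNNING
                    : JobStatus.NEW;
        } catch (ConnectException e) {
            // Old coordinator process is gone; report NEW instead of failing the call.
            return JobStatus.NEW;
        }
    }
}
```

   With this shape, a caller that polls until the status is "Running" keeps working across a coordinator restart, because the stale URL yields "New" and not an exception.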

