dnishimura opened a new pull request #1156: SAMZA-2323: Provide option to allow 
single containers to fail without failing the job
URL: https://github.com/apache/samza/pull/1156
 
 
   This PR introduces a new config, 
`cluster-manager.container.fail.job.after.retries`. The default value (`true`) 
preserves the existing behavior. If it is set to `false` and a container fails 
after exhausting all of its retries, the job continues to run in a degraded 
state instead of failing.
   
   This allows healthy containers to keep running while the issue with the 
failed container is being debugged.
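   
   As a minimal sketch, opting in to the new behavior might look like the 
following entry in a job's properties file. Only the key name and its 
`true` default come from this PR; placing it in a `.properties` file is an 
assumption about how the job is configured.

   ```properties
   # Default is true: the job fails once a container exhausts its retries.
   # Setting it to false keeps the job running in a degraded state instead.
   cluster-manager.container.fail.job.after.retries=false
   ```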
   
   @rmatharu @Sanil15 - Please take a look.
   CC: @prateekm 

