Thanks for your comment, Stephan. I have moved it into the doc to keep
discussion history in one place.

On Wed, May 6, 2015 at 1:33 AM, Erb, Stephan
<[email protected]> wrote:
> Hi Maxim,
>
> I am not keen on the potential risk of tasks getting stuck in STARTING. We 
> perform auto-scaling of jobs, so there might be nobody around to notice and 
> correct the problem in time.
>
> How about keeping the initial_interval_secs and just changing its meaning to
> be a grace period, so that health checks are triggered but errors are ignored
> during this interval?
>
> The initial_interval_secs is then a user-configurable upper bound on how long
> a job may take to start working. It can even be set rather high, because it
> won't affect the update performance.
>
> What do you think?
>
> Best Regards,
> Stephan
> ________________________________________
> From: Maxim Khutornenko <[email protected]>
> Sent: Tuesday, May 5, 2015 10:24 PM
> To: [email protected]
> Subject: Health Checks for Updates design review
>
> Hi,
>
> I have put together a design proposal for improving health-enabled job
> update performance. Please review and leave your comments:
>
> https://docs.google.com/document/d/1ZdgW8S4xMhvKW7iQUX99xZm10NXSxEWR0a-21FP5d94/edit
>
> Thanks,
> Maxim
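
For reference, here is a rough sketch of how I read the grace-period
suggestion quoted above: health checks still run during
initial_interval_secs, but a failure only counts once that interval has
elapsed. This is only an illustration, not the executor's actual code. The
class, the max_consecutive_failures field, and the RUNNING/FAILED return
values are names I made up for the example; only initial_interval_secs and
STARTING come from the thread.

```python
from dataclasses import dataclass


@dataclass
class HealthCheckState:
    """Hypothetical tracker for one task (all names except
    initial_interval_secs are assumptions made for this sketch)."""
    initial_interval_secs: float        # grace period: failures are ignored
    max_consecutive_failures: int = 0   # assumed failure budget afterwards
    consecutive_failures: int = 0
    seen_healthy: bool = False

    def record(self, elapsed_secs: float, check_passed: bool) -> str:
        """Process one health check; return STARTING, RUNNING or FAILED."""
        if check_passed:
            # Any successful check marks the task healthy right away, so a
            # large grace period does not slow the update down.
            self.consecutive_failures = 0
            self.seen_healthy = True
            return "RUNNING"
        if elapsed_secs < self.initial_interval_secs:
            # Inside the grace period: the check ran, its failure is ignored.
            return "RUNNING" if self.seen_healthy else "STARTING"
        # Past the grace period: failures count against the budget, so a
        # task that never became healthy cannot stay in STARTING forever.
        self.consecutive_failures += 1
        if self.consecutive_failures > self.max_consecutive_failures:
            return "FAILED"
        return "RUNNING" if self.seen_healthy else "STARTING"
```

With a 30-second grace period, a failed check at 5 seconds leaves the task
in STARTING, a passing check at 10 seconds moves it to RUNNING, and a task
whose first check only happens (and fails) after 30 seconds is marked
FAILED instead of waiting indefinitely.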
