[ 
https://issues.apache.org/jira/browse/TS-3848?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14711545#comment-14711545
 ] 

James Peach commented on TS-3848:
---------------------------------

OK, running with partial cache is a feature. We can improve sending alarms, but 
I do not think it is reasonable to stop serving traffic if a cache disk fails. 
On startup, the most reasonable approaches seem to be to improve sending alarms 
(preferred) or to extend ``wait_for_cache`` with a value to require that all 
disks initialize.

You can also just deal with this in monitoring. There are metrics containing 
timestamps for startup and cache initialization, and also disk failures. If you 
monitor appropriately you can take the right administrative action.
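
As a rough illustration of that kind of monitoring check, here is a minimal 
sketch in Python. It assumes you have already collected a snapshot of metrics 
into a dict by whatever means you use. The metric names below are assumptions 
for illustration (the failed-disk counter name in particular is hypothetical), 
so check them against the metrics your ATS version actually exports.

```python
# Metric names are assumptions; verify them against your ATS version.
START_TIME = "proxy.node.restarts.proxy.start_time"
CACHE_READY_TIME = "proxy.node.restarts.proxy.cache_ready_time"
FAILED_DISKS = "example.cache.failed_disks"  # hypothetical name


def cache_health(metrics, now, grace_seconds=300):
    """Classify cache state from a snapshot of metrics (name -> value).

    Returns one of: "unknown", "starting", "cache_not_ready",
    "degraded", "ok".
    """
    start = int(metrics.get(START_TIME, 0))
    ready = int(metrics.get(CACHE_READY_TIME, 0))
    failed = int(metrics.get(FAILED_DISKS, 0))

    if start == 0:
        # No startup timestamp, so nothing can be concluded.
        return "unknown"
    if ready < start:
        # The cache has not become ready since the last start.
        if now - start > grace_seconds:
            return "cache_not_ready"
        return "starting"
    if failed > 0:
        # The cache is up, but at least one disk was reported as failed.
        return "degraded"
    return "ok"
```

A monitoring system could alarm on "cache_not_ready" and "degraded" and leave 
the administrative action (drain, restart, replace the disk) to the operator.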

> ATS runs without cache or partial cache on disk errors
> ------------------------------------------------------
>
>                 Key: TS-3848
>                 URL: https://issues.apache.org/jira/browse/TS-3848
>             Project: Traffic Server
>          Issue Type: Bug
>          Components: Cache
>            Reporter: Pushkar Pradhan
>            Assignee: Alan M. Carroll
>             Fix For: 6.1.0
>
>
> Problem:
> If ATS fails to initialize one or more disks it continues to run without 
> cache. This can cause origin overload.
> The situation can be somewhat mitigated by setting 
> proxy.config.http.wait_for_cache = 1 and if none of the disks failed to 
> initialize.
> However, even if wait_for_cache = 1 and only one or a few disks failed to 
> initialize, ATS will continue to serve traffic. 
> Proposed Solution:
> Define a new variable: proxy.config.http.cache.required
> Value range: 0-2
> 0 (default) - Do nothing
> 1 - Abort trafficserver if it failed to initialize all the disks/volumes
> 2 - Abort trafficserver if it failed to initialize even one of the disks or 
> volumes.
> If proxy.config.http.cache.required = 1 and proxy.config.http.wait_for_cache 
> = 1 and if proxy.config.http.cache.required > 0 then abort the traffic server 
> if one or more cache disks/volumes could not be initialized.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
