[ 
https://issues.apache.org/jira/browse/TS-3837?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14700421#comment-14700421
 ] 

Sudheer Vinukonda commented on TS-3837:
---------------------------------------

Below is a suggestion from [[email protected]] for a general mechanism to 
handle error scenarios for optional features.

{code}
jpeach
4:02 thinking about the policy of puking if some startup resources are not 
present ...
4:02 see my comment on TS-3837
4:02 I think that alarms could be enhanced to handle this in a general way
4:03 it's always interesting to know when some resource can't be loaded, but it 
is not always appropriate to exit ... or at least that is different for each 
site
sudheerv
4:03 TS-3837
ASFBot
4:03 TS-3837: The setting wait_for_cache waits indefinitely even when there are 
no cache disks configured. - https://issues.apache.org/jira/browse/TS-3837
jpeach
4:03 so if you can know what happened via an alarm, then sites that care can 
take the right action
4:03 eg. log it
4:04 or kill traffic_server or update DNS, or whatever
sudheerv
4:04 yeah, I think a log is probably already generated
4:04 but, I do like the suggestion
4:04 as a general mechanism
jpeach
4:05 I think you'd need to enhance the alarms a bit to make them more useable, 
but you can already exec a script
4:05 and I am pretty uncomfortable with a lot of "proxy.config.exit_on_foo" 
parameters, which is the most likely alternative
sudheerv
4:09 yeah..that's not very clean
4:10 jpeach: one other concern is that logs/alarms are all very good, but, 
dcarlin prefers something proactive (like failing TS)
4:10 rather than having to manually take an action
4:10 it's 1000s of hosts
jpeach
4:10 yeh I think you could do that with an alarm
sudheerv
4:10 cool
jpeach
4:10 so the alarm is a script
sudheerv
4:11 ah, got it
jpeach
4:11 we could define a convention "halt if the script exits with status XXX"
4:11 or the script itself could make traffic_server stop
sudheerv
4:11 yeah
jpeach
4:12 consider even adding a new "alarms.config" file which specifies actions 
for each alarm type
4:12 just some food for thought :)
sudheerv
4:12 yeah, makes sense
4:12 could you pls add the details to the jira
sudheerv
4:13 I haven't used alarms much myself, so, would need to run an example to see 
how it works
4:13 but, i can see what you are suggesting..
jpeach
4:14 I don't think the current alarms implementation is sufficient, but I think 
that is a more promising direction for this sort of thing
sudheerv
4:15 i agree - we should have a set of alarms (fatal events) enumerated for 
each of the various optional features
4:16 and an action associated with each of those
4:16 i worked on a feature like that in my last work too..
4:16 so, each feature would raise that alarm (send an event) on failing to 
initialize or any other error condition it runs into
4:16 the raising alarm part is completely up to the feature..
4:17 but, when it wants to raise it, it'd send the alarm to some central place
4:17 which maps that alarm to a preconfigured action
4:17 that action can be made configurable/reloadable etc depending on the 
feature
4:17 is that what you had in mind?
jpeach
4:18 yeh, more or less in that direction
sudheerv
4:18 the individual features are also open to add more alarms/events that are 
applicable to them
4:18 cool
jpeach
4:18 the key point is to externalize the site policy
sudheerv
4:18 yeah, makes sense
jpeach
4:19 it makes monitoring easier for everyone, since we can create alarm 
programs that send to pagerduty or nagios etc
sudheerv
4:19 sure
4:19 it also allows for a general mechanism to customize actions for different 
features
4:20 rather than having to create multiple config flags
4:20 that part is more appealing to me :)
{code}
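
To make the idea in the chat concrete, here is a minimal sketch of a central dispatcher that maps each alarm to a preconfigured script and applies the "halt if the script exits with status XXX" convention. It is not the existing Traffic Server alarms implementation. The alarm names, the `AlarmDispatcher` class, the `ACTIONS` table, and the choice of exit status 3 are all assumptions made for illustration.

```python
import subprocess
import sys
from typing import Dict, List

# Assumption: exit status 3 stands in for the unspecified "XXX" in the chat.
HALT_STATUS = 3


class AlarmDispatcher:
    """Central place that maps an alarm name to a site-configured command."""

    def __init__(self, actions: Dict[str, List[str]]) -> None:
        self.actions = actions

    def raise_alarm(self, alarm: str, detail: str) -> bool:
        """Run the action for `alarm`; return True if the server should halt."""
        command = self.actions.get(alarm)
        if command is None:
            # No site policy configured for this alarm: nothing to do.
            return False
        # The alarm name and detail are passed to the script as arguments.
        result = subprocess.run(command + [alarm, detail], check=False)
        return result.returncode == HALT_STATUS


# Hypothetical policy table, standing in for an "alarms.config" file.
# Each entry is a command line; real sites would point at their own scripts.
ACTIONS: Dict[str, List[str]] = {
    # This site treats a missing cache as fatal: the script asks for a halt.
    "cache.no_disks": [sys.executable, "-c", "import sys; sys.exit(3)"],
    # This alarm is only logged; the script exits with status 0.
    "ssl.cert_missing": [
        sys.executable,
        "-c",
        "import sys; print('alarm:', sys.argv[1:], file=sys.stderr)",
    ],
}
```

A feature that fails to initialize would call something like `AlarmDispatcher(ACTIONS).raise_alarm("cache.no_disks", "no disks in storage.config")` and stop the server only if the call returns `True`. The site policy lives entirely in the table and the scripts, so no per-feature "exit_on_foo" flag is needed.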

> The setting wait_for_cache waits indefinitely even when there are no cache 
> disks configured.
> --------------------------------------------------------------------------------------------
>
>                 Key: TS-3837
>                 URL: https://issues.apache.org/jira/browse/TS-3837
>             Project: Traffic Server
>          Issue Type: Bug
>          Components: Cache, HTTP
>    Affects Versions: 6.1.0
>            Reporter: Sudheer Vinukonda
>            Assignee: Alan M. Carroll
>             Fix For: 6.1.0
>
>
> The setting *proxy.config.http.wait_for_cache* makes traffic_server 
> wait for the cache to initialize before processing requests (it basically 
> blocks accepts). This is fine when cache is configured, but, if there are no 
> disks configured in *storage.config*, this setting makes requests wait 
> indefinitely. Ideally, the setting should consider cache initialized 
> (disabled) when no disks are configured and just proxy the requests rather 
> than block them forever.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
