I'd still like to get this merged. Avery: are you the current maintainer? I haven't seen Gerrit Pape on the list.
On Tue, Feb 17, 2015 at 4:49 PM, Buck Evan <[email protected]> wrote:
> On Tue, Feb 17, 2015 at 4:20 PM, Avery Payne <[email protected]> wrote:
> >
> > On 2/17/2015 11:02 AM, Buck Evan wrote:
> >>
> >> I think there's only three cases here:
> >>
> >> 1. Users that would have gotten immediate failure, and no amount of
> >> spinning would help. These users will see their error delayed by $SVWAIT
> >> seconds, but no other difference.
> >> 2. Users that would have gotten immediate failure, but could have gotten
> >> a success within $SVWAIT seconds. All of these users will of course be
> >> glad of the change.
> >> 3. Users that would not have gotten immediate failure. None of these
> >> users will see the slightest change in behavior.
> >>
> >> Do you have a particular scenario in mind when you mention "breaking lots
> >> of existing installations elsewhere due to a default behavior change"? I
> >> don't see that there is any case this change would break.
> <snip>
>
> Thanks for the thoughtful reply Avery. My background is also
> "maintaining business software", although putting it in those terms
> gives me horrific visions of java servlets and soap protocols.
>
> > I have to look at it from a viewpoint of "what is everything else in the
> > system expecting when this code is called". This means thinking in terms
> > of code-as-API, so that calls elsewhere don't break.
>
> As a matter of API, sv-check does sometimes take up to $SVWAIT seconds to
> fail.
> Any caller to sv-check will be expecting this (strictly limited)
> delay, in the exceptional case.
> My patch just extends this existing, documented behavior to the
> special case of "unable to open supervise/ok".
> The API is unchanged, just the amount of time to return the result is
> changed.
>
> > This happens because the use of "sv check (child)" follows the
> > convention of "check, and either succeed fast or fail fast", ...
>
> Either you're confused about what sv-check does, or I'm confused about
> what you're saying.
> sv-check generally doesn't fail fast (except in the special case I'm
> trying to make no longer fail fast -- runsv is not started).
> Generally it will spin for $SVWAIT seconds before failing.
>
> > Without that fast-fail, the logged hint never occurs; the sysadmin now
> > has to figure out which of three possible services in a dependency chain
> > are causing the hang.
>
> Even if I put the above issue aside, you wouldn't get a hang,
> you'd get the failure message you're familiar with, just several
> seconds (default: 7) later. The sysadmin wouldn't search any more than
> previously. He would however find that the system fails less often,
> since it has that 7 seconds of tolerance now. This is how sv-check
> behaves already when a ./check script exits nonzero.
>
> >
> > While this is implemented differently from other installations, there
> > are known cases similar to what I am doing, where people have ./run
> > scripts like this:
> >
> > #!/bin/sh
> > sv check child-service || exit 1
> > exec parent-service
>
> This would still work just fine, just strictly more often.
>
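
To make the behavior under discussion concrete, here is a rough shell sketch of "keep retrying for up to $SVWAIT seconds, then fail as before". It is not the patch itself (sv is written in C) and it is not part of runit: wait_for_ok is a made-up helper name, and polling once per second is an assumption chosen only to keep the example short.

```shell
#!/bin/sh
# Hypothetical helper, not part of runit: poll until a path exists or
# the timeout (in seconds) runs out.
wait_for_ok() {
    path=$1
    remaining=${2:-7}   # 7 mirrors the default wait mentioned above
    while [ ! -e "$path" ]; do
        # Timed out: report the same failure as before, just later.
        [ "$remaining" -le 0 ] && return 1
        sleep 1
        remaining=$((remaining - 1))
    done
    return 0
}
```

A ./run script in the style quoted above could call it as `wait_for_ok ./supervise/ok "${SVWAIT:-7}" || exit 1`. If the path already exists the helper returns at once (case 3), if it appears within the window the caller now succeeds (case 2), and if it never appears the failure is only delayed (case 1).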
