On 03/09/2015 18:25, Buck Evan wrote:
> An s6-checkhelper wrapper that implements exactly the above would make me happy enough.
Yes, that's envisionable. I'll think about it.
> If a ./check exists, the framework does the polling for me.
The thing is, the command that does the polling is "sv check". There is currently no equivalent in s6, because the status file, readable via s6-svstat, is supposed to always have an accurate view of the state of the service. I could implement a "s6-svcheck" command that would just run the ./check script a few times, without interacting with the status file or the notification mechanism at all; but that's just a 2-line script loop around ./check, so I never felt there was a point in writing an actual binary doing that. Is that what you want? If it is, I guess I can do it - it's not like it takes time to write.
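For illustration, here is a rough sketch of the kind of loop meant above. It is not an existing s6 command: the function name poll_check and its arguments (number of attempts, delay in seconds) are made up for this example.

```sh
# Hypothetical helper, not part of s6.
# Usage: poll_check ATTEMPTS DELAY COMMAND...
# Runs COMMAND up to ATTEMPTS times, sleeping DELAY seconds after each failure.
# Returns 0 as soon as COMMAND succeeds, 1 if every attempt failed.
poll_check() {
  attempts=$1
  delay=$2
  shift 2
  i=0
  while [ "$i" -lt "$attempts" ]; do
    "$@" && return 0
    i=$((i + 1))
    sleep "$delay"
  done
  return 1
}
```

Something like "poll_check 10 1 ./check", run from the service directory, would then try ./check up to ten times, one second apart, without touching the status file or the notification mechanism.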
> I think it would be implemented to read the notification-fd file and the (new) timeout-start files and do the Right Thing.
No, a "s6-svcheck" command would mirror "sv check" without interacting with the notification system. (And there's no such thing as timeout-start: there's only timeout-finish, and that's for the ./finish script. :))

On the other hand, a "s6-checkwrapper" command, to be used in the ./run script, would poll ./check at service startup time, in order to wait until the service is ready, and inject the result into the readiness notification system, so dependent programs can use "s6-svwait -uwU" as if the service had native notification abilities. It would stop polling once readiness has been reported.

Those are two different functionalities. Which one do you want: s6-svcheck? s6-checkwrapper? Both?
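To make the s6-checkwrapper idea more concrete, here is a minimal sketch of what such a wrapper could do inside a run script. No such command exists at this point, so this is only an assumption about its behaviour: the function name notify_when_ready is invented, the notification descriptor is assumed to be 3 (i.e. ./notification-fd contains "3"), and "mydaemon" is a placeholder.

```sh
# Hypothetical sketch, not an existing s6 tool.
# Usage: notify_when_ready DELAY COMMAND...
# Polls COMMAND every DELAY seconds until it succeeds, then writes a single
# newline to fd 3 (assumed to be the notification fd) and stops polling.
notify_when_ready() {
  delay=$1
  shift
  until "$@"; do
    sleep "$delay"
  done
  echo >&3
}

# Possible use in thisservice/run (assumes ./notification-fd contains 3):
#   notify_when_ready 1 ./check &
#   exec mydaemon
```

A real wrapper would also have to deal with the daemon dying before it ever becomes ready; this sketch just keeps polling in that case.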
> I'd probably define a default value for notification-fd (3?) but if you want to error out when it doesn't exist and check does exist, that's fine too.
./notification-fd just tells s6-supervise to listen for a readiness notification (a newline) from the run script. It's unrelated to the presence of ./check. If ./check exists and ./notification-fd doesn't, it just means your run script can't use s6-checkwrapper and doesn't provide readiness notification; you can still poll the service by running ./check.
> If a service has a ./check script, I'll populate a thisservice-heartbeat sub-service.
Don't automate that: some services may provide a ./check for occasional polling without wanting a heartbeat monitor all day long.
> I'll write '3' to notification-fd if it doesn't exist.
Unless you're going to use s6's readiness notification with s6-svwait -uwU or something of the kind, forget about notification-fd.
> The thisservice-heartbeat will run ./check at some interval and send a notification to notification-fd when it succeeds. (Will sending multiple up-notifications hurt anything?)
It simply won't work, because ./notification-fd is only the number of a file descriptor made available by s6-supervise for its ./run script. You can't access that descriptor outside of thisservice/run. But again, when you have a watchdog that stays there all the time, forget s6's notification mechanism: just rely on the watchdog's output.
> If ./check fails, I want to notify s6 that the service is no longer 'up', and put it into a state where it will be restarted. I'm not sure how I will do that bit.
Anything wrong with "s6-svc -t /service/thisservice" when the heartbeat fails?

-- 
Laurent
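As a rough illustration of the heartbeat idea discussed above, the run script of a thisservice-heartbeat service could be built around a loop like the one below. The function name heartbeat and its arguments are invented for this sketch; the only real s6 command used is "s6-svc -t", which sends SIGTERM to the supervised service.

```sh
# Hypothetical sketch of a heartbeat loop.
# Usage: heartbeat SERVICE_DIR INTERVAL COMMAND...
# Runs COMMAND every INTERVAL seconds for as long as it succeeds.
# On the first failure, sends SIGTERM to the service supervised in
# SERVICE_DIR (s6-supervise then restarts it) and returns.
heartbeat() {
  svcdir=$1
  interval=$2
  shift 2
  while "$@"; do
    sleep "$interval"
  done
  s6-svc -t "$svcdir"
}

# Possible use in thisservice-heartbeat/run:
#   heartbeat /service/thisservice 5 /service/thisservice/check
```

When the function returns, the heartbeat's own run script exits and its supervisor starts it again, so polling resumes. A real version would probably want a grace period after the restart, so that a service which is still starting up does not get killed again right away.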
