2022-05-03 18:30:41 UTC - Brendan Doyle: Has anyone ever faced an issue where
some activations can't init after a VM reboot? Here's the sequence of events:

1. VM reboot; action containers are wiped beforehand when the invoker is stopped
2. Invoker is started
3. Healthcheck activations start
4. It seems like the first set of healthcheck activations hits the issue and
the second set succeeds. The init call to the function container times out
after one minute with a connection-refused response
5. Real traffic starts being sent, but some activations still fail from this
issue for about 2-3 minutes.
6. It resolves on its own after a couple of minutes. It can only be reproduced
with a VM reboot, and it always happens on the first run of the invoker after
a VM reboot, no matter how long you wait. Restarting the invoker after hitting
the issue guarantees it won't happen again. (A simple daemon restart does not
reproduce it either; it has to be a VM restart.)
I don't suspect it's an OpenWhisk bug. It seems to be something to do with
Docker, so I don't expect much help here, but I'm curious whether anyone has
seen this before. We pull all of our runtime containers to the machine when
the invoker is started, before accepting traffic, so I know that is not the
issue. We're also on the latest version of Docker Engine, which I know isn't
technically supported. Still, I'm curious whether anyone knows of something
related to Docker that gets wiped on reboot that I should look into.
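
To narrow down the connection-refused window in step 4, one option is to time
how long a freshly started action container takes to accept TCP connections.
Below is a minimal sketch of such a probe. The helper name `wait_for_port` and
its parameters are hypothetical (not part of OpenWhisk or Docker), and the
container address and port have to be supplied by whoever runs it.

```python
import socket
import time


def wait_for_port(host: str, port: int, timeout: float = 60.0,
                  interval: float = 0.5) -> float:
    """Poll host:port until a TCP connect succeeds; return seconds waited.

    Hypothetical diagnostic helper. Raises TimeoutError if nothing accepts a
    connection on host:port within `timeout` seconds.
    """
    start = time.monotonic()
    while True:
        try:
            # Each attempt is bounded by `interval` so a silent drop
            # cannot stall the loop for long.
            with socket.create_connection((host, port), timeout=interval):
                return time.monotonic() - start
        except OSError as exc:
            # Covers ConnectionRefusedError, per-attempt timeouts,
            # and unreachable-host errors.
            last_error = exc
        if time.monotonic() - start >= timeout:
            raise TimeoutError(
                f"{host}:{port} not reachable after {timeout}s: {last_error}"
            )
        time.sleep(interval)
```

Running this against a container's IP and init port right after the invoker's
first start, and again after an invoker restart, would show whether the port
simply comes up late on the first run or never becomes reachable during the
failure window.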
https://openwhisk-team.slack.com/archives/C3TPCAQG1/p1651602641850569?thread_ts=1651602641.850569&cid=C3TPCAQG1
----
