Hi,

Got things working finally. Pod tolerations were the answer. I added the
following to the dc, and failover now behaves as expected (see the P.S. below
the quoted thread for where this sits in the dc spec and why it fixes the
~5-minute delay):
"tolerations": [ { "effect": "NoExecute", "key": "node.kubernetes.io/not-ready", "operator": "Exists", "tolerationSeconds": 30 }, { "effect": "NoExecute", "key": "node.kubernetes.io/unreachable", "operator": "Exists", "tolerationSeconds": 30 } ] Regards, Marvin On Fri, May 29, 2020 at 12:00 PM Just Marvin < marvin.the.cynical.ro...@gmail.com> wrote: > Hi, > > Further experiments have shown that if we keep a node down, after > about 5 - 6 mins of the node being in the NotReady state, the cluster does > try to terminate the "running" pod, and stand up a new replacement pod. The > pod being terminated just sits there in the "Terminating" state forever, > but the new pod does get into the "Running" state. Which is good. However, > 5+ mins to get to this is too long for us. Is there some knob to turn which > will allow us to decrease that period of time? > > Thanks, > Marvin > > On Fri, May 29, 2020 at 10:32 AM Just Marvin < > marvin.the.cynical.ro...@gmail.com> wrote: > >> Hi, >> >> My problem has gotten a little worse.....I cordoned off a node, >> rebooted it, and when the node came back up, I see that the pods have been >> restarted on the newly rebooted node. This seems like a bug. Any thoughts? >> >> Regards, >> Marvin >> >> On Fri, May 29, 2020 at 6:49 AM Just Marvin < >> marvin.the.cynical.ro...@gmail.com> wrote: >> >>> Hi, >>> >>> I'm working on a demonstration of OpenShift where we are trying to >>> showcase its automatic response to various failure events - a big part of >>> the demo being how it responds to node failure. I've got three worker >>> nodes, and a reboot of one happens in < 2 mins. I would expect to see the >>> pods on the failed / rebooted nodes migrate to the the other two nodes. >>> Instead, what I see is that even when the machine is down, it takes a while >>> for OpenShift to detect the failure, and at that point, I see this: >>> >>> zaphod@oc3027208274 constants]$ oc get pods -l app=ingest-service-poc >>> NAME READY STATUS RESTARTS AGE >>> ingest-service-poc-5656c574df-bbq2v 1/1 Running 0 45h >>> ingest-service-poc-5656c574df-d4t9l 1/1 Running 0 6d15h >>> ingest-service-poc-5656c574df-gmlm7 1/1 Running 0 44h >>> ingest-service-poc-5656c574df-j4rgn 0/1 Error 5 18h >>> ingest-service-poc-5656c574df-kl8vp 1/1 Running 0 45h >>> ingest-service-poc-5656c574df-mcvtb 1/1 Running 6 45h >>> ingest-service-poc-5656c574df-rt24l 1/1 Running 0 45h >>> ingest-service-poc-5656c574df-w99v6 0/1 Error 5 45h >>> ingest-service-poc-5656c574df-ztl6h 1/1 Running 0 44h >>> >>> It looks like the pods are simply being restarted on the same node >>> whenever it comes back from the reboot. The pod does not have a liveness or >>> a readiness probe configured, so this is all driven by events that >>> kubernetes detects from the node failure. Eventually, the pods are all >>> restarted and everything is back to the "Running" state. >>> >>> Is this the right behavior for the cluster? Are there timeouts / >>> frequency of checks for node failure etc that we can tune to show a >>> redistribution of the pods when a node fails? 
>>>
>>> To add to my problems, on some instances, after the node failure, this
>>> is the stable state:
>>>
>>> NAME                                  READY   STATUS             RESTARTS   AGE
>>> ingest-service-poc-5656c574df-bbq2v   1/1     Running            0          45h
>>> ingest-service-poc-5656c574df-d4t9l   1/1     Running            0          6d15h
>>> ingest-service-poc-5656c574df-gmlm7   1/1     Running            0          45h
>>> ingest-service-poc-5656c574df-j4rgn   0/1     CrashLoopBackOff   13         18h
>>> ingest-service-poc-5656c574df-kl8vp   1/1     Running            0          45h
>>> ingest-service-poc-5656c574df-mcvtb   0/1     CrashLoopBackOff   13         45h
>>> ingest-service-poc-5656c574df-rt24l   1/1     Running            0          45h
>>> ingest-service-poc-5656c574df-w99v6   0/1     CrashLoopBackOff   11         45h
>>> ingest-service-poc-5656c574df-ztl6h   1/1     Running            0          45h
>>>
>>> Not sure why this happens, and how we can avoid getting here. Some data
>>> to go with this:
>>>
>>> [zaphod@oc3027208274 constants]$ oc get events | grep ingest-service-poc-5656c574df-j4rgn
>>> 12m     Normal    TaintManagerEviction     pod/ingest-service-poc-5656c574df-j4rgn   Cancelling deletion of Pod irs-dev/ingest-service-poc-5656c574df-j4rgn
>>> 25m     Normal    SandboxChanged           pod/ingest-service-poc-5656c574df-j4rgn   Pod sandbox changed, it will be killed and re-created.
>>> 24m     Normal    Pulling                  pod/ingest-service-poc-5656c574df-j4rgn   Pulling image "image-registry.openshift-image-registry.svc:5000/irs-dev/ingest-service-poc@sha256:616ffb682e356002d89b8f850d43bb8cdf7db3f11e8e1c1141a03e30a05b7b0d"
>>> 24m     Normal    Pulled                   pod/ingest-service-poc-5656c574df-j4rgn   Successfully pulled image "image-registry.openshift-image-registry.svc:5000/irs-dev/ingest-service-poc@sha256:616ffb682e356002d89b8f850d43bb8cdf7db3f11e8e1c1141a03e30a05b7b0d"
>>> 24m     Normal    Created                  pod/ingest-service-poc-5656c574df-j4rgn   Created container ingest-service-poc
>>> 24m     Normal    Started                  pod/ingest-service-poc-5656c574df-j4rgn   Started container ingest-service-poc
>>> 24m     Warning   BackOff                  pod/ingest-service-poc-5656c574df-j4rgn   Back-off restarting failed container
>>> 11m     Normal    SandboxChanged           pod/ingest-service-poc-5656c574df-j4rgn   Pod sandbox changed, it will be killed and re-created.
>>> 11m     Warning   FailedCreatePodSandBox   pod/ingest-service-poc-5656c574df-j4rgn   Failed create pod sandbox: rpc error: code = Unknown desc = failed to create pod network sandbox k8s_ingest-service-poc-5656c574df-j4rgn_irs-dev_28e2831a-a5c6-4ada-9502-71bf88b06a96_5(956d00ab6ccfcbc7397498f3145f5df5872bce3edaed6f813d992152984d5516): Multus: [irs-dev/ingest-service-poc-5656c574df-j4rgn]: error adding container to network "k8s-pod-network": delegateAdd: error invoking conflistAdd - "k8s-pod-network": conflistAdd: error in getting result from AddNetworkList: error adding host side routes for interface: calic2f571d975e, error: route (Ifindex: 41, Dst: 172.30.75.22/32, Scope: 253) already exists for an interface other than 'calic2f571d975e'
>>> 6m52s   Normal    Pulling                  pod/ingest-service-poc-5656c574df-j4rgn   Pulling image "image-registry.openshift-image-registry.svc:5000/irs-dev/ingest-service-poc@sha256:616ffb682e356002d89b8f850d43bb8cdf7db3f11e8e1c1141a03e30a05b7b0d"
>>> 8m51s   Normal    Pulled                   pod/ingest-service-poc-5656c574df-j4rgn   Successfully pulled image "image-registry.openshift-image-registry.svc:5000/irs-dev/ingest-service-poc@sha256:616ffb682e356002d89b8f850d43bb8cdf7db3f11e8e1c1141a03e30a05b7b0d"
>>> 10m     Normal    Created                  pod/ingest-service-poc-5656c574df-j4rgn   Created container ingest-service-poc
>>> 10m     Normal    Started                  pod/ingest-service-poc-5656c574df-j4rgn   Started container ingest-service-poc
>>> 117s    Warning   BackOff                  pod/ingest-service-poc-5656c574df-j4rgn   Back-off restarting failed container
>>> [zaphod@oc3027208274 constants]$
>>>
>>> Any advice on how to avoid getting to this state would be appreciated.
>>>
>>> Regards,
>>> Marvin
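P.S. In case it helps anyone else who finds this thread: the block above goes
under the pod template of the dc, i.e. spec.template.spec.tolerations. Here is
a rough sketch of one way it could be applied (the patch file name is just an
example; the dc name is the one from the pod listings above):

tolerations-patch.json:

{
    "spec": {
        "template": {
            "spec": {
                "tolerations": [
                    { "effect": "NoExecute", "key": "node.kubernetes.io/not-ready", "operator": "Exists", "tolerationSeconds": 30 },
                    { "effect": "NoExecute", "key": "node.kubernetes.io/unreachable", "operator": "Exists", "tolerationSeconds": 30 }
                ]
            }
        }
    }
}

oc patch dc/ingest-service-poc --type=merge -p "$(cat tolerations-patch.json)"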
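This also seems to be what was behind the roughly 5-minute delay I asked about
earlier: as far as I can tell, pods that don't declare these tolerations get
the same two added by default with tolerationSeconds: 300, so nothing is
evicted from a NotReady/unreachable node for about five minutes; setting 30
just shortens that window. To see what a pod actually ended up with, something
like this works:

oc get pod <pod-name> -o jsonpath='{.spec.tolerations}'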