Hi All,
I am using OpenShift 4.2 and trying to delete a DaemonSet
with the 'oc delete ds <ds_name>' command, but it is failing.

We deployed our own container image as a DaemonSet. A preStop hook stops the
running application processes by calling systemctl stop <service_name>.
That service stops all the application processes
that we spawned inside the container.
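
For reference, the hook and the grace period that the kubelet applies can be read back from the DaemonSet spec. The following is only a minimal sketch: the function name show_prestop is hypothetical, and the default namespace is an assumption based on the pod names in the events further down.

```shell
#!/bin/sh
# Hypothetical helper (not part of oc): print the preStop hook(s) and the
# termination grace period configured on a DaemonSet.
# Assumes a logged-in `oc` client; the namespace defaults to "default".
show_prestop() {
    ds_name="$1"
    namespace="${2:-default}"

    echo "preStop hooks:"
    oc get ds "$ds_name" -n "$namespace" \
        -o jsonpath='{.spec.template.spec.containers[*].lifecycle.preStop}'
    echo

    echo "terminationGracePeriodSeconds:"
    oc get ds "$ds_name" -n "$namespace" \
        -o jsonpath='{.spec.template.spec.terminationGracePeriodSeconds}'
    echo
}
```

It would be called as show_prestop <ds_name>, optionally followed by a namespace.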

However, as mentioned above, running 'oc delete ds <daemonset_name>' on the
master node does not kill the pods on the worker nodes. The pods are shown
in the Terminating state on the master node forever, while their containers
keep running on the worker nodes.

I tried deleting the containers manually on the worker node using
crictl rm <container_id>, but that does not remove them.
However, when I run runc kill <full_container_id> 37 (signal 37) on the
worker nodes, the container is killed.
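
The signal 37 workaround described above can be wrapped in a small loop when several containers are stuck. This is a sketch only, not a fix: signal_containers is a hypothetical name, the SIGNAL variable is an illustrative override, and the full container IDs have to be supplied by hand.

```shell
#!/bin/sh
# Hypothetical helper: send a signal to each container whose full ID is
# passed as an argument. The default is 37, which is SIGRTMIN+3 on
# Linux/glibc. Set SIGNAL=<n> to use a different signal number.
signal_containers() {
    sig="${SIGNAL:-37}"
    for cid in "$@"; do
        echo "sending signal $sig to container $cid"
        # Same command as the manual workaround: runc kill <id> <signal>
        sudo runc kill "$cid" "$sig" || echo "runc kill failed for $cid" >&2
    done
}
```

For example, on the worker node: signal_containers <full_container_id_1> <full_container_id_2>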


Expected behavior:
'oc delete ds <daemonset> '

should delete the pods on worker nodes.

Any help with this is highly appreciated.
Looking forward to your reply.

Thanks & Regards,
Ramana

*These are the details:*

*OS version:*
[core@compute-2 ~]$ cat /etc/os-release
NAME="Red Hat Enterprise Linux CoreOS"
VERSION="42.81.20191223.0"
VERSION_ID="4.2"
PRETTY_NAME="Red Hat Enterprise Linux CoreOS 42.81.20191223.0 (Ootpa)"
ID="rhcos"
ID_LIKE="rhel fedora"
ANSI_COLOR="0;31"
HOME_URL="https://www.redhat.com/"
BUG_REPORT_URL="https://bugzilla.redhat.com/"
REDHAT_BUGZILLA_PRODUCT="OpenShift Container Platform"
REDHAT_BUGZILLA_PRODUCT_VERSION="4.2"
REDHAT_SUPPORT_PRODUCT="OpenShift Container Platform"
REDHAT_SUPPORT_PRODUCT_VERSION="4.2"
OSTREE_VERSION=42.81.20191223.0

*crictl version: *
[core@compute-2 ~]$ sudo crictl version
Version:  0.1.0
RuntimeName:  cri-o
RuntimeVersion:  1.14.11-4.dev.rhaos4.2.git179ea6b.el8
RuntimeApiVersion:  v1alpha1

[core@compute-2 ~]$ sudo runc --version
runc version spec: 1.0.1-dev

[core@control-plane-0 ~]$ oc version (on the master node; the client version
is the same on the worker nodes)
Client Version: v4.2.13
Server Version: 4.2.14
Kubernetes Version: v1.14.6+b294fe5

These are the logs collected on the master node after running
'oc delete ds <daemonset_name>':
[core@control-plane-0 ~]$ kubectl get events --sort-by='{.lastTimestamp}'

72m         Warning   FailedKillPod      pod/test-defaultgroup-fg9zm
error killing pod: [failed to "KillContainer" for "test-defaultgroup" with
KillContainerError: "rpc error: code = Unknown desc = failed to stop
container d79f627a1ca96bab4fd060dcd425774e01cc7451304cbb1574c2632afef386a3:
failed to stop container
\"d79f627a1ca96bab4fd060dcd425774e01cc7451304cbb1574c2632afef386a3\":
failed to find process: <nil>"
, failed to "KillPodSandbox" for "1398e7ed-44e9-11ea-b014-005056b87475"
with KillPodSandboxError: "rpc error: code = Unknown desc = failed to stop
container
k8s_test-defaultgroup_test-defaultgroup-fg9zm_default_1398e7ed-44e9-11ea-b014-005056b87475_0
in pod sandbox
1fc87253a559df2341c2af4a8d0d746c92fe8722b715ce3cd21bc3b5e82015d7: failed to
stop container
\"d79f627a1ca96bab4fd060dcd425774e01cc7451304cbb1574c2632afef386a3\":
failed to find process: <nil>"
]
69m         Warning   FailedKillPod      pod/test-defaultgroup-vqrw4
error killing pod: [failed to "KillContainer" for "test-defaultgroup" with
KillContainerError: "rpc error: code = Unknown desc = failed to stop
container c6d141612aa37b424ead130f109298ffcb212653171d43728d610bcb3bdd9073:
failed to stop container
\"c6d141612aa37b424ead130f109298ffcb212653171d43728d610bcb3bdd9073\":
failed to find process: <nil>"
, failed to "KillPodSandbox" for "fba29286-4520-11ea-b014-005056b87475"
with KillPodSandboxError: "rpc error: code = Unknown desc = failed to stop
container
k8s_test-defaultgroup_test-defaultgroup-vqrw4_default_fba29286-4520-11ea-b014-005056b87475_0
in pod sandbox
f42ce723a4ea1560a9eb988b086c759feb3450ed28b551b3f768ddf5a4aca889: failed to
stop container
\"c6d141612aa37b424ead130f109298ffcb212653171d43728d610bcb3bdd9073\":
failed to find process: <nil>"
]
4m53s       Normal    Killing            pod/test-defaultgroup-fg9zm
Stopping container test-defaultgroup
3m22s       Normal    Killing            pod/test-defaultgroup-zxzz9
Stopping container test-defaultgroup
3m19s       Normal    Killing            pod/test-defaultgroup-vqrw4
Stopping container test-defaultgroup
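
To see which containers are affected, the full container IDs can be pulled out of the FailedKillPod messages above. A rough sketch, assuming the event text keeps the "failed to stop container <64-hex-id>" wording shown in these logs; failed_container_ids is a hypothetical helper name.

```shell
#!/bin/sh
# Hypothetical helper: read event text on stdin and print the unique
# 64-character container IDs that follow "stop container". Pod sandbox IDs
# and k8s_* container names are not matched by this pattern.
failed_container_ids() {
    grep -oE 'stop container [\"]*[0-9a-f]{64}' \
        | grep -oE '[0-9a-f]{64}' \
        | sort -u
}
```

Typical use would be: kubectl get events --sort-by='{.lastTimestamp}' | failed_container_ids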

When I try to remove the container manually with the crictl command, I get
the following error:
[core@compute-2 ~]$ sudo crictl rm fe3d97bf8a70b
Removing the container "fe3d97bf8a70b" failed: rpc error: code = Unknown
desc = unable to stop container
fe3d97bf8a70be26327b42aae31fde08231244eefc94599a3bbc1282c4c160e0: failed to
stop container
fe3d97bf8a70be26327b42aae31fde08231244eefc94599a3bbc1282c4c160e0: failed to
stop container
"fe3d97bf8a70be26327b42aae31fde08231244eefc94599a3bbc1282c4c160e0": failed
to find process: <nil>
[core@compute-2 ~]$
_______________________________________________
dev mailing list
dev@lists.openshift.redhat.com
http://lists.openshift.redhat.com/openshiftmm/listinfo/dev
