Hi All,

I am using OpenShift 4.2 and trying to delete a daemonset with the 'oc delete ds <ds_name>' command, but it is failing.
We deployed our own container image as a daemonset. A preStop hook runs systemctl stop <service_name>, which stops all the application processes that we spawned inside the container. However, running 'oc delete ds <daemonset_name>' on the master node does not kill the pods on the worker nodes: the pods stay in the Terminating state forever as seen from the master node, while they are actually still running on the worker nodes.

I tried deleting the containers manually on the worker node using crictl rm <container_id>, but that does not delete them either. However, when I use runc kill <full_container_id> 37 (signal 37) on the worker nodes, it kills the container.

Expected behavior: 'oc delete ds <daemonset_name>' should delete the pods on the worker nodes.

Any help regarding this is highly appreciated. Looking forward to your reply.

Thanks & Regards,
Ramana

*These are the details:*

*OS version:*
[core@compute-2 ~]$ cat /etc/os-release
NAME="Red Hat Enterprise Linux CoreOS"
VERSION="42.81.20191223.0"
VERSION_ID="4.2"
PRETTY_NAME="Red Hat Enterprise Linux CoreOS 42.81.20191223.0 (Ootpa)"
ID="rhcos"
ID_LIKE="rhel fedora"
ANSI_COLOR="0;31"
HOME_URL="https://www.redhat.com/"
BUG_REPORT_URL="https://bugzilla.redhat.com/"
REDHAT_BUGZILLA_PRODUCT="OpenShift Container Platform"
REDHAT_BUGZILLA_PRODUCT_VERSION="4.2"
REDHAT_SUPPORT_PRODUCT="OpenShift Container Platform"
REDHAT_SUPPORT_PRODUCT_VERSION="4.2"
OSTREE_VERSION=42.81.20191223.0

*crictl version:*
[core@compute-2 ~]$ sudo crictl version
Version: 0.1.0
RuntimeName: cri-o
RuntimeVersion: 1.14.11-4.dev.rhaos4.2.git179ea6b.el8
RuntimeApiVersion: v1alpha1

[core@compute-2 ~]$ sudo runc --version
runc version
spec: 1.0.1-dev

[core@control-plane-0 ~]$ oc version
(on the master node; the client version is the same on the worker node)
Client Version: v4.2.13
Server Version: 4.2.14
Kubernetes Version: v1.14.6+b294fe5

The following events were collected on the master node after running 'oc delete ds <daemonset_name>':

[core@control-plane-0 ~]$ kubectl get events --sort-by='{.lastTimestamp}'
72m Warning FailedKillPod pod/test-defaultgroup-fg9zm error killing pod: [failed to "KillContainer" for "test-defaultgroup" with KillContainerError: "rpc error: code = Unknown desc = failed to stop container d79f627a1ca96bab4fd060dcd425774e01cc7451304cbb1574c2632afef386a3: failed to stop container \"d79f627a1ca96bab4fd060dcd425774e01cc7451304cbb1574c2632afef386a3\": failed to find process: <nil>" , failed to "KillPodSandbox" for "1398e7ed-44e9-11ea-b014-005056b87475" with KillPodSandboxError: "rpc error: code = Unknown desc = failed to stop container k8s_test-defaultgroup_test-defaultgroup-fg9zm_default_1398e7ed-44e9-11ea-b014-005056b87475_0 in pod sandbox 1fc87253a559df2341c2af4a8d0d746c92fe8722b715ce3cd21bc3b5e82015d7: failed to stop container \"d79f627a1ca96bab4fd060dcd425774e01cc7451304cbb1574c2632afef386a3\": failed to find process: <nil>" ]
69m Warning FailedKillPod pod/test-defaultgroup-vqrw4 error killing pod: [failed to "KillContainer" for "test-defaultgroup" with KillContainerError: "rpc error: code = Unknown desc = failed to stop container c6d141612aa37b424ead130f109298ffcb212653171d43728d610bcb3bdd9073: failed to stop container \"c6d141612aa37b424ead130f109298ffcb212653171d43728d610bcb3bdd9073\": failed to find process: <nil>" , failed to "KillPodSandbox" for "fba29286-4520-11ea-b014-005056b87475" with KillPodSandboxError: "rpc error: code = Unknown desc = failed to stop container k8s_test-defaultgroup_test-defaultgroup-vqrw4_default_fba29286-4520-11ea-b014-005056b87475_0 in pod sandbox f42ce723a4ea1560a9eb988b086c759feb3450ed28b551b3f768ddf5a4aca889: failed to stop container \"c6d141612aa37b424ead130f109298ffcb212653171d43728d610bcb3bdd9073\": failed to find process: <nil>" ]
4m53s Normal Killing pod/test-defaultgroup-fg9zm Stopping container test-defaultgroup
3m22s Normal Killing pod/test-defaultgroup-zxzz9 Stopping container test-defaultgroup
3m19s Normal Killing pod/test-defaultgroup-vqrw4 Stopping container test-defaultgroup

When I manually try to remove the container with the crictl command, I get the following error:

[core@compute-2 ~]$ sudo crictl rm fe3d97bf8a70b
Removing the container "fe3d97bf8a70b" failed: rpc error: code = Unknown desc = unable to stop container fe3d97bf8a70be26327b42aae31fde08231244eefc94599a3bbc1282c4c160e0: failed to stop container fe3d97bf8a70be26327b42aae31fde08231244eefc94599a3bbc1282c4c160e0: failed to stop container "fe3d97bf8a70be26327b42aae31fde08231244eefc94599a3bbc1282c4c160e0": failed to find process: <nil>
[core@compute-2 ~]$
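
In case it helps with triage, the event listing above can be reduced to the names of the affected pods with a small shell sketch. The helper names stuck_pods and print_force_delete are hypothetical (they are not part of oc or kubectl), and the awk filter assumes the default column layout of 'kubectl get events' (LAST SEEN, TYPE, REASON, OBJECT, MESSAGE). The second helper only prints forced-delete commands for review. As far as I understand, a forced delete removes the pod object from the API server without waiting for the node to confirm, so it would not by itself stop a container that is still running on a worker.

```shell
#!/usr/bin/env bash
# Sketch only: stuck_pods and print_force_delete are hypothetical helper names.

# Read `kubectl get events` output on stdin and print the unique names of
# pods that reported a FailedKillPod warning.
# Assumes the default column layout: LAST SEEN, TYPE, REASON, OBJECT, MESSAGE.
stuck_pods() {
  awk '$3 == "FailedKillPod" && $4 ~ /^pod\// {
         sub(/^pod\//, "", $4)   # strip the "pod/" prefix from the OBJECT column
         print $4
       }' | sort -u
}

# Print (do not run) a forced delete command for each pod name on stdin,
# so the commands can be reviewed before anything is removed.
print_force_delete() {
  while IFS= read -r pod; do
    if [ -n "$pod" ]; then
      printf 'oc delete pod %s --grace-period=0 --force\n' "$pod"
    fi
  done
}
```

Possible usage on the master node: kubectl get events --sort-by='{.lastTimestamp}' | stuck_pods | print_force_delete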
_______________________________________________
dev mailing list
dev@lists.openshift.redhat.com
http://lists.openshift.redhat.com/openshiftmm/listinfo/dev