Hi,

This is a Kubernetes undercloud issue, not specific to ONAP. I regularly restart my VMware VMs, especially when duplicating VMs across laptops. There are a few scenarios, listed below. I would like to check whether you are experiencing the IP change scenario (#2), which renders your cluster unusable until the IP in ~/.kube/config is modified to match. Is this the issue?
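As a rough illustration of the kubeconfig change mentioned above, here is a minimal Python sketch that swaps the host in every `server:` URL of a kubeconfig while keeping the scheme, port and path. The function names, the sample config and both IP addresses are made up for the example, and it assumes an IPv4 address or hostname (not a bracketed IPv6 address). Check it against your own ~/.kube/config before relying on it.

```python
import re
import shutil
from pathlib import Path

# Matches a kubeconfig "server:" line and captures the host separately so that
# the scheme, port and path are preserved. Assumes IPv4 or a hostname.
_SERVER_LINE = re.compile(
    r"""^(\s*server:\s*["']?https?://)([^:/\s"']+)(.*)$""", re.MULTILINE
)


def update_kubeconfig_server(config_text: str, new_host: str) -> str:
    """Return config_text with the host of every server URL set to new_host."""
    return _SERVER_LINE.sub(
        lambda m: m.group(1) + new_host + m.group(3), config_text
    )


def rewrite_kubeconfig(path: Path, new_host: str) -> None:
    """Back up the kubeconfig at path (as <name>.bak), then rewrite it in place."""
    shutil.copy2(path, path.with_name(path.name + ".bak"))
    path.write_text(update_kubeconfig_server(path.read_text(), new_host))


# Hypothetical sample; the real file is generated by your cluster tooling.
SAMPLE_CONFIG = """\
apiVersion: v1
kind: Config
clusters:
- cluster:
    server: "https://192.168.1.10:8880/r/projects/1a7/kubernetes:6443"
  name: onap
"""

print(update_kubeconfig_server(SAMPLE_CONFIG, "192.168.1.25"))
```

To apply it to a real file, something like `rewrite_kubeconfig(Path.home() / ".kube" / "config", "<new VM IP>")` would do it, leaving a `config.bak` copy next to the original.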
1. OK: I shut down and restart the original VM to change the number of vCores. The VM starts up within a couple of minutes with all pods up (all 1/1, 2/2, 3/3), sometimes with ONAP pods running, sometimes without.
2. IP: I do the above but move or copy the VM. This gives the VM a different IP, which requires a change in ~/.kube/config. These start up fine as well.
3. Upgrade: when I upgrade the system from, say, k8s 1.10 to 1.11, everything goes, and I do a docker stop/rm on the rancher server/client and reinstall. No need in this case for a full clean - https://wiki.onap.org/display/DW/ONAP+Development#ONAPDevelopment-RemoveaDeployment
4. OK: on public cloud I register an elastic IP and create a route53 DNS record. If my spot VM restarts with a different IP, no problem, since I re-attach the persistent EIP that rancher was originally registered to.

Could you post details of which pods are having issues, particularly whether the kubernetes pods are up first? If your cluster IP has changed, it will be more evident.

Thank you
/michael

From: [email protected] <[email protected]> On Behalf Of Syed Atif Husain
Sent: Tuesday, November 27, 2018 12:03 AM
To: [email protected]; [email protected]
Subject: Re: [onap-discuss] Has anyone had success in restarting the #kubernetes cluster after a power outage with an #OOM #Beijing ONAP

I have faced the same issue. But haven’t found a solution so far except for reinstalling ONAP.

Regards,
Atif

From: [email protected] <[email protected]> On Behalf Of [email protected]
Sent: Tuesday, November 27, 2018 12:10 AM
To: [email protected]
Subject: [onap-discuss] Has anyone had success in restarting the #kubernetes cluster after a power outage with an #OOM #Beijing ONAP

After a power outage about 75% of the pods come back. And for the most part the functionality is not working.
Seeing a bunch of errors for pods that look like this:

container "portal-db-job" in pod "onap-portal-db-config-n6lrn" is waiting to start: PodInitializing

View/Reply Online (#14086): https://lists.onap.org/g/onap-discuss/message/14086
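For collecting the pod details requested earlier in this thread, here is a minimal Python sketch that picks out the pods whose READY count is short (for example 0/1 instead of 1/1) from the text printed by `kubectl get pods --all-namespaces`. The function name and the sample listing are assumptions for illustration. Only the portal pod name comes from the error above, and its namespace and status in the sample are guessed. Pods in `Completed` status are skipped on the assumption that they are finished jobs.

```python
from typing import Dict, List


def pods_not_ready(kubectl_output: str) -> List[Dict[str, str]]:
    """Return pods whose READY column shows fewer ready containers than total."""
    lines = [ln for ln in kubectl_output.splitlines() if ln.strip()]
    if not lines:
        return []
    # Locate columns by header name so the parser works with or without
    # the NAMESPACE column.
    header = lines[0].split()
    name_i = header.index("NAME")
    ready_i = header.index("READY")
    status_i = header.index("STATUS")
    ns_i = header.index("NAMESPACE") if "NAMESPACE" in header else None

    problems = []
    for row in lines[1:]:
        cols = row.split()
        ready, total = cols[ready_i].split("/")
        status = cols[status_i]
        if status == "Completed":
            continue  # assumed to be a finished job, not a failure
        if ready != total:
            problems.append({
                "namespace": cols[ns_i] if ns_i is not None else "",
                "name": cols[name_i],
                "ready": cols[ready_i],
                "status": status,
            })
    return problems


# Hypothetical listing for illustration only.
SAMPLE_OUTPUT = """\
NAMESPACE     NAME                          READY   STATUS            RESTARTS   AGE
kube-system   kube-dns-5d7b4487c9-abcde     3/3     Running           0          2d
onap          onap-portal-db-config-n6lrn   0/1     PodInitializing   0          1h
onap          onap-example-job-xyz12        0/1     Completed         0          1h
"""

for pod in pods_not_ready(SAMPLE_OUTPUT):
    print(pod["namespace"], pod["name"], pod["ready"], pod["status"])
```

If the kube-system pods show up in this list, the problem is more likely at the cluster level (such as the IP change) than in ONAP itself.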
