Itohan,

Hi, the k8s cluster is sensitive to network lag and timeouts. We are still working out whether the out-of-the-box k8s config is sufficient for all cloud-native deployments. On some systems the cluster may degrade over a period of a couple of days to a month. I have systems with 1500+ k8s container restarts that are still OK after 3 weeks, but on some clouds the cluster degrades faster. Again, the tool still needs to be fully optimized.
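To keep an eye on the restart counts mentioned above, here is a minimal Python sketch that totals the RESTARTS column from saved `kubectl get pods --all-namespaces` output. The function name `total_restarts` is hypothetical, and the sketch assumes the usual six-column layout (NAMESPACE, NAME, READY, STATUS, RESTARTS, AGE); adjust the column index if your kubectl prints something different.

```python
def total_restarts(kubectl_output: str) -> int:
    """Sum the RESTARTS column of `kubectl get pods --all-namespaces` output.

    Assumed column order: NAMESPACE NAME READY STATUS RESTARTS AGE.
    """
    total = 0
    for line in kubectl_output.splitlines():
        fields = line.split()
        # Skip the header row, blank lines, and anything that does not
        # have a numeric value in the fifth column.
        if len(fields) < 6 or not fields[4].isdigit():
            continue
        total += int(fields[4])
    return total


# Illustrative sample in the assumed layout; not output from a real cluster.
SAMPLE = """\
NAMESPACE     NAME                             READY   STATUS    RESTARTS   AGE
kube-system   heapster-76b8cd7b5-g7p6n         1/1     Running   0          8m
kube-system   kube-dns-5d7b4487c9-jjgvg        3/3     Running   0          8m
kube-system   tiller-deploy-54bcc55dd5-756gn   1/1     Running   0          2m
"""

if __name__ == "__main__":
    print(total_restarts(SAMPLE))
```

Saving the kubectl output to a file once a day and running it through a function like this gives a rough trend line, which is usually more telling than a single snapshot.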
How many servers are in your cluster? I ask because you can run either 1 co-located master/cluster or 3+ nodes for etcd stability, unless you need to test actual failover. You are likely running multiple hosts, since otherwise you would need the additional kubelet argument --max-pods=500 to go over the default limit of 110 pods per node.

Is your cluster saturated? Try to run with under 70% RAM allocation. For example, ONAP currently needs 160G, so you would need 220+G for a system that can reschedule pods to the remaining nodes after a node failure.

I would recommend several kubectl commands to triage your system, but if tiller is down then you won't be able to run them.

Start with your ~/.kube/config. If you are using root as well as your ubuntu (non-root) user, you would need to add the config to both of those profiles on the master.

Before this, though, verify that the correct IP was used when you registered your system, either with the script or with the "add host" button in the rancher GUI.

Another thing: did the IP of your VM change? Try to use static IPs/EIPs for the master, or register a domain name against it. If the IP changes, then the .kube/config will need to be updated.

A third thing: did you wait until the rancher server displayed a token instead of user/pass in the add-hosts page? This is a bug in 1.6 that is avoided by waiting a couple of minutes after rancher_server docker startup, like the script does.

Worst case, just re-add your host(s) by regenerating the docker command from the GUI, like the script does:
https://git.onap.org/logging-analytics/tree/deploy/rancher/oom_rancher_setup.sh#n162

Did you run one of the scripts like the following to get your system up and hosts attached? This is the step after you have added the hosts above:
https://git.onap.org/logging-analytics/tree/deploy/rancher/oom_rancher_setup.sh#n209

I ask this just to verify that you upgraded the server side of tiller:

    sudo helm init --upgrade

When tiller gets back up....
Do a list of the pods for the k8s system. You should see something like the listing at
https://wiki.onap.org/display/DW/Cloud+Native+Deployment#CloudNativeDeployment-Scriptedundercloud(Helm/Kubernetes/Docker)andONAPinstall-SingleVM-256G

Something like:

    kubectl get pods --all-namespaces
    kube-system   heapster-76b8cd7b5-g7p6n               1/1   Running   0   8m
    kube-system   kube-dns-5d7b4487c9-jjgvg              3/3   Running   0   8m
    kube-system   kubernetes-dashboard-f9577fffd-qldrw   1/1   Running   0   8m
    kube-system   monitoring-grafana-997796fcf-g6tr7     1/1   Running   0   8m
    kube-system   monitoring-influxdb-56fdcd96b-x2kvd    1/1   Running   0   8m
    kube-system   tiller-deploy-54bcc55dd5-756gn         1/1   Running   0   2m

Check your nodes:

    kubectl top nodes

From: onap-discuss@lists.onap.org <onap-discuss@lists.onap.org> On Behalf Of Ukponmwan, Itohan
Sent: Wednesday, October 24, 2018 5:29 PM
To: onap-discuss@lists.onap.org
Subject: [onap-discuss] Issues with Tiller (Cannot Connect to Tiller)

Hi All,

I have OOM ONAP deployed. It was working, but now I have issues connecting to tiller using helm or kubectl commands. When I do a helm version, I see the following:

    Client: &version.Version{SemVer:"v2.9.1", GitCommit:"20adb27c7c5868466912eebdf6664e7390ebe710", GitTreeState:"clean"}
    Error: cannot connect to Tiller

Does anyone know how to fix this error? It is blocking my HPA integration testing.

Thanks,
Itohan