On Thu, 21 Nov 2019 at 10:58, Clayton Coleman <ccole...@redhat.com> wrote:
>
> On Nov 17, 2019, at 9:34 PM, Joel Pearson <japear...@agiledigital.com.au>
> wrote:
>
> So, I'm running OpenShift 4.2 on Azure UPI following this blog article:
> https://blog.openshift.com/openshift-4-1-upi-environment-deployment-on-microsoft-azure-cloud/
> with a few customisations on the terraform side.
>
> One of the main differences, it seems, is how the router/ingress is
> handled. Normal Azure uses load balancers, but UPI Azure uses a regular
> router (the kind I'm used to seeing in 3.x), which is configured by
> setting "HostNetwork" as the endpoint publishing strategy
> <https://github.com/JuozasA/ocp4-azure-upi/blob/master/ingresscontroller-default.yaml#L9-L10>
>
>
> This sounds like a bug in Azure UPI. IPI is the reference architecture,
> it shouldn’t have a default divergent from the ref arch.
>

In the blog, he mentions that he changed the architecture because the
default creates a public-facing load balancer. In my case I'm not allowed
to create a public load balancer at all. Additionally, I can't use Azure's
Public or Private DNS either, so I had to customise the terraform templates
even more. Maybe supported UPI Azure will allow internal-facing load
balancers?

>
>
> It was all working fine in OpenShift 4.2.0 and 4.2.2, but when I upgraded
> to OpenShift 4.2.4, the router stopped listening on ports 80 and 443. I
> could see the pod running with "crictl ps", but a "netstat -tpln" didn't
> show anything listening.
>
> I tried updating the version back from 4.2.4 to 4.2.2, but I
> accidentally used the 4.1.22 image digest value, so I quickly reverted back
> to 4.2.4 once I saw the apiservers coming up as 4.1.22. I then noticed that
> there was a 4.2.7 release on the candidate-4.2 channel, so I switched to
> that, and ingress started working properly again.
>
> So my question is, what is the strategy for recovering from a failed
> update? Do I need to have etcd backups and then restore the cluster by
> restoring etcd? I.e.
> https://docs.openshift.com/container-platform/4.2/backup_and_restore/disaster_recovery/scenario-2-restoring-cluster-state.html
>
> The upgrade page
> <https://docs.openshift.com/container-platform/4.2/updating/updating-cluster-between-minor.html>
> specifically says "Reverting your cluster to a previous version, or a
> rollback, is not supported. Only upgrading to a newer version is
> supported." So is it an expectation for a production cluster that you would
> restore from backup if the cluster isn't usable?
>
>
> Backup, yes. If you could open a bug for the documentation that would be
> great.
>

Thanks, raised it here: https://bugzilla.redhat.com/show_bug.cgi?id=1777155

>
>
> Maybe the upgrade page should mention taking backups? Especially if there
> is no rollback option.
>
> _______________________________________________
> users mailing list
> users@lists.openshift.redhat.com
> http://lists.openshift.redhat.com/openshiftmm/listinfo/users
>
>
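
For anyone who hits the same symptom after an update (router pod running,
but nothing listening on 80 and 443), the check can be scripted from a
machine that can reach the nodes. This is only a minimal sketch in Python
using the standard library. The function names are mine, the host you pass
in is whatever node runs the HostNetwork router in your cluster, and a
successful TCP connect only shows that something accepts connections on the
port, not that the router is serving routes correctly.

```python
import socket


def port_is_listening(host, port, timeout=2.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        # create_connection resolves the host and attempts a TCP connect.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and name resolution failures.
        return False


def check_router(host, ports=(80, 443)):
    """Map each port to whether something accepts connections on it."""
    return {port: port_is_listening(host, port) for port in ports}
```

Calling check_router() with the address of a router node returns something
like {80: True, 443: False}, which is enough to tell this failure apart
from a DNS or firewall problem further out.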