On Thu, 7 Nov 2019 at 12:28, Nathanaël Blanchet <[email protected]> wrote:
> On 07/11/2019 at 11:16, Roy Golan wrote:
>
> On Thu, 7 Nov 2019 at 11:23, Nathanaël Blanchet <[email protected]> wrote:
>>
>> On 07/11/2019 at 07:18, Roy Golan wrote:
>>
>> On Thu, 7 Nov 2019 at 00:10, Nathanaël Blanchet <[email protected]> wrote:
>>>
>>> On 05/11/2019 at 21:50, Roy Golan wrote:
>>>
>>> On Tue, 5 Nov 2019 at 22:46, Roy Golan <[email protected]> wrote:
>>>>
>>>> On Tue, 5 Nov 2019 at 20:28, Nathanaël Blanchet <[email protected]> wrote:
>>>>>
>>>>> On 05/11/2019 at 18:22, Roy Golan wrote:
>>>>>
>>>>> On Tue, 5 Nov 2019 at 19:12, Nathanaël Blanchet <[email protected]> wrote:
>>>>>>
>>>>>> On 05/11/2019 at 13:54, Roy Golan wrote:
>>>>>>
>>>>>> On Tue, 5 Nov 2019 at 14:52, Nathanaël Blanchet <[email protected]> wrote:
>>>>>>>
>>>>>>> I tried openshift-install after compiling, but no oVirt provider is
>>>>>>> available... So what do you mean when you say "give it a try"? Maybe
>>>>>>> only provisioning oVirt with the terraform module?
>>>>>>>
>>>>>>> [root@vm5 installer]# bin/openshift-install create cluster
>>>>>>> ? Platform [Use arrows to move, space to select, type to filter, ? for more help]
>>>>>>> > aws
>>>>>>>   azure
>>>>>>>   gcp
>>>>>>>   openstack
>>>>>>
>>>>>> It's not merged yet. Please pull this image and work with it as a
>>>>>> container: quay.io/rgolangh/openshift-installer
>>>>>>
>>>>>> A little feedback as you asked:
>>>>>>
>>>>>> [root@openshift-installer ~]# docker run -it 56e5b667100f create cluster
>>>>>> ? Platform ovirt
>>>>>> ? Enter oVirt's api endpoint URL https://air-dev.v100.abes.fr/ovirt-engine/api
>>>>>> ? Enter ovirt-engine username admin@internal
>>>>>> ? Enter password **********
>>>>>> ? Pick the oVirt cluster Default
>>>>>> ? Pick a VM template centos7.x
>>>>>> ? Enter the internal API Virtual IP 10.34.212.200
>>>>>> ? Enter the internal DNS Virtual IP 10.34.212.100
>>>>>> ? Enter the ingress IP 10.34.212.50
>>>>>> ? Base Domain oc4.localdomain
>>>>>> ? Cluster Name test
>>>>>> ? Pull Secret [? for help] *************************************
>>>>>> INFO Creating infrastructure resources...
>>>>>> INFO Waiting up to 30m0s for the Kubernetes API at https://api.test.oc4.localdomain:6443...
>>>>>> ERROR Attempted to gather ClusterOperator status after installation failure: listing ClusterOperator objects: Get https://api.test.oc4.localdomain:6443/apis/config.openshift.io/v1/clusteroperators: dial tcp: lookup api.test.oc4.localdomain on 10.34.212.100:53: no such host
>>>>>> INFO Pulling debug logs from the bootstrap machine
>>>>>> ERROR Attempted to gather debug logs after installation failure: failed to create SSH client, ensure the proper ssh key is in your keyring or specify with --key: failed to initialize the SSH agent: failed to read directory "/output/.ssh": open /output/.ssh: no such file or directory
>>>>>> FATAL Bootstrap failed to complete: waiting for Kubernetes API: context deadline exceeded
>>>>>>
>>>>>> - 6 VMs are successfully created, thin-provisioned from the template
>>>>>> - each VM is provisioned by cloud-init
>>>>>> - the step "INFO Waiting up to 30m0s for the Kubernetes API at
>>>>>>   https://api.test.oc4.localdomain:6443..." fails. It seems that the
>>>>>>   DNS pod is not up at this time.
>>>>>> - right at this moment, there is no more visibility on what is done or
>>>>>>   what goes wrong... what's happening there? I suppose some kind of
>>>>>>   playbook is downloading some kind of images...
>>>>>> - the "pull secret" step is not clear: we must have a Red Hat account
>>>>>>   at https://cloud.redhat.com/openshift/install/ to get a key like:
>>>>>>
>>>>>>   {"auths":{"cloud.openshift.com":{"auth":"b3BlbnNoaWZ0LXJlbGVhc2UtZGV2K2V4cGxvaXRhYmVzZnIxdGN0ZnR0dmFnMHpuazMxd2IwMnIwenV1MDg6TE9XVzFQODM1NzNJWlI4MlZDSUEyTFdEVlJJS0U5VTVWM0NTSUdOWjJH********************==","email":"[email protected]"},"quay.io":{"auth":"b3BlbnNoaWZ0LXJlbGVhc2UtZGV2K2V4cGxvaXRhYmVzZnIxdGN0ZnR0dmFnMHpuazMxd2IwMnIwenV1MDg6TE9XVzFQODM1NzNJWlI4MlZDSUEyTFdEVlJJS0U5VTVWM0NTSUdOWjJH********************==","email":"[email protected]"},"registry.connect.redhat.com":{"auth":"NTI0MjkwMnx1aGMtMVRDVEZUVFZBRzBaTkszMXdCMDJSMFp1VTA4OmV5SmhiR2NpT2lKU1V6VXhNaUo5LmV5SnpkV0lpT2lJMk4ySTJNREV3WXpObE1HSTBNbVE0T1RGbVpUZGxa**********************","email":"[email protected]"},"registry.redhat.io":{"auth":"NTI0MjkwMnx1aGMtMVRDVEZUVFZBRzBaTkszMXdCMDJSMFp1VTA4OmV5SmhiR2NpT2lKU1V6VXhNaUo5LmV5SnpkV0lpT2lJMk4ySTJNREV3WXpObE1HSTBNbVE0T1RGbVpUZGxa**********************","email":"[email protected]"}}}
>>>>>>
>>>>>> Can you tell me if I'm doing something wrong?
>>>>>
>>>>> What is the template you are using? I don't think it's an RHCOS (Red Hat
>>>>> CoreOS) template; it looks like CentOS?
>>>>>
>>>>> Use this gist to import the template:
>>>>> https://gist.github.com/rgolangh/adccf6d6b5eaecaebe0b0aeba9d3331b
>>>>>
>>>>> Unfortunately, the result is the same with the RHCOS template...
>>>>
>>>> Make sure that:
>>>> - the IPs supplied are free (not already taken) and belong to the VM
>>>>   network of those master VMs
>>>> - "localdomain" or a local domain suffix shouldn't be used
>>>> - your ovirt-engine is version 4.3.7 or master
>>>
>>> I didn't mention that you can provide any domain name, even a
>>> non-existing one. When the bootstrap phase is done, the installation
>>> will tear down the bootstrap machine. At this stage, if you are using a
>>> non-existing domain, you need to add the DNS Virtual IP you provided to
>>> your resolv.conf so the installation can resolve
>>> api.$CLUSTER_NAME.$CLUSTER_DOMAIN.
>>>
>>> Also, you have a log under $INSTALL_DIR/.openshift_install.log
>>>
>>> I tried several things with your advice, but I'm still stuck at the
>>> https://api.test.oc4.localdomain:6443/version?timeout=32s test
>>>
>>> with logs:
>>>
>>> time="2019-11-06T20:21:15Z" level=debug msg="Still waiting for the Kubernetes API: the server could not find the requested resource"
>>>
>>> So it means DNS resolution and the network are now good and ignition
>>> provisioning is OK, but something goes wrong with the bootstrap VM.
>>>
>>> Now if I log into the bootstrap VM, I can see an SELinux message, but it
>>> may not be relevant:
>>>
>>> SELinux: mount invalid. Same Superblock, different security settings for (dev mqueue, type mqueue).
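Roy's resolv.conf workaround above can be sketched as a small helper. The DNS VIP and the api name come from this thread; wrapping it in a function that keeps a backup file is my own addition, not installer behaviour:

```shell
# Put the cluster's internal DNS Virtual IP first in a resolv.conf-style
# file (keeping a backup), so api.$CLUSTER_NAME.$CLUSTER_DOMAIN resolves
# on the machine running the installer while bootstrap is in progress.
add_dns_vip() {
  vip="$1"; resolv="$2"
  cp "$resolv" "$resolv.bak"
  # Prepend the VIP so it is consulted before the existing nameservers.
  printf 'nameserver %s\n' "$vip" | cat - "$resolv.bak" > "$resolv"
}

# Usage with the values from this thread:
#   add_dns_vip 10.34.212.100 /etc/resolv.conf
#   dig +short api.test.oc4.localdomain @10.34.212.100
```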
>>> Some other clues with journalctl:
>>>
>>> journalctl -b -f -u bootkube
>>>
>>> Nov 06 21:55:40 localhost bootkube.sh[2101]: {"level":"warn","ts":"2019-11-06T21:55:40.661Z","caller":"clientv3/retry_interceptor.go:61","msg":"retrying of unary invoker failed","target":"endpoint://client-7beef51d-daad-4b46-9497-8e135e528f7c/etcd-1.test.oc4.localdomain:2379","attempt":0,"error":"rpc error: code = DeadlineExceeded desc = latest connection error: connection error: desc = \"transport: Error while dialing dial tcp: lookup etcd-1.test.oc4.localdomain on 10.34.212.101:53: no such host\""}
>>> Nov 06 21:55:40 localhost bootkube.sh[2101]: {"level":"warn","ts":"2019-11-06T21:55:40.662Z","caller":"clientv3/retry_interceptor.go:61","msg":"retrying of unary invoker failed","target":"endpoint://client-03992fc6-5a87-4160-9b87-44ec6e82f7cd/etcd-2.test.oc4.localdomain:2379","attempt":0,"error":"rpc error: code = DeadlineExceeded desc = latest connection error: connection error: desc = \"transport: Error while dialing dial tcp: lookup etcd-2.test.oc4.localdomain on 10.34.212.101:53: no such host\""}
>>> Nov 06 21:55:40 localhost bootkube.sh[2101]: {"level":"warn","ts":"2019-11-06T21:55:40.662Z","caller":"clientv3/retry_interceptor.go:61","msg":"retrying of unary invoker failed","target":"endpoint://client-00db28a7-5188-4666-896b-e37c88ad3ae9/etcd-0.test.oc4.localdomain:2379","attempt":0,"error":"rpc error: code = DeadlineExceeded desc = latest connection error: connection error: desc = \"transport: Error while dialing dial tcp: lookup etcd-0.test.oc4.localdomain on 10.34.212.101:53: no such host\""}
>>> Nov 06 21:55:40 localhost bootkube.sh[2101]: https://etcd-1.test.oc4.localdomain:2379 is unhealthy: failed to commit proposal: context deadline exceeded
>>> Nov 06 21:55:40 localhost bootkube.sh[2101]: https://etcd-2.test.oc4.localdomain:2379 is unhealthy: failed to commit proposal: context deadline exceeded
>>> Nov 06 21:55:40 localhost bootkube.sh[2101]: https://etcd-0.test.oc4.localdomain:2379 is unhealthy: failed to commit proposal: context deadline exceeded
>>> Nov 06 21:55:40 localhost bootkube.sh[2101]: Error: unhealthy cluster
>>> Nov 06 21:55:40 localhost podman[61210]: 2019-11-06 21:55:40.720514151 +0000 UTC m=+5.813853296 container died 7db3014e3f19c61775bac2a7a155eeb8521a6b78fea0d512384dd965cb0b8b01 (image=registry.svc.ci.openshift.org/origin/4.3-2019-10-29-180250@sha256:696a0ba7d344e625ec71d915d74d387fc2a951b879d4d54bdc69d460724c01ae, name=etcdctl)
>>> Nov 06 21:55:40 localhost podman[61210]: 2019-11-06 21:55:40.817475095 +0000 UTC m=+5.910814273 container remove 7db3014e3f19c61775bac2a7a155eeb8521a6b78fea0d512384dd965cb0b8b01 (image=registry.svc.ci.openshift.org/origin/4.3-2019-10-29-180250@sha256:696a0ba7d344e625ec71d915d74d387fc2a951b879d4d54bdc69d460724c01ae, name=etcdctl)
>>> Nov 06 21:55:40 localhost bootkube.sh[2101]: etcdctl failed. Retrying in 5 seconds...
>>>
>>> It seems to be a DNS resolution issue again.
>>>
>>> [user1@localhost ~]$ dig api.test.oc4.localdomain +short
>>> 10.34.212.201
>>>
>>> [user1@localhost ~]$ dig etcd-2.test.oc4.localdomain +short
>>> (nothing)
>>>
>>> So what do you think about that?
>>
>> The key here is the masters - they need to boot, get ignition from the
>> bootstrap machine, and start publishing their IPs and hostnames.
>>
>> Connect to a master, check its hostname, and check its running or failing
>> containers with `crictl ps -a` as the root user.
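The failing lookups above (the etcd-N names plus, as the later crictl logs show, the `_etcd-server-ssl._tcp` SRV record that etcd discovery uses) can be checked in one pass against the DNS VIP. A sketch; the exact name list is an assumption inferred from the errors in this thread, and the "dry" mode that only prints the queries is mine:

```shell
# Query every name the bootstrap/etcd discovery appears to need, via the
# DNS VIP. In "dry" mode the dig commands are printed instead of executed,
# which also makes the sketch testable without a live cluster.
check_cluster_dns() {
  vip="$1"; domain="$2"; mode="${3:-run}"
  for name in "api.$domain" "etcd-0.$domain" "etcd-1.$domain" "etcd-2.$domain"; do
    if [ "$mode" = dry ]; then
      echo "dig +short $name @$vip"
    else
      dig +short "$name" @"$vip"
    fi
  done
  # etcd member discovery resolves an SRV record, per the error in the logs:
  if [ "$mode" = dry ]; then
    echo "dig +short _etcd-server-ssl._tcp.$domain SRV @$vip"
  else
    dig +short "_etcd-server-ssl._tcp.$domain" SRV @"$vip"
  fi
}

# Usage with the values from this thread:
#   check_cluster_dns 10.34.212.100 test.oc4.localdomain
```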
>> You were right:
>>
>> # crictl ps -a
>> CONTAINER ID   IMAGE                                                             CREATED         STATE    NAME       ATTEMPT  POD ID
>> 744cb8e654705  e77034cf36baff5e625acbba15331db68e1d84571f977d254fd833341158daa8  4 minutes ago   Running  discovery  75       9462e9a8ca478
>> 912ba9db736c3  e77034cf36baff5e625acbba15331db68e1d84571f977d254fd833341158daa8  14 minutes ago  Exited   discovery  74       9462e9a8ca478
>>
>> # crictl logs 744cb8e654705
>> E1107 08:10:04.262330       1 run.go:67] error looking up self for candidate IP 10.34.212.227: lookup _etcd-server-ssl._tcp.test.oc4.localdomain on 10.34.212.51:53: no such host
>>
>> # hostname
>> localhost
>>
>> Conclusion: discovery didn't publish IPs and hostnames to CoreDNS because
>> the master didn't get its name master-0.test.oc4.localdomain during the
>> provisioning phase.
>>
>> I changed the master-0 hostname and re-ran ignition to verify:
>>
>> # hostnamectl set-hostname master-0.test.oc4.localdomain
>> # touch /boot/ignition.firstboot && rm -rf /etc/machine-id && reboot
>>
>> After the reboot completed, there is no longer an exited discovery container:
>>
>> CONTAINER ID   IMAGE                                                             CREATED         STATE    NAME                ATTEMPT  POD ID
>> e701efa8bc583  77ec5e26cc676ef2bf5c42dd40e55394a11fb45a3e2d7e95cbaf233a1eef472f  20 seconds ago  Running  coredns             1        cbabc53322ac8
>> 2c7bc6abb5b65  d73eca122bd567a3a1f70fa5021683bc17dd87003d05d88b1cdd0215c55049f6  20 seconds ago  Running  mdns-publisher      1        6f8914ff9db35
>> b3f619d5afa2c  7aa184de043265814f9a775968234ac3280a285056da773f1aba0917e9615370  21 seconds ago  Running  haproxy-monitor     1        0e5c209496787
>> 07769ce79b032  7aa184de043265814f9a775968234ac3280a285056da773f1aba0917e9615370  21 seconds ago  Running  keepalived-monitor  1        02cf141d01a29
>> fb20d66b81254  e77034cf36baff5e625acbba15331db68e1d84571f977d254fd833341158daa8  21 seconds ago  Running  discovery           77       562f32067e0a7
>> 476b07599260e  86a34bc5edd3e70073313f97bfd51ed8937658b341dc52334fb98ea6896ebdc2  22 seconds ago  Running  haproxy             1        0e5c209496787
>> 26b53050a412b  9f94e500f85a735ec212ffb7305e0b63f7151a5346e41c2d5d293c8456f6fa42  22 seconds ago  Running  keepalived          1        02cf141d01a29
>> 30ce48453854b  7aa184de043265814f9a775968234ac3280a285056da773f1aba0917e9615370  22 seconds ago  Exited   render-config       1        cbabc53322ac8
>> ad3ab0ae52077  7aa184de043265814f9a775968234ac3280a285056da773f1aba0917e9615370  22 seconds ago  Exited   render-config       1        6f8914ff9db35
>> 650d62765e9e1  registry.svc.ci.openshift.org/origin/4.3-2019-10-29-180250@sha256:9a7e8297f7b70ee3a11fcbe4a78c59a5861e1afda5657a7437de6934bdc2458e  13 hours ago  Exited  coredns          0  2ae0512b3b6ac
>> 481969ce49bb9  registry.svc.ci.openshift.org/origin/4.3-2019-10-29-180250@sha256:768194132b4dbde077e32de8801c952265643da00ae161f1ee560fabf6ed1f8e  13 hours ago  Exited  mdns-publisher   0  d49754042b792
>> 3594d9d261ca7  registry.svc.ci.openshift.org/origin/4.3-2019-10-29-180250@sha256:8c3b0223699801a9cb93246276da4746fa4d6fa66649929b2d9b702c17dac75d  13 hours ago  Exited  haproxy-monitor  0  3476219058ba8
>> 88b13ec02a5c1  7aa184de043265814f9a775968234ac3280a285056da773f1aba0917e9615370  13 hours ago  Exited  keepalived-monitor  0  a3e13cf07c04f
>> 1ab721b5599ed  registry.svc.ci.openshift.org/origin/4.3-2019-10-29-180250@sha256:629d73fdd28beafda788d2248608f90c7ed048357e250f3e855b9462b92cfe60  13 hours ago
>>
>> because DNS registration is OK:
>>
>> [user1@master-0 ~]$ dig etcd-0.test.oc4.localdomain +short
>> 10.34.212.227
>>
>> CONCLUSION:
>>
>> - none of the RHCOS VMs is correctly provisioned with its targeted
>>   hostname, so they all stay as "localhost".
>
> What is your engine version? The hostname support for ignition is merged
> into 4.3.7 and master.
>
> 4.3.7.1-1.el7

https://gerrit.ovirt.org/c/100397/ was merged 2 days ago, so it will appear
in 4.3.7.2. Sandro, when is 4.3.7.2 due?

> I only upgraded the engine and not vdsm on the hosts, but I suppose hosts
> are not important for ignition

Correct.
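Until the engine-side fix is available, the manual hostname workaround above has to be repeated on each master. A sketch of the per-node command sequence; the master-N naming follows this thread, running it over SSH as the `core` user is my assumption, and it is destructive (it removes /etc/machine-id and reboots), exactly like the manual steps quoted above:

```shell
# Emit the command sequence that re-runs ignition with the hostname the
# node should have had, mirroring the manual steps from this thread.
fix_master_hostname() {
  fqdn="$1"
  printf '%s\n' \
    "hostnamectl set-hostname $fqdn" \
    "touch /boot/ignition.firstboot" \
    "rm -rf /etc/machine-id" \
    "reboot"
}

# Hypothetical usage, one master at a time:
#   fix_master_hostname master-0.test.oc4.localdomain | ssh core@<master-0-ip> sudo sh
```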
>> - Cloud-init syntax for the hostname is OK, but it is not provisioned by
>>   ignition.
>>
>> Why not provision these hostnames with a JSON snippet like this?
>>
>> {
>>   "ignition": { "version": "2.2.0" },
>>   "storage": {
>>     "files": [{
>>       "filesystem": "root",
>>       "path": "/etc/hostname",
>>       "mode": 420,
>>       "contents": { "source": "data:,master-0.test.oc4.localdomain" }
>>     }]
>>   }
>> }
>>
>>>>>>> On 05/11/2019 at 12:24, Roy Golan wrote:
>>>>>>>
>>>>>>> On Tue, 5 Nov 2019 at 13:22, Nathanaël Blanchet <[email protected]> wrote:
>>>>>>>>
>>>>>>>> Hello,
>>>>>>>>
>>>>>>>> I'm interested in installing OKD on oVirt with the official
>>>>>>>> openshift installer (https://github.com/openshift/installer), but
>>>>>>>> oVirt is not yet supported.
>>>>>>>
>>>>>>> If you want to give it a try and supply feedback, I'll be glad.
>>>>>>>
>>>>>>>> Regarding https://bugzilla.redhat.com/show_bug.cgi?id=1578255 and
>>>>>>>> https://lists.ovirt.org/archives/list/[email protected]/thread/EF7OQUVTY53GV3A7NVQVUT7UCUYKK5CH/,
>>>>>>>> how should oVirt 4.3.7 integrate the openshift installer with
>>>>>>>> terraform?
>>>>>>>
>>>>>>> Terraform is part of it, yes. It is what we use to spin up the first
>>>>>>> 3 masters, plus a bootstrapping machine.
>>>>>>>
>>>>>>>> --
>>>>>>>> Nathanaël Blanchet
>>>>>>>>
>>>>>>>> Supervision réseau
>>>>>>>> Pôle Infrastrutures Informatiques
>>>>>>>> 227 avenue Professeur-Jean-Louis-Viala
>>>>>>>> 34193 MONTPELLIER CEDEX 5
>>>>>>>> Tél. 33 (0)4 67 54 84 55
>>>>>>>> Fax 33 (0)4 67 54 84 14
>>>>>>>> [email protected]
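The Ignition snippet proposed above can be generated per node and sanity-checked locally. A sketch; the helper name and validating with python3 are my assumptions, not part of any oVirt or installer tooling:

```shell
# Emit an Ignition 2.2.0 config that writes /etc/hostname for one node,
# following the snippet proposed in this thread (mode 420 = 0644 octal).
make_hostname_ignition() {
  fqdn="$1"
  cat <<EOF
{
  "ignition": { "version": "2.2.0" },
  "storage": {
    "files": [{
      "filesystem": "root",
      "path": "/etc/hostname",
      "mode": 420,
      "contents": { "source": "data:,${fqdn}" }
    }]
  }
}
EOF
}

# Hypothetical usage:
#   make_hostname_ignition master-0.test.oc4.localdomain > master-0.ign
#   python3 -m json.tool master-0.ign >/dev/null && echo valid
```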
_______________________________________________
Users mailing list -- [email protected]
To unsubscribe send an email to [email protected]
Privacy Statement: https://www.ovirt.org/site/privacy-policy/
oVirt Code of Conduct: https://www.ovirt.org/community/about/community-guidelines/
List Archives: https://lists.ovirt.org/archives/list/[email protected]/message/GZ64UU7KYDYJMZW2BSPWVKLX3EXMYMYO/

