The 4.2 backports are merged and will be included in tomorrow's build.

On Tue, May 29, 2018 at 10:29 PM, Martin Perina <[email protected]> wrote:
> Master revert patches [1], [2] are merged; 4.2 revert patches [3], [4] are waiting to be merged.
>
> We will repost the patches to master tomorrow and continue to investigate the mysterious host-deploy issue.
>
> Btw, upgrade-from-prev-release on master [5] currently fails with:
>
> 18:59:31 + cp 'ovirt-system-tests/upgrade-from-prevrelease-suite-master/*.repo' exported-artifacts
> 18:59:31 cp: cannot stat 'ovirt-system-tests/upgrade-from-prevrelease-suite-master/*.repo': No such file or directory
> 18:59:31 POST BUILD TASK : FAILURE
>
> So how can we test upgrade from 4.2 to master?
>
> Martin
>
> [1] https://gerrit.ovirt.org/91741
> [2] https://gerrit.ovirt.org/91742
> [3] https://gerrit.ovirt.org/91744
> [4] https://gerrit.ovirt.org/91745
> [5] https://jenkins.ovirt.org/view/oVirt%20system%20tests/job/ovirt-system-tests_manual/2758/console
>
> On Tue, May 29, 2018 at 3:42 PM, Barak Korren <[email protected]> wrote:
>
>> On 29 May 2018 at 16:30, Martin Perina <[email protected]> wrote:
>>
>>> On Tue, May 29, 2018 at 3:12 PM, Dafna Ron <[email protected]> wrote:
>>>
>>>> Martin, do you have any updates? Please note that ovirt-engine has been broken for a few days, so perhaps we should stop merging or revert the original change?
>>>
>>> Still looking at it, here are partial results:
>>>
>>> 1. New host installation: never reproduced; a 4.2 host is always installed fine on a 4.2 engine.
>>> 2. Upgrade: never reproduced; upgrading both a 4.1 engine and host to 4.2 was always successful.
>>> 3. Reinstallation: it happened to me once that the host remained stuck during reinstallation and the whole reinstallation failed due to a timeout.
>>>    - That may be the issue seen in CI, but so far I don't have a reliable reproducer to debug why the host-deploy process on the host gets stuck.
>>
>> Did you try using OST locally? It reproduces consistently with the OST upgrade suite. You can also use the manual job and pass a URL to any engine build beyond the marked patch. But there you'll have the same issue as with the CQ job, where you won't have logs...
>>
>> Note, the process that happens there is AFAIK:
>> 1. The oVirt 4.1 release is installed.
>> 2. engine-setup runs
>> 3. repos are changed to the master repo
>> 4. engine is upgraded
>> 5. bootstrap (including the AddHost that fails) is carried out
>>
>>>> On Tue, May 29, 2018 at 1:26 PM, Piotr Kliczewski <[email protected]> wrote:
>>>>
>>>>> +Martin
>>>>>
>>>>> He is working on it.
>>>>>
>>>>> Thanks,
>>>>> Piotr
>>>>>
>>>>> On Tue, May 29, 2018 at 2:22 PM, Dafna Ron <[email protected]> wrote:
>>>>>
>>>>>> Hi Piotr,
>>>>>>
>>>>>> Any update on this?
>>>>>>
>>>>>> Thanks.
>>>>>> Dafna
>>>>>>
>>>>>> On Mon, May 28, 2018 at 10:59 AM, Piotr Kliczewski <[email protected]> wrote:
>>>>>>
>>>>>>> On Mon, May 28, 2018 at 11:41 AM, Barak Korren <[email protected]> wrote:
>>>>>>> >
>>>>>>> > On 28 May 2018 at 12:38, Piotr Kliczewski <[email protected]> wrote:
>>>>>>> >>
>>>>>>> >> On Mon, May 28, 2018 at 10:57 AM, Barak Korren <[email protected]> wrote:
>>>>>>> >> > Note: we're now seeing a very similar issue in the 4.2 branch as well that seems to have been introduced by the following patch:
>>>>>>> >>
>>>>>>> >> Can you point to a specific job so we could take a look at the logs?
>>>>>>> >
>>>>>>> > Whoops, sorry, here:
>>>>>>> > http://jenkins.ovirt.org/job/ovirt-4.2_change-queue-tester/2034/
>>>>>>> >
>>>>>>>
>>>>>>> Looks like the same issue:
>>>>>>>
>>>>>>> 2018-05-28 03:41:03,606-04 ERROR [org.ovirt.engine.core.uutils.ssh.SSHDialog] (EE-ManagedThreadFactory-engine-Thread-1) [1244c90f] SSH error running command root@lago-upgrade-from-prevrelease-suite-4-2-host-0:'umask 0077; MYTMP="$(TMPDIR="${OVIRT_TMPDIR}" mktemp -d -t ovirt-XXXXXXXXXX)"; trap "chmod -R u+rwX \"${MYTMP}\" > /dev/null 2>&1; rm -fr \"${MYTMP}\" > /dev/null 2>&1" 0; tar --warning=no-timestamp -C "${MYTMP}" -x && "${MYTMP}"/ovirt-host-deploy DIALOG/dialect=str:machine DIALOG/customization=bool:True': TimeLimitExceededException: SSH session timeout host 'root@lago-upgrade-from-prevrelease-suite-4-2-host-0'
>>>>>>> 2018-05-28 03:41:03,606-04 ERROR [org.ovirt.engine.core.bll.hostdeploy.VdsDeployBase] (VdsDeploy) [1244c90f] Error during deploy dialog
>>>>>>> 2018-05-28 03:41:03,611-04 ERROR [org.ovirt.engine.core.bll.hostdeploy.VdsDeployBase] (EE-ManagedThreadFactory-engine-Thread-1) [1244c90f] Timeout during host lago-upgrade-from-prevrelease-suite-4-2-host-0 install: SSH session timeout host 'root@lago-upgrade-from-prevrelease-suite-4-2-host-0'
>>>>>>>
>>>>>>> >> >
>>>>>>> >> > https://gerrit.ovirt.org/c/91638/2 - core: Enable only strong ciphers for 4.2 hosts
>>>>>>> >> >
>>>>>>> >> > On 28 May 2018 at 10:26, Barak Korren <[email protected]> wrote:
>>>>>>> >> >>
>>>>>>> >> >> On 28 May 2018 at 10:19, Martin Perina <[email protected]> wrote:
>>>>>>> >> >>>
>>>>>>> >> >>> On Mon, May 28, 2018 at 9:00 AM, Piotr Kliczewski <[email protected]> wrote:
>>>>>>> >> >>>>
>>>>>>> >> >>>> Simone,
>>>>>>> >> >>>>
>>>>>>> >> >>>> What do you think about this failure?
>>>>>>> >> >>>>
>>>>>>> >> >>>> Thanks,
>>>>>>> >> >>>> Piotr
>>>>>>> >> >>>>
>>>>>>> >> >>>> On Mon, May 28, 2018 at 7:12 AM, Barak Korren <[email protected]> wrote:
>>>>>>> >> >>>>>
>>>>>>> >> >>>>> On 27 May 2018 at 14:59, Piotr Kliczewski <[email protected]> wrote:
>>>>>>> >> >>>>>>
>>>>>>> >> >>>>>> Martin,
>>>>>>> >> >>>>>>
>>>>>>> >> >>>>>> I can only see:
>>>>>>> >> >>>>>>
>>>>>>> >> >>>>>> 2018-05-25 13:57:44,255-04 ERROR [org.ovirt.engine.core.uutils.ssh.SSHDialog] (EE-ManagedThreadFactory-engine-Thread-1) [55a7b15b] SSH error running command root@lago-upgrade-from-release-suite-master-host-0:'umask 0077; MYTMP="$(TMPDIR="${OVIRT_TMPDIR}" mktemp -d -t ovirt-XXXXXXXXXX)"; trap "chmod -R u+rwX \"${MYTMP}\" > /dev/null 2>&1; rm -fr \"${MYTMP}\" > /dev/null 2>&1" 0; tar --warning=no-timestamp -C "${MYTMP}" -x && "${MYTMP}"/ovirt-host-deploy DIALOG/dialect=str:machine DIALOG/customization=bool:True': TimeLimitExceededException: SSH session timeout host 'root@lago-upgrade-from-release-suite-master-host-0'
>>>>>>> >> >>>>>> 2018-05-25 13:57:44,259-04 ERROR [org.ovirt.engine.core.bll.hostdeploy.VdsDeployBase] (EE-ManagedThreadFactory-engine-Thread-1) [55a7b15b] Timeout during host lago-upgrade-from-release-suite-master-host-0 install: SSH session timeout host 'root@lago-upgrade-from-release-suite-master-host-0'
>>>>>>> >> >>>>>>
>>>>>>> >> >>>>>> There are no additional logs. SSH to the host timed out. Are we sure that it is an issue caused by Ravi's change?
>>>>>>> >> >>>>>
>>>>>>> >> >>>>> We have some quite strong circumstantial evidence:
>>>>>>> >> >>>>> - The issue has affected all engine patches since that patch in a similar fashion.
>>>>>>> >> >>>>> - Prior engine patch [1] passed successfully [2]
>>>>>>> >> >>>>> - Other subsequent OST runs without engine patches passed successfully as well [3].
>>>>>>> >> >>>>>
>>>>>>> >> >>>>> [1]: https://gerrit.ovirt.org/c/91595/2
>>>>>>> >> >>>>> [2]: http://jenkins.ovirt.org/job/ovirt-master_change-queue-tester/7777/
>>>>>>> >> >>>>> [3]: http://jenkins.ovirt.org/job/ovirt-master_change-queue-tester/7778/
>>>>>>> >> >>>>>
>>>>>>> >> >>>>> Please note: the issue affects a test that is run by an upgrade suite on the post-upgrade system. It has no effect on the basic suite. So it probably has to do with some behaviour that is specific to upgraded systems.
>>>>>>> >> >>>
>>>>>>> >> >>> I will try to reproduce later today in a dev env, but I agree with Piotr's investigation: the engine was not able to connect to the host using SSH, and that's why no host-deploy logs were fetched.
>>>>>>> >> >>
>>>>>>> >> >> Lago fetches the logs from the host too (and it can take them from the VM image directly if the host is not responsive over SSH), so can we get at the host-deploy logs that way?
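
For anyone triaging similar runs: the VdsDeployBase timeout entries quoted in this thread can be picked out of an engine log mechanically. Below is a minimal Python sketch, assuming each log entry sits on a single line in the engine log (the line breaks in the quotes come from mail wrapping). The names `find_ssh_timeouts`, `SshTimeout` and `TIMEOUT_RE` are illustrative only and are not part of oVirt or OST; the sample lines are copied from the 4.2 job quoted above.

```python
import re
from typing import Iterable, List, NamedTuple


class SshTimeout(NamedTuple):
    """One host-deploy timeout entry (hypothetical helper type)."""
    timestamp: str
    correlation_id: str
    host: str


# Matches the "Timeout during host <name> install" entries logged by
# VdsDeployBase, in the layout shown in this thread. Other layouts are
# not handled; that is an assumption of this sketch.
TIMEOUT_RE = re.compile(
    r"^(?P<ts>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2},\d{3}[+-]\d{2})\s+ERROR\s+"
    r"\[org\.ovirt\.engine\.core\.bll\.hostdeploy\.VdsDeployBase\]\s+"
    r"\([^)]*\)\s+\[(?P<corr>[0-9a-f]+)\]\s+"
    r"Timeout during host (?P<host>\S+) install"
)


def find_ssh_timeouts(lines: Iterable[str]) -> List[SshTimeout]:
    """Return one record per host-deploy timeout line found in `lines`."""
    hits = []
    for line in lines:
        match = TIMEOUT_RE.match(line)
        if match:
            hits.append(SshTimeout(match["ts"], match["corr"], match["host"]))
    return hits


# Two of the entries quoted above, each rejoined onto one line.
SAMPLE = [
    "2018-05-28 03:41:03,606-04 ERROR "
    "[org.ovirt.engine.core.bll.hostdeploy.VdsDeployBase] (VdsDeploy) "
    "[1244c90f] Error during deploy dialog",
    "2018-05-28 03:41:03,611-04 ERROR "
    "[org.ovirt.engine.core.bll.hostdeploy.VdsDeployBase] "
    "(EE-ManagedThreadFactory-engine-Thread-1) [1244c90f] Timeout during "
    "host lago-upgrade-from-prevrelease-suite-4-2-host-0 install: SSH "
    "session timeout host "
    "'root@lago-upgrade-from-prevrelease-suite-4-2-host-0'",
]

if __name__ == "__main__":
    for hit in find_ssh_timeouts(SAMPLE):
        print(hit.timestamp, hit.correlation_id, hit.host)
```

The correlation id (1244c90f here) is the useful part: it lets you pull every other entry of the same deploy attempt out of the log, which is about all we have while the host-deploy log itself is missing.
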
>>>>>>> >> >>>>>>
>>>>>>> >> >>>>>> Thanks,
>>>>>>> >> >>>>>> Piotr
>>>>>>> >> >>>>>>
>>>>>>> >> >>>>>> On Sun, May 27, 2018 at 11:21 AM, Martin Perina <[email protected]> wrote:
>>>>>>> >> >>>>>>>
>>>>>>> >> >>>>>>> Adding also Piotr to the thread
>>>>>>> >> >>>>>>>
>>>>>>> >> >>>>>>> On Sun, 27 May 2018, 08:46 Barak Korren, <[email protected]> wrote:
>>>>>>> >> >>>>>>>>
>>>>>>> >> >>>>>>>> Test failed: [ AddHost (in upgrade-from-release-suite) ]
>>>>>>> >> >>>>>>>>
>>>>>>> >> >>>>>>>> Link to suspected patches:
>>>>>>> >> >>>>>>>> https://gerrit.ovirt.org/#/c/91445/5 - Disable TLS versions < 1.2 for hosts with cluster level>=4.1
>>>>>>> >> >>>>>>>>
>>>>>>> >> >>>>>>>> Link to Job:
>>>>>>> >> >>>>>>>> http://jenkins.ovirt.org/job/ovirt-master_change-queue-tester/7776/
>>>>>>> >> >>>>>>>>
>>>>>>> >> >>>>>>>> Link to all logs:
>>>>>>> >> >>>>>>>> http://jenkins.ovirt.org/job/ovirt-master_change-queue-tester/7776/artifact/exported-artifacts/upgrade-from-release-suit-master-el7/test_logs/upgrade-from-release-suite-master/post-002_bootstrap.py/
>>>>>>> >> >>>>>>>>
>>>>>>> >> >>>>>>>> Error snippet from log:
>>>>>>> >> >>>>>>>>
>>>>>>> >> >>>>>>>> From nosetests log:
>>>>>>> >> >>>>>>>> <error>
>>>>>>> >> >>>>>>>>
>>>>>>> >> >>>>>>>> AssertionError: False != True after 1200 seconds
>>>>>>> >> >>>>>>>>
>>>>>>> >> >>>>>>>> </error>
>>>>>>> >> >>>>>>>>
>>>>>>> >> >>>>>>>> Not finding a host deploy log in /var/log/ovirt-engine for some reason.
>>>>>>> >> >>>>>>>> This seems to have caused consistent failures in all other engine patches that followed it.
>>>>>>> >> >>>>>>>>
>>>>>>> >> >>>>>>>> --
>>>>>>> >> >>>>>>>> Barak Korren
>>>>>>> >> >>>>>>>> RHV DevOps team , RHCE, RHCi
>>>>>>> >> >>>>>>>> Red Hat EMEA
>>>>>>> >> >>>>>>>> redhat.com | TRIED. TESTED. TRUSTED. | redhat.com/trusted
>>>>>>> >> >>>
>>>>>>> >> >>> --
>>>>>>> >> >>> Martin Perina
>>>>>>> >> >>> Associate Manager, Software Engineering
>>>>>>> >> >>> Red Hat Czech s.r.o.
>>>>>>> >> >
>>>>>>> >> > _______________________________________________
>>>>>>> >> > Devel mailing list -- [email protected]
>>>>>>> >> > To unsubscribe send an email to [email protected]
>>>>>>> >> > Privacy Statement: https://www.ovirt.org/site/privacy-policy/
>>>>>>> >> > oVirt Code of Conduct: https://www.ovirt.org/community/about/community-guidelines/
>>>>>>> >> > List Archives: https://lists.ovirt.org/archives/list/[email protected]/message/QIZ5L4FKII7X5FHQ4OXBBR2SLUIK5C74/
>>>>>>> _______________________________________________
>>>>>>> Devel mailing list -- [email protected]
>>>>>>> To unsubscribe send an email to [email protected]
>>>>>>> Privacy Statement: https://www.ovirt.org/site/privacy-policy/
>>>>>>> oVirt Code of Conduct: https://www.ovirt.org/community/about/community-guidelines/
>>>>>>> List Archives: https://lists.ovirt.org/archives/list/[email protected]/message/RDK42TYJKMX3M2DNUFKZO7CGNNOYWMJI/
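
On the `cp` failure Martin quoted at the top: "cannot stat '.../*.repo'" is what cp prints when the shell finds nothing matching the glob and passes the pattern through literally, so most likely the upgrade-from-prevrelease-suite-master directory has no .repo files or does not exist in that checkout. A minimal Python sketch of a collection step that tolerates this is below. It is only an illustration under that assumption: `collect_repo_files` is a hypothetical helper, not something the manual job or OST provides, and whether the post-build task should skip silently is a separate decision.

```python
import os
import shutil
import tempfile
from pathlib import Path
from typing import List


def collect_repo_files(suite_dir: str, dest_dir: str) -> List[Path]:
    """Copy every *.repo file from suite_dir into dest_dir.

    Returns the copied paths. An empty list means nothing matched,
    including the case where suite_dir does not exist, so the caller
    can decide whether that is an error instead of failing in cp.
    """
    src = Path(suite_dir)
    if not src.is_dir():
        return []
    matches = sorted(src.glob("*.repo"))
    if not matches:
        return []
    dest = Path(dest_dir)
    dest.mkdir(parents=True, exist_ok=True)
    # shutil.copy returns the path of the newly created file.
    return [Path(shutil.copy(str(path), str(dest))) for path in matches]


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        # A suite directory that does not exist: nothing is copied.
        copied = collect_repo_files(
            os.path.join(tmp, "missing-suite"),
            os.path.join(tmp, "exported-artifacts"),
        )
        print(copied)  # prints []
```

Returning the list rather than raising keeps the "no repo files" case visible to the caller without turning it into a post-build failure that hides the real test result.
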
_______________________________________________
Devel mailing list -- [email protected]
To unsubscribe send an email to [email protected]
Privacy Statement: https://www.ovirt.org/site/privacy-policy/
oVirt Code of Conduct: https://www.ovirt.org/community/about/community-guidelines/
List Archives: https://lists.ovirt.org/archives/list/[email protected]/message/HZA4JYWLBO3QJBNPE3VDBSWGKJPXQSAV/
