host-deploy is still broken on master fc28

On Mon, Sep 17, 2018 at 8:01 AM, Yuval Turgeman <[email protected]> wrote:
> I'm pretty sure I verified this on el7 as well; I'll check again. But
> thinking about it, tar will stop when it gets to the first empty block, so
> if the record size on the engine's side is large and the end is filled with
> zeros, -b1 will make it stop at the first empty block, so the next read on
> the host's side would get the trailing zeros, which is what otopi reads.
>
> Btw, it could be a problem with deployed el7 systems as well, if for any
> reason the default on the host is set to something that is more than 20
> blocks (can be set with export TAR_BLOCKING_FACTOR for the root account on
> the host side).
>
> It's ok to revert the patch to fix the regression, but I don't see any
> way other than -b1... perhaps add a `cat -` after it to just read until
> EOF or something, or have otopi strip the input.
>
> On Mon, Sep 17, 2018 at 2:30 PM, Galit Rosenthal <[email protected]> wrote:
>
>> Didi,
>>
>> Is this what you are looking for?
>> https://ovirt-jira.atlassian.net/browse/OVIRT-2259
>>
>> Galit
>>
>> On Mon, Sep 17, 2018 at 1:54 PM Dafna Ron <[email protected]> wrote:
>>
>>> I think that in ovirt-engine we currently only build for CentOS.
>>> Since we have not had an engine build for 2 weeks (on master), I think we
>>> should merge and worry about fc28 once it becomes relevant.
>>>
>>> The failure we have now could be another regression missed since the
>>> project has been broken for two weeks.
>>>
>>> Thanks,
>>> Dafna
>>>
>>> On Mon, Sep 17, 2018 at 10:30 AM Yedidyah Bar David <[email protected]> wrote:
>>>
>>>> On Mon, Sep 17, 2018 at 11:49 AM Dafna Ron <[email protected]> wrote:
>>>> >
>>>> > Didi, Martin, any update on the patch?
>>>>
>>>> Yes - it passed. Actually failed, but only after host-deploy:
>>>>
>>>> https://jenkins.ovirt.org/view/oVirt%20system%20tests/job/ovirt-system-tests_manual/3189/
>>>>
>>>> I'd rather not merge it as-is, because it will break fedora.
>>>>
>>>> If someone can have a look at the code generating the tar file, and can
>>>> see if it's easy to make it work well for both centos and fedora,
>>>> perhaps by explicitly setting all relevant params to some reasonable
>>>> values, great. Otherwise, I guess we can merge for now, as fedora is
>>>> still not supported anyway.
>>>>
>>>> Thanks,
>>>>
>>>> >
>>>> > On Sun, Sep 16, 2018 at 11:09 AM Yedidyah Bar David <[email protected]> wrote:
>>>> >>
>>>> >> On Sun, Sep 16, 2018 at 12:53 PM Yedidyah Bar David <[email protected]> wrote:
>>>> >> >
>>>> >> > On Fri, Sep 14, 2018 at 6:06 PM Martin Perina <[email protected]> wrote:
>>>> >> > >
>>>> >> > > On Fri, Sep 14, 2018 at 4:51 PM, Ravi Shankar Nori <[email protected]> wrote:
>>>> >> > >>
>>>> >> > >> I see the same errors on my dev env. From the logs attached by Andrej the response received by otopi has a bunch of null chars before the actual response CONFIRM DEPLOY_PROCEED=yes
>>>> >> > >>
>>>> >> > >> 2018-09-14 15:49:23,018+0200 DEBUG otopi.plugins.otopi.dialog.machine dialog.__logString:204 DIALOG:SEND ### Response is CONFIRM DEPLOY_PROCEED=yes|no or ABORT DEPLOY_PROCEED
>>>> >> > >>
>>>> >> > >> ^@^@^@^@^@^@^@^@^@CONFIRM DEPLOY_PROCEED=yes
>>>> >> > >
>>>> >> > > Didi/Sandro, could you please take a look? Below error seems like some issue in otopi, where an error is raised when handling binary input:
>>>> >> >
>>>> >> > Not sure the issue is "binary input" in general, but simply illegal
>>>> >> > input. The prompt expects, as it says, one of these 3 replies:
>>>> >> >
>>>> >> > CONFIRM DEPLOY_PROCEED=yes
>>>> >> > CONFIRM DEPLOY_PROCEED=no
>>>> >> > ABORT DEPLOY_PROCEED
>>>> >> >
>>>> >> > Instead, judging from the file supplied by Andrej, it gets from the engine:
>>>> >> > <7169 null bytes>CONFIRM DEPLOY_PROCEED=yes
>>>> >> >
>>>> >> > So either the engine now sends, for some reason, 7169 null bytes, in
>>>> >> > this response, or there is some low-level change causing this to be
>>>> >> > eventually supplied to otopi - a change in apache-sshd, openssh, some
>>>> >> > library, the kernel, no idea.
>>>> >> >
>>>> >> > Well, thinking a bit, I have a wild guess: Perhaps it's related to the
>>>> >> > patch introduced recently to change the tar blocking?
>>>> >>
>>>> >> https://gerrit.ovirt.org/94357
>>>> >>
>>>> >> I am leaving soon, perhaps someone can try the manual job with the
>>>> >> result of the check-patch job for above patch, to see if it fixes.
>>>> >> Otherwise I'll do this tomorrow.
>>>> >>
>>>> >> >
>>>> >> > >
>>>> >> > > 2018-09-14 15:49:23,032+0200 DEBUG otopi.context context._executeMethod:143 method exception
>>>> >> > > Traceback (most recent call last):
>>>> >> > >   File "/usr/lib/python2.7/site-packages/otopi/context.py", line 133, in _executeMethod
>>>> >> > >     method['method']()
>>>> >> > >   File "/tmp/ovirt-O6CfS4aUHI/otopi-plugins/ovirt-host-deploy/core/misc.py", line 87, in _confirm
>>>> >> > >     prompt=True,
>>>> >> > >   File "/tmp/ovirt-O6CfS4aUHI/otopi-plugins/otopi/dialog/machine.py", line 478, in confirm
>>>> >> > >     code=opcode,
>>>> >> > >
>>>> >> > >>
>>>> >> > >> On Fri, Sep 14, 2018 at 10:44 AM, Dafna Ron <[email protected]> wrote:
>>>> >> > >>>
>>>> >> > >>> if you run it with mock you would remove any environmental conditions that can affect the outcome, so I recommend using mock
>>>> >> > >>>
>>>> >> > >>> On Fri, Sep 14, 2018 at 3:32 PM, Martin Perina <[email protected]> wrote:
>>>> >> > >>>>
>>>> >> > >>>> On Fri, Sep 14, 2018 at 3:49 PM, Dafna Ron <[email protected]> wrote:
>>>> >> > >>>>>
>>>> >> > >>>>> did you use mock to reproduce?
>>>> >> > >>>>
>>>> >> > >>>> No, just run_suite under myself
>>>> >> > >>>>>
>>>> >> > >>>>> On Fri, Sep 14, 2018 at 2:39 PM, Martin Perina <[email protected]> wrote:
>>>> >> > >>>>>>
>>>> >> > >>>>>> Hi,
>>>> >> > >>>>>>
>>>> >> > >>>>>> the problem is that we haven't fetched the temporary host-deploy log from /tmp directory, so we don't know which string that host-deploy process sent to engine is causing that issue. I tried to reproduce on my local machine, but I was unable to reproduce it, 002_bootstrap phase finished successfully (other phases are still running).
>>>> >> > >>>>>>
>>>> >> > >>>>>> So if anyone is able to reproduce, please try to fetch host-deploy log from /tmp directory after the error is raised and share it.
>>>> >> > >>>>>>
>>>> >> > >>>>>> Thanks
>>>> >> > >>>>>>
>>>> >> > >>>>>> Martin
>>>> >> > >>>>>>
>>>> >> > >>>>>> On Fri, Sep 14, 2018 at 1:52 PM, Dafna Ron <[email protected]> wrote:
>>>> >> > >>>>>>>
>>>> >> > >>>>>>> Full logs can be found here:
>>>> >> > >>>>>>>
>>>> >> > >>>>>>> https://jenkins.ovirt.org/view/Change%20queue%20jobs/job/ovirt-master_change-queue-tester/10307/artifact/upgrade-from-release-suite.el7.x86_64/test_logs/upgrade-from-release-suite-master/post-002_bootstrap.py/
>>>> >> > >>>>>>>
>>>> >> > >>>>>>> On Fri, Sep 14, 2018 at 12:48 PM, Dafna Ron <[email protected]> wrote:
>>>> >> > >>>>>>>>
>>>> >> > >>>>>>>> Hi,
>>>> >> > >>>>>>>>
>>>> >> > >>>>>>>> The previous regression was resolved and we now have a new regression.
>>>> >> > >>>>>>>>
>>>> >> > >>>>>>>> I don't think that the reported change is related so can someone from ovirt-engine take a look?
>>>> >> > >>>>>>>>
>>>> >> > >>>>>>>> The failure is add host on the upgrade suite.
>>>> >> > >>>>>>>>
>>>> >> > >>>>>>>> Please note that we have not had an engine-ovirt build for over 10 days due to several consecutive regressions and I would ask you to stop merging until we can stabilize the project and have a new package of engine.
>>>> >> > >>>>>>>>
>>>> >> > >>>>>>>> error:
>>>> >> > >>>>>>>>
>>>> >> > >>>>>>>> 2018-09-14 05:51:07,670-04 INFO [org.ovirt.engine.core.uutils.ssh.SSHDialog] (EE-ManagedThreadFactory-engine-Thread-1) [5c91fcbd] SSH execute 'root@lago-upgrade-from-release-suite-master-host-0' 'umask 0077; MYTMP="$(TMPDIR="${OVIRT_TMPDIR}" mktemp -d -t ovirt-XXXXXXXXXX)"; trap "chmod -R u+rwX \"${MYTMP}\" > /dev/null 2>&1; rm -fr \"${MYTMP}\" > /dev/null 2>&1" 0; tar -b1 --warning=no-timestamp -C "${MYTMP}" -x && "${MYTMP}"/ovirt-host-deploy DIALOG/dialect=str:machine DIALOG/customization=bool:True'
>>>> >> > >>>>>>>> 2018-09-14 05:51:08,550-04 INFO [org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector] (VdsDeploy) [5c91fcbd] EVENT_ID: VDS_INSTALL_IN_PROGRESS(509), Installing Host lago-upgrade-from-release-suite-master-host-0. Stage: Initializing.
>>>> >> > >>>>>>>> 2018-09-14 05:51:08,565-04 INFO [org.ovirt.engine.core.utils.transaction.TransactionSupport] (VdsDeploy) [5c91fcbd] transaction rolled back
>>>> >> > >>>>>>>> 2018-09-14 05:51:08,574-04 ERROR [org.ovirt.engine.core.bll.hostdeploy.VdsDeployBase] (VdsDeploy) [5c91fcbd] Error during deploy dialog
>>>> >> > >>>>>>>> 2018-09-14 05:51:08,578-04 ERROR [org.ovirt.engine.core.bll.hostdeploy.VdsDeployBase] (EE-ManagedThreadFactory-engine-Thread-1) [5c91fcbd] Error during host lago-upgrade-from-release-suite-master-host-0 install
>>>> >> > >>>>>>>> 2018-09-14 05:51:08,586-04 ERROR [org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector] (EE-ManagedThreadFactory-engine-Thread-1) [5c91fcbd] EVENT_ID: VDS_INSTALL_IN_PROGRESS_ERROR(511), An error has occurred during installation of Host lago-upgrade-from-release-suite-master-host-0: CallableStatementCallback; SQL [{call insertauditlog(?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?)}ERROR: invalid byte sequence for encoding "UTF8": 0x00; nested exception is org.postgresql.util.PSQLException: ERROR: invalid byte sequence for encoding "UTF8": 0x00.
>>>> >> > >>>>>>>> 2018-09-14 05:51:08,586-04 ERROR [org.ovirt.engine.core.bll.hostdeploy.VdsDeployBase] (EE-ManagedThreadFactory-engine-Thread-1) [5c91fcbd] Error during host lago-upgrade-from-release-suite-master-host-0 install, preferring first exception: CallableStatementCallback; SQL [{call insertauditlog(?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?)}ERROR: invalid byte sequence for encoding "UTF8": 0x00; nested exception is org.postgresql.util.PSQLException: ERROR: invalid byte sequence for encoding "UTF8": 0x00
>>>> >> > >>>>>>>> 2018-09-14 05:51:08,586-04 ERROR [org.ovirt.engine.core.bll.hostdeploy.InstallVdsInternalCommand] (EE-ManagedThreadFactory-engine-Thread-1) [5c91fcbd] Host installation failed for host 'e475e93a-63b3-4573-b242-162c2ed864f0', 'lago-upgrade-from-release-suite-master-host-0': CallableStatementCallback; SQL [{call insertauditlog(?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?)}ERROR: invalid byte sequence for encoding "UTF8": 0x00; nested exception is org.postgresql.util.PSQLException: ERROR: invalid byte sequence for encoding "UTF8": 0x00
>>>> >> > >>>>>>>> 2018-09-14 05:51:08,615-04 INFO [org.ovirt.engine.core.vdsbroker.SetVdsStatusVDSCommand] (EE-ManagedThreadFactory-engine-Thread-1) [5c91fcbd] START, SetVdsStatusVDSCommand(HostName = lago-upgrade-from-release-suite-master-host-0, SetVdsStatusVDSCommandParameters:{hostId='e475e93a-63b3-4573-b242-162c2ed864f0', status='InstallFailed', nonOperationalReason='NONE', stopSpmFailureLogged='false', maintenanceReason='null'}), log id: 146cdc08
>>>> >> > >>>>>>>> 2018-09-14 05:51:08,626-04 INFO [org.ovirt.engine.core.vdsbroker.SetVdsStatusVDSCommand] (EE-ManagedThreadFactory-engine-Thread-1) [5c91fcbd] FINISH, SetVdsStatusVDSCommand, return: , log id: 146cdc08
>>>> >> > >>>>>>>> 2018-09-14 05:51:08,639-04 ERROR [org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector] (EE-ManagedThreadFactory-engine-Thread-1) [5c91fcbd] EVENT_ID: VDS_INSTALL_FAILED(505), Host lago-upgrade-from-release-suite-master-host-0 installation failed. CallableStatementCallback; SQL [{call insertauditlog(?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?)}ERROR: invalid byte sequence for encoding "UTF8": 0x00; nested exception is org.postgresql.util.PSQLException: ERROR: invalid byte sequence for encoding "UTF8": 0x00.
>>>> >> > >>>>>>>> 2018-09-14 05:51:08,652-04 INFO [org.ovirt.engine.core.bll.hostdeploy.InstallVdsInternalCommand] (EE-ManagedThreadFactory-engine-Thread-1) [5c91fcbd] Lock freed to object 'EngineLock:{exclusiveLocks='[e475e93a-63b3-4573-b242-162c2ed864f0=VDS]', sharedLocks=''}'
>>>> >> > >>>>>>>> 2018-09-14 05:51:37,996-04 INFO [org.ovirt.engine.core.bll.quota.QuotaManager] (EE-ManagedThreadFactory-engineScheduled-Thread-44) [] Quota Cache updated. (19 msec)
>>>> >> > >>>>>>>> (END)
>>>> >> > >>>>>>>>
>>>> >> > >>>>>>>> Thanks,
>>>> >> > >>>>>>>> Dafna
>>>> >> > >>>>>>>>
>>>> >> > >>>>>>>
>>>> >> > >>>>>>
>>>> >> > >>>>>> --
>>>> >> > >>>>>> Martin Perina
>>>> >> > >>>>>> Associate Manager, Software Engineering
>>>> >> > >>>>>> Red Hat Czech s.r.o.
>>>> >> > >>>>>
>>>> >> > >>>>
>>>> >> > >>>> --
>>>> >> > >>>> Martin Perina
>>>> >> > >>>> Associate Manager, Software Engineering
>>>> >> > >>>> Red Hat Czech s.r.o.
>>>> >> > >>>
>>>> >> > >>
>>>> >> > >
>>>> >> > > --
>>>> >> > > Martin Perina
>>>> >> > > Associate Manager, Software Engineering
>>>> >> > > Red Hat Czech s.r.o.
>>>> >> >
>>>> >> > --
>>>> >> > Didi
>>>> >>
>>>> >> --
>>>> >> Didi
>>>>
>>>> --
>>>> Didi
>>>>
>>> _______________________________________________
>>> Infra mailing list -- [email protected]
>>> To unsubscribe send an email to [email protected]
>>> Privacy Statement: https://www.ovirt.org/site/privacy-policy/
>>> oVirt Code of Conduct: https://www.ovirt.org/community/about/community-guidelines/
>>> List Archives: https://lists.ovirt.org/archives/list/[email protected]/message/CG2IYPXSSEFTL6XCN72JHUSWOUY7QRSA/
>>>
>>
>> --
>>
>> GALIT ROSENTHAL
>>
>> SOFTWARE ENGINEER
>>
>> Red Hat
>>
>> <https://www.redhat.com/>
>>
>> [email protected] T: 972-9-7692230
>> <https://red.ht/sig>
>>
>
> _______________________________________________
> Infra mailing list -- [email protected]
> To unsubscribe send an email to [email protected]
> Privacy Statement: https://www.ovirt.org/site/privacy-policy/
> oVirt Code of Conduct: https://www.ovirt.org/community/about/community-guidelines/
> List Archives: https://lists.ovirt.org/archives/list/[email protected]/message/QMRM2INTCRDPT7GPF24EEPNJAZRP4CUQ/
>
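
To make the padding explanation at the top of this thread concrete, here is a minimal Python sketch. It is not the engine or otopi code, and it uses Python's tarfile module rather than GNU tar, so treat it as an illustration under assumptions only. The helper names (build_archive, consume_archive, strip_padding) and the member name are made up for the example. The reader stops after the two-zero-block end-of-archive marker, which is a simplification of what `tar -b1 -x` does. The sketch shows that when the writer pads the archive to a full record and the reader stops at the end marker, the remaining zero bytes stay in the stream ahead of the next reply, and that stripping leading NUL bytes recovers the reply.

```python
import io
import tarfile

BLOCK = 512              # size of one tar block in bytes
RECORD = 20 * BLOCK      # 20 blocks, the default blocking factor cited in the thread
RESPONSE = b"CONFIRM DEPLOY_PROCEED=yes\n"


def build_archive(payload: bytes) -> bytes:
    """Build an in-memory tar archive holding one small file.

    Python's tarfile pads the finished archive with zero bytes up to a
    full record, much like a tar writer with a large blocking factor.
    """
    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode="w", format=tarfile.USTAR_FORMAT) as tar:
        info = tarfile.TarInfo(name="ovirt-host-deploy")  # illustrative member name
        info.size = len(payload)
        tar.addfile(info, io.BytesIO(payload))
    return buf.getvalue()


def consume_archive(stream) -> int:
    """Read one block at a time and stop at the end-of-archive marker.

    Simplified stand-in for a reader with a one-block record size: it only
    looks for two consecutive all-zero blocks and does not parse headers,
    so it is only valid for payloads that contain no all-zero block.
    """
    consumed = 0
    zero_run = 0
    while zero_run < 2:
        block = stream.read(BLOCK)
        if len(block) < BLOCK:
            break  # stream ended before the marker was seen
        consumed += BLOCK
        zero_run = zero_run + 1 if block == bytes(BLOCK) else 0
    return consumed


def strip_padding(data: bytes) -> bytes:
    """Drop leading NUL bytes (the "have otopi strip the input" idea)."""
    return data.lstrip(b"\x00")


if __name__ == "__main__":
    archive = build_archive(b"#!/bin/sh\n")
    # The dialog reply follows the archive on the same stream.
    stream = io.BytesIO(archive + RESPONSE)
    consumed = consume_archive(stream)
    leftover = stream.read()
    print("archive bytes:", len(archive))
    print("consumed by reader:", consumed)
    print("NUL bytes ahead of the reply:", len(leftover) - len(RESPONSE))
    print("after stripping:", strip_padding(leftover))
```

Stripping on the reading side is only one of the options mentioned above; draining the rest of the archive before starting the dialog (the `cat -` idea) would address the same leftover bytes from the other end.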
