host-deploy is still broken on master fc28

On Mon, Sep 17, 2018 at 8:01 AM, Yuval Turgeman <[email protected]> wrote:

> I'm pretty sure I verified this on el7 as well; I'll check again. But
> thinking about it, tar stops when it gets to the first empty block, so
> if the record size on the engine's side is large and the end is filled with
> zeros, -b1 makes it stop at the first empty block, and the next read on
> the host's side gets the trailing zeros, which is what otopi reads.
> Btw, this could be a problem on deployed el7 systems as well, if for any
> reason the default on the host is set to something that is more than 20
> blocks (it can be set with export TAR_BLOCKING_FACTOR for the root account
> on the host side).
> It's OK to revert the patch to fix the regression, but I don't see any
> way other than -b1... perhaps add a `cat -` afterwards to just read until
> EOF or something, or have otopi strip the input.
>
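
To illustrate the padding described above, here is a minimal sketch. It uses
Python's tarfile module as a stand-in for whatever builds the archive on the
engine side (an assumption; the engine code is not shown in this thread), and
the helper names are made up for the example. It builds a tiny archive, finds
the end-of-archive marker (two consecutive all-zero 512-byte blocks), and
counts the zero bytes that follow it up to the 20-block record boundary. A
reader that stops at the marker would leave those bytes unread on the stream.

```python
import io
import tarfile

BLOCK = 512          # tar block size in bytes
RECORD = 20 * BLOCK  # default record size (blocking factor 20)


def build_archive(payload):
    """Build an in-memory tar archive holding one small file (illustrative only)."""
    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode="w", format=tarfile.GNU_FORMAT) as tar:
        info = tarfile.TarInfo(name="ovirt-host-deploy")
        info.size = len(payload)
        tar.addfile(info, io.BytesIO(payload))
    return buf.getvalue()


def unread_after_end_marker(archive):
    """Count the bytes that follow the two-zero-block end-of-archive marker."""
    zero = bytes(BLOCK)
    for off in range(0, len(archive) - BLOCK, BLOCK):
        first = archive[off:off + BLOCK]
        second = archive[off + BLOCK:off + 2 * BLOCK]
        if first == zero and second == zero:
            return len(archive) - (off + 2 * BLOCK)
    raise ValueError("no end-of-archive marker found")


if __name__ == "__main__":
    archive = build_archive(b"#!/bin/sh\n")
    leftover = unread_after_end_marker(archive)
    print("archive size:", len(archive))
    print("zero bytes after the end marker:", leftover)
```

The exact count seen in practice depends on the archive size and on how the
reader consumes records, so this is not expected to reproduce the 7169 bytes
reported further down in the thread.
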
> On Mon, Sep 17, 2018 at 2:30 PM, Galit Rosenthal <[email protected]>
> wrote:
>
>> Didi,
>>
>> Is this what you are looking for?
>> https://ovirt-jira.atlassian.net/browse/OVIRT-2259
>> Galit
>>
>> On Mon, Sep 17, 2018 at 1:54 PM Dafna Ron <[email protected]> wrote:
>>
>>> I think that in ovirt-engine we currently build only for CentOS.
>>> Since we have not had an engine build for 2 weeks (on master), I think we
>>> should merge and worry about fc28 once it becomes relevant.
>>>
>>> The failure we have now could be another regression that was missed
>>> because the project has been broken for two weeks.
>>>
>>> Thanks,
>>> Dafna
>>>
>>>
>>>
>>> On Mon, Sep 17, 2018 at 10:30 AM Yedidyah Bar David <[email protected]>
>>> wrote:
>>>
>>>> On Mon, Sep 17, 2018 at 11:49 AM Dafna Ron <[email protected]> wrote:
>>>> >
>>>> > Didi, Marin, any update on the patch?
>>>>
>>>> Yes - it passed. Actually failed, but only after host-deploy:
>>>>
>>>> https://jenkins.ovirt.org/view/oVirt%20system%20tests/job/ovirt-system-tests_manual/3189/
>>>>
>>>> I'd rather not merge it as-is, because it will break fedora.
>>>>
>>>> If someone can have a look at the code generating the tar file, and can
>>>> see if it's easy to make it work well for both centos and fedora,
>>>> perhaps by explicitly setting all relevant params to some reasonable
>>>> values, great. Otherwise, I guess we can merge for now, as fedora is
>>>> still not supported anyway.
>>>>
>>>> Thanks,
>>>>
>>>> >
>>>> >
>>>> > On Sun, Sep 16, 2018 at 11:09 AM Yedidyah Bar David <[email protected]> wrote:
>>>> >>
>>>> >> On Sun, Sep 16, 2018 at 12:53 PM Yedidyah Bar David <[email protected]> wrote:
>>>> >> >
>>>> >> > On Fri, Sep 14, 2018 at 6:06 PM Martin Perina <[email protected]> wrote:
>>>> >> > >
>>>> >> > >
>>>> >> > >
>>>> >> > > On Fri, Sep 14, 2018 at 4:51 PM, Ravi Shankar Nori <[email protected]> wrote:
>>>> >> > >>
>>>> >> > >> I see the same errors on my dev env. From the logs attached by Andrej the response received by otopi has a bunch of null chars before the actual response CONFIRM DEPLOY_PROCEED=yes
>>>> >> > >>
>>>> >> > >>
>>>> >> > >>
>>>> >> > >> 2018-09-14 15:49:23,018+0200 DEBUG otopi.plugins.otopi.dialog.machine dialog.__logString:204 DIALOG:SEND      ### Response is CONFIRM DEPLOY_PROCEED=yes|no or ABORT DEPLOY_PROCEED
>>>> >> > >>
>>>> >> > >> ^@^@^@^@^@^@^@^@^@CONFIRM DEPLOY_PROCEED=yes
>>>> >> > >
>>>> >> > >
>>>> >> > > Didi/Sandro, could you please take a look? The error below looks like an issue in otopi, where an error is raised when handling binary input:
>>>> >> >
>>>> >> > Not sure the issue is "binary input" in general, but simply illegal
>>>> >> > input. The prompt expects, as it says, one of these 3 replies:
>>>> >> >
>>>> >> > CONFIRM DEPLOY_PROCEED=yes
>>>> >> > CONFIRM DEPLOY_PROCEED=no
>>>> >> > ABORT DEPLOY_PROCEED
>>>> >> >
>>>> >> > Instead, judging from the file supplied by Andrej, it gets from the engine:
>>>> >> > <7169 null bytes>CONFIRM DEPLOY_PROCEED=yes
>>>> >> >
>>>> >> > So either the engine now sends, for some reason, 7169 null bytes, in
>>>> >> > this response, or there is some low-level change causing this to be
>>>> >> > eventually supplied to otopi - a change in apache-sshd, openssh, some
>>>> >> > library, the kernel, no idea.
>>>> >> >
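
For the three legal replies listed above, the "have otopi strip the input"
option raised later in the thread could look roughly like this. It is only a
sketch: parse_confirm_reply is a hypothetical helper, not otopi's actual API,
and it assumes that leading NUL bytes are the only garbage worth tolerating.

```python
def parse_confirm_reply(raw, name="DEPLOY_PROCEED"):
    """Map a machine-dialog confirm reply to 'yes', 'no' or 'abort'.

    Hypothetical helper (not part of otopi): drops leading NUL bytes and
    surrounding whitespace, then accepts only the three documented replies.
    """
    cleaned = raw.lstrip("\x00").strip()
    replies = {
        "CONFIRM %s=yes" % name: "yes",
        "CONFIRM %s=no" % name: "no",
        "ABORT %s" % name: "abort",
    }
    if cleaned not in replies:
        raise ValueError("illegal reply: %r" % cleaned)
    return replies[cleaned]


if __name__ == "__main__":
    # The reply seen in this thread: NUL padding followed by the real answer.
    print(parse_confirm_reply("\x00" * 7169 + "CONFIRM DEPLOY_PROCEED=yes\n"))
```

Stripping on the reading side hides the symptom rather than removing the
stray bytes from the stream, so it would complement a fix on the tar side
rather than replace it.
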
>>>> >> > Well, thinking a bit, I have a wild guess: Perhaps it's related to the
>>>> >> > patch introduced recently to change the tar blocking?
>>>> >>
>>>> >> https://gerrit.ovirt.org/94357
>>>> >>
>>>> >> I am leaving soon, perhaps someone can try the manual job with the
>>>> >> result of the check-patch job for above patch, to see if it fixes.
>>>> >> Otherwise I'll do this tomorrow.
>>>> >>
>>>> >> >
>>>> >> > >
>>>> >> > >
>>>> >> > > 2018-09-14 15:49:23,032+0200 DEBUG otopi.context context._executeMethod:143 method exception
>>>> >> > > Traceback (most recent call last):
>>>> >> > >   File "/usr/lib/python2.7/site-packages/otopi/context.py", line 133, in _executeMethod
>>>> >> > >     method['method']()
>>>> >> > >   File "/tmp/ovirt-O6CfS4aUHI/otopi-plugins/ovirt-host-deploy/core/misc.py", line 87, in _confirm
>>>> >> > >     prompt=True,
>>>> >> > >   File "/tmp/ovirt-O6CfS4aUHI/otopi-plugins/otopi/dialog/machine.py", line 478, in confirm
>>>> >> > >     code=opcode,
>>>> >> > >
>>>> >> > >
>>>> >> > >>
>>>> >> > >> On Fri, Sep 14, 2018 at 10:44 AM, Dafna Ron <[email protected]> wrote:
>>>> >> > >>>
>>>> >> > >>> If you run it with mock, you remove any environmental conditions that can affect the outcome, so I recommend using mock.
>>>> >> > >>>
>>>> >> > >>>
>>>> >> > >>> On Fri, Sep 14, 2018 at 3:32 PM, Martin Perina <[email protected]> wrote:
>>>> >> > >>>>
>>>> >> > >>>>
>>>> >> > >>>>
>>>> >> > >>>> On Fri, Sep 14, 2018 at 3:49 PM, Dafna Ron <[email protected]> wrote:
>>>> >> > >>>>>
>>>> >> > >>>>> Did you use mock to reproduce?
>>>> >> > >>>>
>>>> >> > >>>>
>>>> >> > >>>> No, I just ran run_suite as my own user.
>>>> >> > >>>>>
>>>> >> > >>>>>
>>>> >> > >>>>> On Fri, Sep 14, 2018 at 2:39 PM, Martin Perina <[email protected]> wrote:
>>>> >> > >>>>>>
>>>> >> > >>>>>> Hi,
>>>> >> > >>>>>>
>>>> >> > >>>>>> The problem is that we haven't fetched the temporary host-deploy log from the /tmp directory, so we don't know which string sent by the host-deploy process to the engine is causing the issue. I tried to reproduce it on my local machine but was unable to: the 002_bootstrap phase finished successfully (other phases are still running).
>>>> >> > >>>>>>
>>>> >> > >>>>>> So if anyone is able to reproduce it, please fetch the host-deploy log from the /tmp directory after the error is raised and share it.
>>>> >> > >>>>>>
>>>> >> > >>>>>> Thanks
>>>> >> > >>>>>>
>>>> >> > >>>>>> Martin
>>>> >> > >>>>>>
>>>> >> > >>>>>>
>>>> >> > >>>>>> On Fri, Sep 14, 2018 at 1:52 PM, Dafna Ron <[email protected]> wrote:
>>>> >> > >>>>>>>
>>>> >> > >>>>>>> Full logs can be found here:
>>>> >> > >>>>>>>
>>>> >> > >>>>>>> https://jenkins.ovirt.org/view/Change%20queue%20jobs/job/ovirt-master_change-queue-tester/10307/artifact/upgrade-from-release-suite.el7.x86_64/test_logs/upgrade-from-release-suite-master/post-002_bootstrap.py/
>>>> >> > >>>>>>>
>>>> >> > >>>>>>> On Fri, Sep 14, 2018 at 12:48 PM, Dafna Ron <[email protected]> wrote:
>>>> >> > >>>>>>>>
>>>> >> > >>>>>>>> Hi,
>>>> >> > >>>>>>>>
>>>> >> > >>>>>>>> The previous regression was resolved and we now have a new regression.
>>>> >> > >>>>>>>>
>>>> >> > >>>>>>>> I don't think that the reported change is related, so can someone from ovirt-engine take a look?
>>>> >> > >>>>>>>>
>>>> >> > >>>>>>>> The failure is in add host in the upgrade suite.
>>>> >> > >>>>>>>>
>>>> >> > >>>>>>>> Please note that we have not had an ovirt-engine build for over 10 days due to several consecutive regressions, and I would ask you to stop merging until we can stabilize the project and have a new engine package.
>>>> >> > >>>>>>>>
>>>> >> > >>>>>>>> error:
>>>> >> > >>>>>>>>
>>>> >> > >>>>>>>> 2018-09-14 05:51:07,670-04 INFO [org.ovirt.engine.core.uutils.ssh.SSHDialog] (EE-ManagedThreadFactory-engine-Thread-1) [5c91fcbd] SSH execute 'root@lago-upgrade-from-release-suite-master-host-0' 'umask 0077; MYTMP="$(TMPDIR="${OVIRT_TMPDIR}" mktemp -d -t ovirt-XXXXXXXXXX)"; trap "chmod -R u+rwX \"${MYTMP}\" > /dev/null 2>&1; rm -fr \"${MYTMP}\" > /dev/null 2>&1" 0; tar -b1 --warning=no-timestamp -C "${MYTMP}" -x && "${MYTMP}"/ovirt-host-deploy DIALOG/dialect=str:machine DIALOG/customization=bool:True'
>>>> >> > >>>>>>>> 2018-09-14 05:51:08,550-04 INFO [org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector] (VdsDeploy) [5c91fcbd] EVENT_ID: VDS_INSTALL_IN_PROGRESS(509), Installing Host lago-upgrade-from-release-suite-master-host-0. Stage: Initializing.
>>>> >> > >>>>>>>> 2018-09-14 05:51:08,565-04 INFO [org.ovirt.engine.core.utils.transaction.TransactionSupport] (VdsDeploy) [5c91fcbd] transaction rolled back
>>>> >> > >>>>>>>> 2018-09-14 05:51:08,574-04 ERROR [org.ovirt.engine.core.bll.hostdeploy.VdsDeployBase] (VdsDeploy) [5c91fcbd] Error during deploy dialog
>>>> >> > >>>>>>>> 2018-09-14 05:51:08,578-04 ERROR [org.ovirt.engine.core.bll.hostdeploy.VdsDeployBase] (EE-ManagedThreadFactory-engine-Thread-1) [5c91fcbd] Error during host lago-upgrade-from-release-suite-master-host-0 install
>>>> >> > >>>>>>>> 2018-09-14 05:51:08,586-04 ERROR [org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector] (EE-ManagedThreadFactory-engine-Thread-1) [5c91fcbd] EVENT_ID: VDS_INSTALL_IN_PROGRESS_ERROR(511), An error has occurred during installation of Host lago-upgrade-from-release-suite-master-host-0: CallableStatementCallback; SQL [{call insertauditlog(?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?)}ERROR: invalid byte sequence for encoding "UTF8": 0x00; nested exception is org.postgresql.util.PSQLException: ERROR: invalid byte sequence for encoding "UTF8": 0x00.
>>>> >> > >>>>>>>> 2018-09-14 05:51:08,586-04 ERROR [org.ovirt.engine.core.bll.hostdeploy.VdsDeployBase] (EE-ManagedThreadFactory-engine-Thread-1) [5c91fcbd] Error during host lago-upgrade-from-release-suite-master-host-0 install, preferring first exception: CallableStatementCallback; SQL [{call insertauditlog(?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?)}ERROR: invalid byte sequence for encoding "UTF8": 0x00; nested exception is org.postgresql.util.PSQLException: ERROR: invalid byte sequence for encoding "UTF8": 0x00
>>>> >> > >>>>>>>> 2018-09-14 05:51:08,586-04 ERROR [org.ovirt.engine.core.bll.hostdeploy.InstallVdsInternalCommand] (EE-ManagedThreadFactory-engine-Thread-1) [5c91fcbd] Host installation failed for host 'e475e93a-63b3-4573-b242-162c2ed864f0', 'lago-upgrade-from-release-suite-master-host-0': CallableStatementCallback; SQL [{call insertauditlog(?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?)}ERROR: invalid byte sequence for encoding "UTF8": 0x00; nested exception is org.postgresql.util.PSQLException: ERROR: invalid byte sequence for encoding "UTF8": 0x00
>>>> >> > >>>>>>>> 2018-09-14 05:51:08,615-04 INFO [org.ovirt.engine.core.vdsbroker.SetVdsStatusVDSCommand] (EE-ManagedThreadFactory-engine-Thread-1) [5c91fcbd] START, SetVdsStatusVDSCommand(HostName = lago-upgrade-from-release-suite-master-host-0, SetVdsStatusVDSCommandParameters:{hostId='e475e93a-63b3-4573-b242-162c2ed864f0', status='InstallFailed', nonOperationalReason='NONE', stopSpmFailureLogged='false', maintenanceReason='null'}), log id: 146cdc08
>>>> >> > >>>>>>>> 2018-09-14 05:51:08,626-04 INFO [org.ovirt.engine.core.vdsbroker.SetVdsStatusVDSCommand] (EE-ManagedThreadFactory-engine-Thread-1) [5c91fcbd] FINISH, SetVdsStatusVDSCommand, return: , log id: 146cdc08
>>>> >> > >>>>>>>> 2018-09-14 05:51:08,639-04 ERROR [org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector] (EE-ManagedThreadFactory-engine-Thread-1) [5c91fcbd] EVENT_ID: VDS_INSTALL_FAILED(505), Host lago-upgrade-from-release-suite-master-host-0 installation failed. CallableStatementCallback; SQL [{call insertauditlog(?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?)}ERROR: invalid byte sequence for encoding "UTF8": 0x00; nested exception is org.postgresql.util.PSQLException: ERROR: invalid byte sequence for encoding "UTF8": 0x00.
>>>> >> > >>>>>>>> 2018-09-14 05:51:08,652-04 INFO [org.ovirt.engine.core.bll.hostdeploy.InstallVdsInternalCommand] (EE-ManagedThreadFactory-engine-Thread-1) [5c91fcbd] Lock freed to object 'EngineLock:{exclusiveLocks='[e475e93a-63b3-4573-b242-162c2ed864f0=VDS]', sharedLocks=''}'
>>>> >> > >>>>>>>> 2018-09-14 05:51:37,996-04 INFO [org.ovirt.engine.core.bll.quota.QuotaManager] (EE-ManagedThreadFactory-engineScheduled-Thread-44) [] Quota Cache updated. (19 msec)
>>>> >> > >>>>>>>> (END)
>>>> >> > >>>>>>>>
>>>> >> > >>>>>>>> Thanks,
>>>> >> > >>>>>>>> Dafna
>>>> >> > >>>>>>>>
>>>> >> > >>>>>>>
>>>> >> > >>>>>>
>>>> >> > >>>>>>
>>>> >> > >>>>>>
>>>> >> > >>>>>> --
>>>> >> > >>>>>> Martin Perina
>>>> >> > >>>>>> Associate Manager, Software Engineering
>>>> >> > >>>>>> Red Hat Czech s.r.o.
>>>> >> > >>>>>
>>>> >> > >>>>>
>>>> >> > >>>>
>>>> >> > >>>>
>>>> >> > >>>>
>>>> >> > >>>> --
>>>> >> > >>>> Martin Perina
>>>> >> > >>>> Associate Manager, Software Engineering
>>>> >> > >>>> Red Hat Czech s.r.o.
>>>> >> > >>>
>>>> >> > >>>
>>>> >> > >>
>>>> >> > >
>>>> >> > >
>>>> >> > >
>>>> >> > > --
>>>> >> > > Martin Perina
>>>> >> > > Associate Manager, Software Engineering
>>>> >> > > Red Hat Czech s.r.o.
>>>> >> >
>>>> >> >
>>>> >> >
>>>> >> > --
>>>> >> > Didi
>>>> >>
>>>> >>
>>>> >>
>>>> >> --
>>>> >> Didi
>>>>
>>>>
>>>>
>>>> --
>>>> Didi
>>>>
>>> _______________________________________________
>>> Infra mailing list -- [email protected]
>>> To unsubscribe send an email to [email protected]
>>> Privacy Statement: https://www.ovirt.org/site/privacy-policy/
>>> oVirt Code of Conduct: https://www.ovirt.org/community/about/community-guidelines/
>>> List Archives: https://lists.ovirt.org/archives/list/[email protected]/message/CG2IYPXSSEFTL6XCN72JHUSWOUY7QRSA/
>>>
>>
>>
>> --
>>
>> GALIT ROSENTHAL
>>
>> SOFTWARE ENGINEER
>>
>> Red Hat
>>
>> <https://www.redhat.com/>
>>
>> [email protected]    T: 972-9-7692230
>> <https://red.ht/sig>
>>
>
>
>
>
