On 3/23/20 3:10 PM, Marcin Sobczyk wrote:


On 3/23/20 2:53 PM, Nir Soffer wrote:
On Mon, Mar 23, 2020 at 3:26 PM Marcin Sobczyk <[email protected]> wrote:


On 3/23/20 2:17 PM, Nir Soffer wrote:
On Mon, Mar 23, 2020 at 1:25 PM Marcin Sobczyk <[email protected]> wrote:

On 3/21/20 1:18 AM, Nir Soffer wrote:

On Fri, Mar 20, 2020 at 9:35 PM Nir Soffer <[email protected]> wrote:
Looks like an infrastructure issue setting up storage on the engine host.

Here are 2 failing builds with unrelated changes:
https://jenkins.ovirt.org/job/ovirt-system-tests_manual/6677/
https://jenkins.ovirt.org/job/ovirt-system-tests_manual/6678/
Rebuilding still fails in setup_storage:

https://jenkins.ovirt.org/job/ovirt-system-tests_manual/6679/testReport/
https://jenkins.ovirt.org/job/ovirt-system-tests_manual/6680/testReport/

Is this a known issue?

Error Message

AssertionError: setup_storage.sh failed. Exit code is 1
assert 1 == 0
  -1
  +0

Stacktrace

prefix = <ovirtlago.prefix.OvirtPrefix object at 0x7f6fd2b998d0>

      @pytest.mark.run(order=14)
      def test_configure_storage(prefix):
          engine = prefix.virt_env.engine_vm()
          result = engine.ssh(
              [
                  '/tmp/setup_storage.sh',
              ],
          )
>         assert result.code == 0, 'setup_storage.sh failed. Exit code is %s' % result.code
E       AssertionError: setup_storage.sh failed. Exit code is 1
E       assert 1 == 0
E         -1
E         +0


The pytest traceback is nice, but in this case it does not show any useful info.

Since we run a script using ssh, the error message should include the process stdout and stderr,
which can probably explain the failure.
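To illustrate the suggestion, here is a minimal sketch of a failure check that carries the script's output in the error message. It is not the actual OST code: `CommandResult` and `check_script_result` are hypothetical names, and the `out` and `err` attributes are assumed to exist on the ssh result next to `code`.

```python
from collections import namedtuple

# Hypothetical stand-in for the object returned by engine.ssh(). The
# attribute names out and err are assumptions for this sketch.
CommandResult = namedtuple('CommandResult', ['code', 'out', 'err'])


def check_script_result(script, result):
    """Raise RuntimeError carrying stdout and stderr if the script failed."""
    if result.code != 0:
        raise RuntimeError(
            '{} failed with exit code: {}.\n'
            'stdout:\n{}\n'
            'stderr:\n{}'.format(script, result.code, result.out, result.err)
        )


# Example: a failed run now reports why it failed, not only the exit code.
failed = CommandResult(code=1, out='', err='Error: repo unavailable')
try:
    check_script_result('/tmp/setup_storage.sh', failed)
except RuntimeError as exc:
    print(exc)
```

With a check like this, the test report would show the remote script's output directly instead of only `assert 1 == 0`.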
I posted https://gerrit.ovirt.org/#/c/107830/ to improve logging during storage setup. Unfortunately AFAICS it didn't fail, so I guess we'll have to merge it and wait for a failed job to get some helpful logs.
Thanks.

It still fails for me with current code:
https://jenkins.ovirt.org/job/ovirt-system-tests_manual/6689/testReport/

Same when using current vdsm master.
Updated the patch according to your suggestions and currently trying out
OST for the 4th time -
all previous runs succeeded. I guess I'm out of luck :)
It succeeds on your local OST setup but fails on Jenkins?
No, I mean jenkins - neither of the check-patch runs failed on this script.
I also tried running OST manually twice and the same thing happened.
Anyway - the patch has been merged now so if any failure occurs in CQ
we should know what's going on.
Ok, finally caught a failure in CQ [1]:

[2020-03-23T14:14:09.836Z]         if result.code != 0:
[2020-03-23T14:14:09.836Z]             msg = (
[2020-03-23T14:14:09.836Z]                 'setup_storage.sh failed with exit code: {}.\n'
[2020-03-23T14:14:09.836Z]                 'stdout:\n{}'
[2020-03-23T14:14:09.836Z]                 'stderr:\n{}'
[2020-03-23T14:14:09.836Z]             ).format(result.code, result.out, result.err)
[2020-03-23T14:14:09.836Z] >           raise RuntimeError(msg)
[2020-03-23T14:14:09.836Z] E           RuntimeError: setup_storage.sh failed with exit code: 1.
[2020-03-23T14:14:09.836Z] E           stdout:
[2020-03-23T14:14:09.836Z] E           Reposync & Extra Sources Content                0.0  B/s |   0  B     00:00
[2020-03-23T14:14:09.836Z] E           stderr:
[2020-03-23T14:14:09.836Z] E           + set -xe
[2020-03-23T14:14:09.836Z] E           + MAIN_NFS_DEV=disk/by-id/scsi-0QEMU_QEMU_HARDDISK_2
[2020-03-23T14:14:09.836Z] E           + ISCSI_DEV=disk/by-id/scsi-0QEMU_QEMU_HARDDISK_3
[2020-03-23T14:14:09.836Z] E           + NUM_LUNS=5
[2020-03-23T14:14:09.836Z] E           ++ uname -r
[2020-03-23T14:14:09.836Z] E           ++ awk -F. '{print $(NF-1)}'
[2020-03-23T14:14:09.836Z] E           + DIST=el8_1
[2020-03-23T14:14:09.836Z] E           + main
[2020-03-23T14:14:09.836Z] E           ++ hostname
[2020-03-23T14:14:09.836Z] E           + [[ lago-basic-suite-master-engine == *\i\p\v\6* ]]
[2020-03-23T14:14:09.836Z] E           + install_deps
[2020-03-23T14:14:09.836Z] E           + systemctl disable --now kdump.service
[2020-03-23T14:14:09.836Z] E           Removed /etc/systemd/system/multi-user.target.wants/kdump.service.
[2020-03-23T14:14:09.836Z] E           + yum install --nogpgcheck -y nfs-utils rpcbind lvm2 targetcli sg3_utils iscsi-initiator-utils lsscsi policycoreutils-python-utils
[2020-03-23T14:14:09.836Z] E           Failed to download metadata for repo 'alocalsync'
[2020-03-23T14:14:09.836Z] E           Error: Failed to download metadata for repo 'alocalsync'


[1] https://jenkins.ovirt.org/blue/organizations/jenkins/ovirt-master_change-queue-tester/detail/ovirt-master_change-queue-tester/21420/pipeline



Also I wonder why this code is called as a test (test_configure_storage). This looks like a setup
step, so it should run as a fixture.
That's true, but the pytest porting effort was about providing a bare minimum to move away from nose. Organizing the tests into proper setup/fixtures is a huge task and will probably be implemented
incrementally in the near future.
Understood
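
For illustration only, the setup step discussed above could be expressed as a fixture roughly like this. This is a sketch under assumptions, not a proposal for the actual OST layout: `configure_storage`, `run_script` and the `storage` fixture are hypothetical names, `prefix` is assumed to be an existing session-scoped fixture, and the ssh result is assumed to expose `code`, `out` and `err`.

```python
import pytest


def configure_storage(run_script):
    """Run the storage setup script and fail loudly on a non-zero exit code.

    run_script is a hypothetical callable that takes a script path and
    returns a (code, out, err) tuple.
    """
    code, out, err = run_script('/tmp/setup_storage.sh')
    if code != 0:
        raise RuntimeError(
            'setup_storage.sh failed with exit code: {}.\n'
            'stdout:\n{}\n'
            'stderr:\n{}'.format(code, out, err)
        )


@pytest.fixture(scope='session')
def storage(prefix):
    # prefix is assumed to be an existing session-scoped fixture that
    # provides the lago environment, as in the traceback above.
    engine = prefix.virt_env.engine_vm()

    def run_script(path):
        result = engine.ssh([path])
        return result.code, result.out, result.err

    configure_storage(run_script)
```

Tests that need storage would then request `storage` as an argument, and a setup failure would be reported as an error in the fixture rather than as a failed test.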


_______________________________________________
Devel mailing list -- [email protected]
To unsubscribe send an email to [email protected]
Privacy Statement: https://www.ovirt.org/privacy-policy.html
oVirt Code of Conduct: 
https://www.ovirt.org/community/about/community-guidelines/
List Archives: 
https://lists.ovirt.org/archives/list/[email protected]/message/7BY3BUM2R7SQWHTKU7RJOOLPQJYH6442/
