On Sun, Dec 18, 2016 at 7:17 PM, Nir Soffer <[email protected]> wrote:
> On Sun, Dec 18, 2016 at 6:08 PM, Barak Korren <[email protected]> wrote:
> > On 18 December 2016 at 17:26, Nir Soffer <[email protected]> wrote:
> >> On Sun, Dec 18, 2016 at 4:17 PM, Barak Korren <[email protected]> wrote:
> >>
> >> We see a lot of these errors in the rest of the log. This means something
> >> is wrong with this VG.
> >>
> >> This needs deeper investigation from a storage developer on both the
> >> engine and vdsm sides, but I would start by making sure we use clean
> >> LUNs. We are not trying to test esoteric negative flows in the system
> >> tests.
> >
> > Here is the storage setup script:
> > https://gerrit.ovirt.org/gitweb?p=ovirt-system-tests.git;a=blob;f=common/deploy-scripts/setup_storage_unified_he_extra_iscsi_el7.sh;hb=refs/heads/master
>
>     25 iscsiadm -m discovery -t sendtargets -p 127.0.0.1
>     26 iscsiadm -m node -L all
>
> This is alarming. Before we serve these LUNs, we should log out
> from these nodes and remove the nodes.

This shows non-up-to-date code (or I have to update it). In the updated
code, where the failure also happens, we do the following as well:

    iscsiadm -m node -U all
    iscsiadm -m node -o delete
    systemctl stop iscsi.service
    systemctl disable iscsi.service

> > All storage used in the system tests comes from the engine VM itself,
> > and is placed on a newly allocated QCOW2 file (exposed as /dev/sde to
> > the engine VM), so it's unlikely the LUNs are not clean.
>
> We did not change code related to getDeviceList lately; these getPV errors
> tell us that there is an issue in a lower-level component or in the storage
> server.
>
> Does this test pass with an older version of vdsm? Of engine?

We did not test that. It's not very easy to do in ovirt-system-tests,
though I reckon it is possible with some additional work.

Note that I suspect cold and live merge were not actually tested for
ages / ever in ovirt-system-tests.

> >> Did we change something in the system tests project or lago while we
> >> were not looking?
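As a side note on "making sure we use clean LUNs": one cheap sanity check
is to scan each device for a leftover LVM2 PV label before serving it. A
minimal sketch (the device paths you would pass are an assumption, and this
only detects the LVM2 "LABELONE" signature, which sits at the start of one
of the first four 512-byte sectors; it is not a full cleanliness check):

```python
import sys

SECTOR = 512

def has_lvm_label(path):
    """Return True if an LVM2 PV label is present on the device.

    LVM2 writes a label sector whose first 8 bytes are b"LABELONE",
    placed in one of the first 4 sectors of the device.
    """
    with open(path, "rb") as f:
        head = f.read(4 * SECTOR)
    return any(
        head[i * SECTOR:i * SECTOR + 8] == b"LABELONE"
        for i in range(4)
    )

if __name__ == "__main__":
    # e.g. python check_luns.py /dev/sde  (device path is an assumption)
    for dev in sys.argv[1:]:
        state = "stale LVM label!" if has_lvm_label(dev) else "clean"
        print(dev, state)
```

A device that still carries such a label would explain getPV noise on what
the tests assume is a fresh LUN.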
> > Mainly the CentOS 7.2 -> CentOS 7.3 change.
> >
> > Not likely as well:
> > https://gerrit.ovirt.org/gitweb?p=ovirt-system-tests.git;a=shortlog
> >
> > The ovirt-system-tests project has got its own CI, testing against the
> > last nightly (we will move it to the last build that passed the tests
> > soon). So we are unlikely to merge breaking code there.
>
> It depends on the tests.
>
> Do you have a test logging in to the target and creating a VG using
> the LUNs?
>
> > Then again, we're not gating the OS packages, so some breakage may have
> > gone in via the CentOS repos...
>
> These failures are with CentOS 7.2 or 7.3? Both?

Unsure.

> >> Can we reproduce this issue manually with the same engine and vdsm
> >> versions?
> >
> > You have several options:
> > 1: Get engine+vdsm builds from Jenkins:
> >    http://jenkins.ovirt.org/job/ovirt-engine_master_build-artifacts-fc24-x86_64/
> >    http://jenkins.ovirt.org/job/vdsm_master_build-artifacts-el7-x86_64/
> >    (Getting the exact builds that went into a given OST run takes tracing
> >    back the job invocation links from that run)
> >
> > 2: Use the latest experimental repo:
> >    http://resources.ovirt.org/repos/ovirt/experimental/master/latest/rpm/el7/
> >
> > 3: Run lago and OST locally
> >    (as documented here: http://ovirt-system-tests.readthedocs.io/en/latest/
> >    you'd need to pass in the vdsm and engine packages to use)

That's what I do, on a daily basis.

> Do you know how to set up the system so it runs all the setup code up to
> the code that causes the getPV errors?

Yes, that should be fairly easy to do.

> We need to inspect the system at this point.

Let me know and I'll set up a live system quickly tomorrow.
Y.

> Nir
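For what it's worth, a manual check along the lines Nir asks about (log in
to the target, create a VG on the LUNs) could look roughly like the
following. This is a sketch only: the portal address and the
`<lun-wwid>` placeholders are assumptions, the commands act on a live
system, and it should be treated as a checklist rather than a script:

```shell
# Discover and log in to the target (portal address is an assumption)
iscsiadm -m discovery -t sendtargets -p 10.0.0.1
iscsiadm -m node -L all

# Exercise the same low-level path that getDeviceList/getPV hits
# (replace <lun-wwid> with the actual multipath device)
pvcreate /dev/mapper/<lun-wwid>
vgcreate test-vg /dev/mapper/<lun-wwid>

# Clean up afterwards so the LUNs are served clean next time
vgremove test-vg
pvremove /dev/mapper/<lun-wwid>
iscsiadm -m node -U all
iscsiadm -m node -o delete
```

If pvcreate/vgcreate already fail here, the problem is below vdsm.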
_______________________________________________
Devel mailing list
[email protected]
http://lists.ovirt.org/mailman/listinfo/devel
