Lior, just a note: regardless of whether this error is a false positive, the infra team is very much understaffed (almost a one-man operation), at least when it comes to Jenkins stability, so it's not uncommon for infra issues to take some time to resolve. So, as David proposed (on a different thread, I think), if you think a certain job shouldn't fail, please don't block on it. Talk to your manager, or request the right permissions to remove it from Gerrit if you're a maintainer/TL, but please make sure you validate it locally before merging.
We're planning a full-day infra hackathon soon to address many issues in the oVirt infra, and everyone is welcome to help if they can and would like to. An official email with details will be sent in the coming days.

Eyal.

----- Original Message -----
> From: "David Caro" <[email protected]>
> To: "Lior Vernia" <[email protected]>
> Cc: [email protected]
> Sent: Monday, November 24, 2014 4:27:00 PM
> Subject: Re: DAO tests failing
>
> On 11/24, Lior Vernia wrote:
> >
> >
> > On 24/11/14 14:25, David Caro wrote:
> > > On 11/24, Lior Vernia wrote:
> > >>
> > >>
> > >> On 24/11/14 14:16, David Caro wrote:
> > >>> On 11/24, Allon Mureinik wrote:
> > >>>> Looks like recreating the database fails:
> > >>>>
> > >>>> 08:54:54 [ovirt-engine_master_dao-unit-tests_created] $ /bin/sh
> > >>>> /tmp/hudson8811709354471707251.sh
> > >>>> 08:54:54
> > >>>> /home/jenkins/workspace/ovirt-engine_master_dao-unit-tests_created
> > >>>> 08:54:54
> > >>>> /home/jenkins/workspace/ovirt-engine_master_dao-unit-tests_created
> > >>>> 08:54:55 could not change directory to
> > >>>> "/home/jenkins/workspace/ovirt-engine_master_dao-unit-tests_created"
> > >>>> 08:54:55 ERROR: role "engine" already exists
> > >>>> 08:54:55 could not change directory to
> > >>>> "/home/jenkins/workspace/ovirt-engine_master_dao-unit-tests_created"
> > >>>> 08:54:55 ALTER ROLE
> > >>>> 08:54:56 could not change directory to
> > >>>> "/home/jenkins/workspace/ovirt-engine_master_dao-unit-tests_created"
> > >>>> 08:54:56 dropdb: database removal failed: ERROR: database "engine"
> > >>>> does not exist
> > >>>> 08:54:56 could not change directory to
> > >>>> "/home/jenkins/workspace/ovirt-engine_master_dao-unit-tests_created"
> > >>>> 08:54:58 CREATE DATABASE
> > >>>> 08:54:59 Creating schema
> > >>>> engine@localhost:5432/ovirt_engine_master_dao_unit_tests_created_6082
> > >>>>
> > >>>
> > >>> Those are not errors, are just unfiltered messages when it tries to
> > >>> make sure the database is not there and the user has enough rights at
> > >>> the startup of the job.
> > >>>
> > >>> The first issue is:
> > >>>
> > >>> 00:09:49.369 2014-11-24 09:03:47,301 SEVERE
> > >>> [org.ovirt.engine.core.dal.dbbroker.BatchProcedureExecutionConnectionCallback
> > >>> doInConnection] Can't execute batch: Batch entry 0 select * from
> > >>> public.insertnumanode(CAST ('830fd8d0-9332-4d81-bb80-54beee7d5b59' AS
> > >>> uuid),CAST (NULL AS uuid),CAST ('77296e00-0cad-4e5a-9299-008a7b6f4355'
> > >>> AS uuid),CAST ('0' AS int2),CAST ('0' AS int8),CAST ('4' AS int2),CAST
> > >>> (NULL AS int8),CAST (NULL AS int4),CAST (NULL AS numeric),CAST (NULL
> > >>> AS numeric),CAST (NULL AS numeric),CAST (NULL AS int4),CAST (NULL AS
> > >>> text)) as result was aborted. Call getNextException to see the cause.
> > >>> 00:09:49.441 2014-11-24 09:03:47,303 SEVERE
> > >>> [org.ovirt.engine.core.dal.dbbroker.BatchProcedureExecutionConnectionCallback
> > >>> doInConnection] Can't execute batch. Next exception is: ERROR: insert
> > >>> or update on table "numa_node" violates foreign key constraint
> > >>> "fk_numa_node_vm"
> > >>>
> > >>> What to me looks like a real issue. I'll do a couple more checks, but
> > >>> I don't know how the tests work or what the code does, that's your
> > >>> domain.
> > >>>
> > >>
> > >> I don't know if that specifically is a real issue or not (as it isn't
> > >> related to my patch and I haven't researched it), I do see however that
> > >> there are ultimately 433 failures and 371 errors, and that is spread
> > >> across many independent tests.
> > >
> > > You should be aware that it's not just your patch that runs, but all
> > > the patches yours depends on, that I see are quite a lot, are you sure
> > > that none of them introduces those failures?
> > >
> >
> > It had failed the same way when it didn't depend on anything unmerged -
> > I recently rebased it on two other patches so I could merge them first
> > (and neither of them touches the dal project).
> >
> > What do you mean by "quite a lot"? Do you see more than two?
>
> From the gerrit page I see that the patch depended on other 5 patches,
> 3 of those are already merged. So no, right now I only see two.
> >
> > Could these be environmental issues?
>
> There's always a possibility, but I think that it's quite improbable
> in this case.
>
> > It's possible that my patch causes all of this,
> > but I don't see how. Could it be that Allon's got it right, and the DB
> > isn't being constructed properly?
>
> The DB is being constructed as part of the test. The messages Allon
> pointed out are not errors but just unfiltered messages (it just drops
> the db directly instead of checking if it exists first for example).
> I don't see any issues on the db creation.
>
> ERROR: insert or update on table "network" violates foreign key
> constraint "fk_network_qos_id" Detail: Key
> (qos_id)=(de956031-6be2-43d6-bb90-5191c9253318) is not present in
> table "qos".
>
>
> This failure is quite a specific one, you can try looking for the
> point where that key should be added. It looks like you are trying to
> add an entry to the network table with an id from the qos that does
> not exist. Most of the other failures also seem related to foreign
> keys not being consistent.
>
> That does not seem like an environmental issue. The db did not exist
> at the start of the test and it works well when run from branch HEAD,
> so it's an issue that's introduced by your patch or any it depends
> on. Keep in mind that it does a checkout of the change and not a
> rebase.
>
>
> I see that the first failure (if I'm not mistaken) seems to be:
>
> 00:02:38.691 Running
> org.ovirt.engine.core.dao.VmAndTemplatesGenerationsDaoTest
> 00:02:39.007 Tests run: 18, Failures: 13, Errors: 1, Skipped: 0, Time
> elapsed: 0.325 sec <<< FAILURE!
>
> What might lead to an entry not being created on the db and the other
> failures.
>
> I don't know the code or the tests code, so I don't think I can help
> you more than that.
>
> >
> > >
> > >>
> > >>>>
> > >>>>
> > >>>> ----- Original Message -----
> > >>>>> From: "Lior Vernia" <[email protected]>
> > >>>>> To: [email protected]
> > >>>>> Sent: Monday, November 24, 2014 1:37:25 PM
> > >>>>> Subject: DAO tests failing
> > >>>>>
> > >>>>> Hi,
> > >>>>>
> > >>>>> I've noticed recurrent failures of DAO tests on one of my patches,
> > >>>>> this
> > >>>>> looks like something systematic as there are hundreds of failures in
> > >>>>> unrelated files.
> > >>>>>
> > >>>>> http://gerrit.ovirt.org/#/c/34121/
> > >>>>>
> > >>>>> Talked to dcaro about it on Thursday, but have been rebasing and
> > >>>>> re-running the tests and they keep failing.
> > >>>>>
> > >>>>> Thanks, Lior.
> > >>>>> _______________________________________________
> > >>>>> Infra mailing list
> > >>>>> [email protected]
> > >>>>> http://lists.ovirt.org/mailman/listinfo/infra
> > >>>>>
> > >>>
> > >
>
> --
> David Caro
>
> Red Hat S.L.
> Continuous Integration Engineer - EMEA ENG Virtualization R&D
>
> Tel.: +420 532 294 605
> Email: [email protected]
> Web: www.redhat.com
> RHT Global #: 82-62605
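
A side note on the foreign key errors quoted in this thread: the sketch below reproduces that class of failure in isolation, which may help when checking locally whether a test fixture is missing a referenced row. It is only an illustration, not the oVirt schema or test code. It assumes SQLite (through Python's sqlite3 module) rather than PostgreSQL, and the simplified two-column `qos` and `network` tables and the `make_db` / `insert_network` helpers are hypothetical stand-ins.

```python
import sqlite3


def make_db():
    """Create an in-memory database with a simplified, hypothetical schema."""
    conn = sqlite3.connect(":memory:")
    # SQLite only enforces foreign keys when this pragma is switched on.
    conn.execute("PRAGMA foreign_keys = ON")
    conn.execute("CREATE TABLE qos (id TEXT PRIMARY KEY)")
    conn.execute(
        "CREATE TABLE network ("
        " id TEXT PRIMARY KEY,"
        " qos_id TEXT REFERENCES qos(id))"
    )
    return conn


def insert_network(conn, network_id, qos_id):
    """Insert a network row; return False if a constraint rejects it."""
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute(
                "INSERT INTO network (id, qos_id) VALUES (?, ?)",
                (network_id, qos_id),
            )
        return True
    except sqlite3.IntegrityError:
        return False


if __name__ == "__main__":
    conn = make_db()
    qos_id = "de956031-6be2-43d6-bb90-5191c9253318"

    # The referenced qos row does not exist yet, so the insert is rejected.
    print("insert without qos row:", insert_network(conn, "net1", qos_id))

    # Once the referenced row is present, the same insert goes through.
    with conn:
        conn.execute("INSERT INTO qos (id) VALUES (?)", (qos_id,))
    print("insert with qos row:", insert_network(conn, "net1", qos_id))
```

The point is the same one David makes above: the insert itself is fine, it is the missing row in the referenced table that makes it fail, so the place to look is wherever that row should have been created.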
