Do you know which tests are flaky? And are JIRA tickets open for any that are? If not, let’s get some opened.

From: [email protected] [mailto:[email protected]] On Behalf Of Ry Jones
Sent: Tuesday, March 8, 2016 10:43 AM
To: Daniel Mihai <[email protected]>
Cc: [email protected]
Subject: Re: [Allseen-core] SCL tests failing

I made several changes.
1) for OS X builds, I set the loop to happen once.
2) I made sure on the OS X and linux builds, the correct variable was set on an error condition
3) for the linux builds, I set the iteration on test failure to 3 instead of 10.

I started a couple of builds - depending on results, I will probably set them down to one iteration.

These retry loops were added because of transient test failures causing false positives. My position is if a test case is flaky, we need to fix that test case. If you would like I'll remove all retry logic from all verify builds. I don't think, on the whole, retrying these tests has saved us any heartburn.

On Tue, Mar 8, 2016 at 9:41 AM, Daniel Mihai <[email protected]> wrote:

Thanks guys!

Ry/Josh, should we also remove the ajtest retry loops from osx-verify builds? For example:
1. This one seems to catch a significant problem in RemoteEndpointTest.AbortiveRelease
2. Ignores #1, retries running ajtest and hits some other problem – was it a timeout caused by retrying?!?

https://build.allseenalliance.org/ci/job/osx-verify/3659/consoleText

Josh, thanks for looking at ASACORE-2725.
Fyi, I am also looking at another ajtest deadlock that we’ve hit on Linux: https://jira.allseenalliance.org/browse/ASACORE-2730.

Thanks!

From: [email protected] [mailto:[email protected]] On Behalf Of Ry Jones
Sent: Tuesday, March 8, 2016 9:30 AM
To: Josh Spain <[email protected]>
Cc: [email protected]
Subject: Re: [Allseen-core] SCL tests failing

I'm looking into this. These builds should be getting marked as failures and stopped automatically.

On Tue, Mar 8, 2016 at 8:22 AM, Josh Spain <[email protected]> wrote:

Can anyone advise in this situation? Whose responsibility is it to make sure Jenkins flags these as build failures? Should a failure of ajtest mark a build as failed? Is that true for only certain jobs but not others?

Regarding builds hanging: We are currently looking into ASACORE-2725 to solve that problem. Meanwhile, @Ry, how do we deal with the situation where builds have hung? Can we merely cancel the build in Jenkins? Does something need to happen on the build system itself to stop the process?

Thanks,
Josh

On Mon, Mar 7, 2016 at 3:43 PM, Josh Spain <[email protected]> wrote:

The last "Successful" build (according to Jenkins) of linux-test-verify failed ajtest: https://build.allseenalliance.org/ci/job/linux-test-verify/2538/consoleFull (do a search for "FAILED TESTS"). I'm not sure why Jenkins isn't flagging it. Also, this job is trying to run right now but appears to be hung.

-Josh

On Mon, Mar 7, 2016 at 3:22 PM, Pawel Winogrodzki <[email protected]> wrote:

I’ve added these tests in https://git.allseenalliance.org/gerrit/#/c/6915 and they passed the “linux-test-verify” and “linux-gcc46-test-verify” builds. I’m checking what might be wrong with “master-linux-sdk”.

From: [email protected] [mailto:[email protected]] On Behalf Of Josh Spain
Sent: Monday, March 7, 2016 11:50
To: [email protected]
Subject: [Allseen-core] SCL tests failing

There are several tests failing in SCL on Linux. I do not know how long they have been failing, but several of us saw the problem last week when running ajtest. These were passing as of Feb 10th. Since then no master-linux-sdk builds were done in Jenkins until yesterday (March 6th).

See https://build.allseenalliance.org/ci/job/master-linux-sdk/1197/console for yesterday's build. Note also that yesterday's build "Succeeded" even though ajtest failed. I'm not sure if this is intended, but I suspect we should fail the build if that happens.

Here is the summary of failures:

[----------] Global test environment tear-down
[==========] 1076 tests from 84 test cases ran. (800244 ms total)
[ PASSED ] 1070 tests.
[ FAILED ] 6 tests, listed below:
[ FAILED ] XmlRulesConverterToXmlDetailedPassTest.shouldGetSameRulesAfterTwoConversions
[ FAILED ] XmlRulesConverterToXmlDetailedPassTest.shouldGetSameXmlAfterTwoConversions
[ FAILED ] XmlRulesConverterToXmlDetailedPassTest.shouldGetValidMethodForValidAllCasesManifestTemplate
[ FAILED ] XmlRulesConverterToXmlDetailedPassTest.shouldGetValidPropertyForValidNeedAllManifestTemplate
[ FAILED ] XmlRulesConverterToXmlDetailedPassTest.shouldGetValidSignalForValidNeedAllManifestTemplate
[ FAILED ] XmlRulesConverterToXmlDetailedPassTest.shouldGetValidSpecificNodeName

Does anyone recognize these tests and know which changeset may have broken this?

Thanks,
Josh
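
One way to make a build fail when ajtest reports failures, as suggested above, is to scan the console output for the "[ FAILED ]" summary lines and return a non-zero exit status. The following is only a minimal sketch under that assumption: the function names `failed_tests` and `exit_code` are hypothetical, and nothing here reflects how the actual Jenkins jobs are scripted.

```python
import re

# Matches summary lines such as "[ FAILED ] Suite.testName", tolerating extra
# spaces inside the brackets and an optional trailing "(N ms)" timing.
# Lines like "[ FAILED ] 6 tests, listed below:" do not match, because a test
# name must start with a letter or underscore and contain a dot.
_FAILED_LINE = re.compile(
    r"^\[\s*FAILED\s*\]\s+([A-Za-z_][\w/]*\.[\w/]+)(?:\s+\(\d+ ms\))?$"
)


def failed_tests(console_text):
    """Return the unique failed test names from a test log, in order of appearance."""
    names = []
    for line in console_text.splitlines():
        match = _FAILED_LINE.match(line.strip())
        if match and match.group(1) not in names:
            names.append(match.group(1))
    return names


def exit_code(console_text):
    """Hypothetical policy: non-zero whenever any test failed, with no retries."""
    return 1 if failed_tests(console_text) else 0
```

A wrapper step could read the saved ajtest output, print the names returned by `failed_tests`, and pass the result of `exit_code` to `sys.exit` so that the job is marked as failed on the first run rather than retried.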

_______________________________________________
Allseen-core mailing list
[email protected]
https://lists.allseenalliance.org/mailman/listinfo/allseen-core
