For what it's worth, this specific failure (GEODE-6091) was a problem with the PR pipeline using an old test container image. This should have been corrected yesterday morning.
On Mon, Nov 26, 2018 at 10:55 AM Helena Bales <hba...@pivotal.io> wrote:

> For test failures that do not have a ticket for that particular stack
> trace, you should re-trigger your pre-checkin. If the test fails again,
> your change probably caused it to start failing, even if it doesn't seem
> related (like the unit test ordering issue we had Friday before last), and
> you would be expected to fix it before committing.
> If the test passes your next run of tests the process is less established.
> Ideally we would never encounter this case because it means we have checked
> in flaky code at some point before your branch off of develop, which
> StressNewTest should catch. We will probably find a few of these because
> the StressNewTest job was silently failing for a little while. It is
> working again, but we may have checked in some flaky tests during the time
> it was down.
>
> I propose that we follow the CIO process with these. The process would be
> as follows:
> 1. Create a Jira ticket for the failure with links to the relevant
> resources from the failing CI run. Include any evidence that you have
> towards it being a flaky test that exists on develop and not just in your
> branch, and evidence that it was not made flaky by your change.
> 2. See if that test file was changed recently. If it was, talk to the
> person that changed it.
> 3. If it wasn't changed recently, post a link to gemfire-green-ci
> 4. Comment on your pull request with an update on the status of the failing
> test in your run.
>
> That is just an idea based on what we are doing for the develop pipeline.
> It seems like we are having more failures in PR pipelines than we were
> seeing a couple of weeks ago, and some tests that seem to only fail in the
> PR pipelines, so it might be time to start tracking some of this stuff. My
> main concern with this method is that it might pollute our backlog with
> tickets that no one is ever going to look at.
>
> On Mon, Nov 26, 2018 at 10:19 AM Kirk Lund <kl...@apache.org> wrote:
>
> > I just saw SizingFlagDUnitTest fail in a precheckin but it passes on my
> > branch when I run directly. I cannot find a Jira ticket for it. What's the
> > new process for handling these flickering tests?
> >
> > See:
> > https://concourse.apachegeode-ci.info/builds/17745
> >
> > Test failure stack:
> > org.apache.geode.internal.cache.SizingFlagDUnitTest > testPRHeapLRUDeltaPutOnPrimary FAILED
> >     org.apache.geode.test.dunit.RMIException: While invoking org.apache.geode.internal.cache.SizingFlagDUnitTest$12.run in VM 0 running on Host eb7aca4f2587 with 4 VMs
> >         at org.apache.geode.test.dunit.VM.invoke(VM.java:433)
> >         at org.apache.geode.test.dunit.VM.invoke(VM.java:402)
> >         at org.apache.geode.test.dunit.VM.invoke(VM.java:361)
> >         at org.apache.geode.internal.cache.SizingFlagDUnitTest.assertValueType(SizingFlagDUnitTest.java:793)
> >         at org.apache.geode.internal.cache.SizingFlagDUnitTest.doPRDeltaTestLRU(SizingFlagDUnitTest.java:312)
> >         at org.apache.geode.internal.cache.SizingFlagDUnitTest.testPRHeapLRUDeltaPutOnPrimary(SizingFlagDUnitTest.java:220)
> >
> >         Caused by:
> >         org.apache.geode.cache.EntryNotFoundException: Entry not found for key 0
> >             at org.apache.geode.internal.cache.LocalRegion.checkEntryNotFound(LocalRegion.java:2760)
> >             at org.apache.geode.internal.cache.LocalRegion.nonTXbasicGetValueInVM(LocalRegion.java:3448)
> >             at org.apache.geode.internal.cache.LocalRegionDataView.getValueInVM(LocalRegionDataView.java:105)
> >             at org.apache.geode.internal.cache.LocalRegion.basicGetValueInVM(LocalRegion.java:3436)
> >             at org.apache.geode.internal.cache.LocalRegion.getValueInVM(LocalRegion.java:3424)
> >             at org.apache.geode.internal.cache.PartitionedRegionDataStore.getLocalValueInVM(PartitionedRegionDataStore.java:2775)
> >             at org.apache.geode.internal.cache.PartitionedRegion.getValueInVM(PartitionedRegion.java:8786)
> >             at org.apache.geode.internal.cache.SizingFlagDUnitTest$12.run(SizingFlagDUnitTest.java:797)