> On July 13, 2014, 11:45 p.m., Benjamin Hindman wrote:
> > src/tests/health_check_tests.cpp, line 159
> > <https://reviews.apache.org/r/23443/diff/1/?file=629112#file629112line159>
> >
> >     I don't understand how changing to ASSERT from EXPECT would change the
> >     flakiness.
>
> Timothy Chen wrote:
>     Well, currently a test keeps going even when an expectation fails, so
>     short-circuiting lets the cleanup run earlier.
>     I was never able to repro what the test machine was seeing, where it
>     hangs forever, but I did find issues when doing a clean clone and test
>     run, so I'm trying to fix those and see what's next.
>
> Benjamin Hindman wrote:
>     Sorry if I'm out of the loop here, but the tests were "hanging"? Sounds
>     like these tests have non-deterministic bugs that we need to be fixing first
>     and foremost before we consider re-enabling them or s/EXPECT/ASSERT/. Can you
>     update the JIRA ticket with more info on how the tests are now hanging?

Hmm, so I believe it was reported that the tests were occasionally hanging on CI, but I was never able to repro on mac/ubuntu/debian. I have landed a fix, made the tests fail fast, and fixed this path bug as well. I really can't find or repro the original problems, but I did notice that with the existing bugs the health check will not work and the task will be left running, although it should have been cleaned up by the driver stop and Shutdown. Unless there is a way to test on CI manually, I'd like to take another shot with all these fixes and see whether it repros again. Let me know what you think.


- Timothy


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/23443/#review47701
-----------------------------------------------------------


On July 14, 2014, 7:13 a.m., Timothy Chen wrote:
>
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/23443/
> -----------------------------------------------------------
>
> (Updated July 14, 2014, 7:13 a.m.)
>
>
> Review request for mesos, Adam B, Ben Mahler, Jie Yu, and Niklas Nielsen.
>
>
> Bugs: MESOS-1533
>     https://issues.apache.org/jira/browse/MESOS-1533
>
>
> Repository: mesos-git
>
>
> Description
> -------
>
> Currently the health check tests are disabled because they were flaky. Part
> of the reason is that all assertions were EXPECTs instead of ASSERTs, so a
> test keeps executing after a failure.
> Another issue is that the path used to launch the health check executable is
> currently derived from argv[0] of the command executor's main.
> However, the correct thing to launch is the generated script that does
> library path resolution, which lives in the parent directory of .libs.
>
> Also changed the command executor to include the health state when it
> kills the task due to unhealthy checks.
>
>
> Diffs
> -----
>
>   src/launcher/executor.cpp 9c80848
>   src/tests/health_check_tests.cpp 44711fd
>
> Diff: https://reviews.apache.org/r/23443/diff/
>
>
> Testing
> -------
>
> make check
>
>
> Thanks,
>
> Timothy Chen
>
>
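
For anyone following the path issue in the description: below is a rough sketch of the idea, not the actual change in src/launcher/executor.cpp. The helper name `launcherDir` and the example paths are made up for illustration; the only thing taken from the description is that the generated wrapper script lives in the parent of `.libs`, while argv[0] may point inside `.libs`.

```cpp
#include <filesystem>
#include <string>

namespace fs = std::filesystem;

// Hypothetical helper (not from the Mesos source): given the path the
// command executor was started with (argv[0]), return the directory in
// which to look for the health check launcher.
//
// Assumption: in a libtool build tree the real binary sits in `.libs/`,
// and the generated wrapper script that resolves library paths sits one
// directory above it.
fs::path launcherDir(const std::string& argv0)
{
  fs::path dir = fs::path(argv0).parent_path();

  // Started from `.libs`: step up to the directory holding the wrappers.
  if (dir.filename() == ".libs") {
    return dir.parent_path();
  }

  // Otherwise (e.g. an installed binary) use the directory as-is.
  return dir;
}
```

With this sketch, `launcherDir("/build/src/.libs/mesos-executor")` yields `/build/src`, while an installed path such as `/usr/libexec/mesos/mesos-executor` is left at `/usr/libexec/mesos`. Both paths are illustrative only.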
