> On Aug. 13, 2015, 4:22 p.m., Till Toenshoff wrote: > > src/examples/java/TestFramework.java, line 275 > > <https://reviews.apache.org/r/37415/diff/1/?file=1038842#file1038842line275> > > > > Why 500 - not 200 or 1000? If I am not mistaken, then using 500 worked > > fine on your test-machines but chances are that someone comes in with a > > heavily loaded machine where those 500ms might not do, correct? Let us be > > honest and let the audience know that this is just a value chosen out of > > some test runs done - again, if I did not miss something :) > > > > I think it would also be a great idea to add a todo here that captures > > what you wrote within the JIRA issue already; > > "TODO(*): Ideally, we would inspect the status of the driver and its > > associated tasks via the Java API and wait until their teardown is complete > > to exit." > > Greg Mann wrote: > You are correct! :-) Thanks, comments addressed. > > Joseph Wu wrote: > Agreed. I don't think a sleep if a good enough solution. > > (Note: This next part is just supposition.) > > If you take a look at the comment here: > > https://github.com/apache/mesos/blob/437e7018e184128c9ec7c4113da09bbf9914721f/src/java/jni/org_apache_mesos_MesosSchedulerDriver.cpp#L485 > > You'll notice that the SchedulerDriver is instantiated as a weak > reference. So the GC is relied upon to destroy the object. > See: > https://docs.oracle.com/javase/7/docs/technotes/guides/jni/spec/functions.html#weak > > Have you tried changing it to a normal global reference and adding the > explicit cleanup code into the `stop` function? > > Greg Mann wrote: > Yea, I'm not crazy about the sleep solution either. I just tried your > suggestion, Joseph, putting a DeleteGlobalRef() call in the stop() method, > and it didn't work unfortunately. I think that short of an explicit > confirmation message upon teardown completion (which doesn't currently exist, > and is perhaps not advisable anyway), I don't see a way of resolving this in > an elegant fashion. Even if we're sure that the SchedulerDriver is gone > before we call System.exit(), the associated Executors and the Slave are > doing logging as they shut down, and it's the premature destruction of the > mutexes used by glog that is causing the problem.
Well, @joseph - the comment you point to, pretty much negates that option, doesn't it? // Create a weak global reference to the MesosSchedulerDriver // instance (we want a global reference so the GC doesn't collect // the instance but we make it weak so the JVM can exit). Bearing in mind that this is test code, which is only run at exit (ie, one-off and not in a performance-critical section) I think that adding a sleep is acceptable (if not exciting). Sure, a callback that says "hey, I'm done now, you can go away" would be great, but I'd guess, probably overkill. Totally +1 to the comment and the TODO - hard-coded 'magic numbers" are A Bad Thing, so definitely worth going the extra mile and explaining them. - Marco ----------------------------------------------------------- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/37415/#review95264 ----------------------------------------------------------- On Aug. 13, 2015, 4:49 p.m., Greg Mann wrote: > > ----------------------------------------------------------- > This is an automatically generated e-mail. To reply, visit: > https://reviews.apache.org/r/37415/ > ----------------------------------------------------------- > > (Updated Aug. 13, 2015, 4:49 p.m.) > > > Review request for mesos, Joris Van Remoortere, Joseph Wu, and Till Toenshoff. > > > Bugs: MESOS-830 > https://issues.apache.org/jira/browse/MESOS-830 > > > Repository: mesos > > > Description > ------- > > Fix flaky ExamplesTest.JavaFramework > > > Diffs > ----- > > src/examples/java/TestFramework.java > 265005b9d1a6a053b812c133a4817c3d9e61179e > > Diff: https://reviews.apache.org/r/37415/diff/ > > > Testing > ------- > > This test may fail only infrequently, and the bug is made worse by verbose > logging, so to verify correctness (or reproduce the bug before patching) try: > > GLOG_v=2 GTEST_FILTER="ExamplesTest.JavaFramework" bin/mesos-tests.sh > --verbose --gtest_repeat=100 2>/dev/null | grep "\[" > > > Thanks, > > Greg Mann > >