> On Aug. 13, 2015, 4:22 p.m., Till Toenshoff wrote:
> > src/examples/java/TestFramework.java, line 275
> > <https://reviews.apache.org/r/37415/diff/1/?file=1038842#file1038842line275>
> >
> >     Why 500 - not 200 or 1000? If I am not mistaken, then using 500 worked 
> > fine on your test-machines but chances are that someone comes in with a 
> > heavily loaded machine where those 500ms might not do, correct? Let us be 
> > honest and let the audience know that this is just a value chosen out of 
> > some test runs done - again, if I did not miss something :)
> >     
> >     I think it would also be a great idea to add a todo here that captures 
> > what you wrote within the JIRA issue already;
> >     "TODO(*): Ideally, we would inspect the status of the driver and its 
> > associated tasks via the Java API and wait until their teardown is complete 
> > to exit."
> 
> Greg Mann wrote:
>     You are correct! :-) Thanks, comments addressed.
> 
> Joseph Wu wrote:
>     Agreed.  I don't think a sleep if a good enough solution.
>     
>     (Note: This next part is just supposition.)
>     
>     If you take a look at the comment here:
>     
> https://github.com/apache/mesos/blob/437e7018e184128c9ec7c4113da09bbf9914721f/src/java/jni/org_apache_mesos_MesosSchedulerDriver.cpp#L485
>     
>     You'll notice that the SchedulerDriver is instantiated as a weak 
> reference.  So the GC is relied upon to destroy the object.
>     See: 
> https://docs.oracle.com/javase/7/docs/technotes/guides/jni/spec/functions.html#weak
>     
>     Have you tried changing it to a normal global reference and adding the 
> explicit cleanup code into the `stop` function?
> 
> Greg Mann wrote:
>     Yea, I'm not crazy about the sleep solution either. I just tried your 
> suggestion, Joseph, putting a DeleteGlobalRef() call in the stop() method, 
> and it didn't work unfortunately. I think that short of an explicit 
> confirmation message upon teardown completion (which doesn't currently exist, 
> and is perhaps not advisable anyway), I don't see a way of resolving this in 
> an elegant fashion. Even if we're sure that the SchedulerDriver is gone 
> before we call System.exit(), the associated Executors and the Slave are 
> doing logging as they shut down, and it's the premature destruction of the 
> mutexes used by glog that is causing the problem.

Well, @joseph - the comment you point to, pretty much negates that option, 
doesn't it?

    // Create a weak global reference to the MesosSchedulerDriver
    // instance (we want a global reference so the GC doesn't collect
    // the instance but we make it weak so the JVM can exit).
  
Bearing in mind that this is test code, which is only run at exit (ie, one-off 
and not in a performance-critical section) I think that adding a sleep is 
acceptable (if not exciting).

Sure, a callback that says "hey, I'm done now, you can go away" would be great, 
but I'd guess, probably overkill.

Totally +1 to the comment and the TODO - hard-coded 'magic numbers" are A Bad 
Thing, so definitely worth going the extra mile and explaining them.


- Marco


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/37415/#review95264
-----------------------------------------------------------


On Aug. 13, 2015, 4:49 p.m., Greg Mann wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/37415/
> -----------------------------------------------------------
> 
> (Updated Aug. 13, 2015, 4:49 p.m.)
> 
> 
> Review request for mesos, Joris Van Remoortere, Joseph Wu, and Till Toenshoff.
> 
> 
> Bugs: MESOS-830
>     https://issues.apache.org/jira/browse/MESOS-830
> 
> 
> Repository: mesos
> 
> 
> Description
> -------
> 
> Fix flaky ExamplesTest.JavaFramework
> 
> 
> Diffs
> -----
> 
>   src/examples/java/TestFramework.java 
> 265005b9d1a6a053b812c133a4817c3d9e61179e 
> 
> Diff: https://reviews.apache.org/r/37415/diff/
> 
> 
> Testing
> -------
> 
> This test may fail only infrequently, and the bug is made worse by verbose 
> logging, so to verify correctness (or reproduce the bug before patching) try:
> 
> GLOG_v=2 GTEST_FILTER="ExamplesTest.JavaFramework" bin/mesos-tests.sh 
> --verbose --gtest_repeat=100 2>/dev/null | grep "\["
> 
> 
> Thanks,
> 
> Greg Mann
> 
>

Reply via email to