The following JIRA was created to track the problem http://issues.apache.org/jira/browse/HARMONY-2648
Thanks Evgueni On 12/12/06, Evgueni Brevnov <[EMAIL PROTECTED]> wrote:
Nikolay, I just tried to separate shutdown related from safe-point callback stuff. No problems were identified so far. So I will file a separate JIRA for callback related things. On 12/12/06, Nikolay Kuznetsov <[EMAIL PROTECTED]> wrote: > Evgueni, > > sorry for intrusion, at the very last moment, in general I agree with > all mentioned problems and appreciate your work done for shutdown and > suspension enhancements but if you don't mind I have I have few > questions or suggestions: > > 1) Am I right, that all these scenarios came from shutdown sequence, > I'm asking because, I would discuss some points but by no means I > don't want to stop your shutdown patch which is good in general, but > if those changes can go to the code base separately it would be > preferable. > > 2) May I ask you to separate safepoint_callback related fixes into > separate JIRA; There were several suggestions which came from Weldon > to improve safepoint callback mechanism which we can incorporate into > single patch and also it would help to review those changes faster and > easier (if it's not so hard, of 'cause). > > 3) Third problem in your list is a tricky one, it leads to separating > suspend into user requested and system wide, because first one should > be interrupted by stop (according to RI) and second one seems to be > stop proof. Yes, I was thinking about that as well. It would be nice if we can find another solution which is easier than differentiation between "internal" and "external" requests. > Personally I prefer to exclude this test until the problem > is resolved and check your changes in for the sake of safety. Me too. > By the > way, does shutdown sequence workaround suspended threads correctly? No. In case of shutdown we can't even reset suspend_request counter in the callback function because it is external component to TM. > And I would also ask Ivan Popov if such a scenario may be reserved for > JDWP? > > Thank you. > Nik. > Thanks Evgueni > On 12/11/06, Evgueni Brevnov <[EMAIL PROTECTED]> wrote: > > Hi All, > > > > While working on http://issues.apache.org/jira/browse/HARMONY-2391 I > > found out that current implementation of safe-point callbacks works > > incorrectly/unsafely. Here is a list of problems: > > > > 1) hythread_suspend & hythread_resume are used asynchronously what may > > lead to a deadlock. Here is the scenario: > > a) T1 wants to set a safe-point callback to T2. So it calls > > hythread_set_safe_point_callback(T2, the_callback). > > b) T1 sets T2->safepoint_callback to the_callback; > > c) T2 executes hythread_resume(self) which is under if > > (thread->safepoint_callback) statement. > > d) T1 calls send_suspend_request(T2). So T2 will never be resumed > > because it already invoked corresponding hythread_resume(self) > > statement on the previous step. > > > > 2) Current implementation sets suspend_disable_count to 1 > > (thread_native_suspend.c:162) and allows execution of unsafe code > > (which must be executed under suspend disabled state) while GC may be > > working..... > > > > 3) In stop_callback (thread_java_basic.c:413) suspend_request for the > > current thread is set to zero. So the thread just ignores > > suspend_requests and continue to do its dirty things. > > > > All these problems are fixed in the HARMONY-2391. But > > org.apache.harmony.luni.tests.java.lang.ThreadGroupTest.test_suspend > > starts to fail. The scenario is as follows: > > > > 1) T1 suspends T2 (so the suspend_request is >0). > > 2) T1 stops T2 by setting up the stop_callback to T2. > > 3) T2 needs to switch to suspend disabled state to be able to throw > > exception but it can't do that because of step 1) > > > > From the one hand, If I rollback my changes for the 3) problem > > described in the first list the this test works fine. From the other > > hand, I can't do it because it may lead to a crash if GC is really > > working at the time I want to throw an exception. I don't see how to > > fix it in an easy way right now.... but want to proceed with > > HARMONY-2391. > > > > What would you recommend to do? > > 1) Commit the HARMONY-2391 patch as is. File a JIRA regarding failing > > case. Exclude the test until the bug is fixed. > > 2) Commit the HARMONY-2391 patch with 3) undone. File a JIRA. In this > > case it is possible to have intermittent failures until the problem is > > not fixed. > > > > Thanks > > Evgueni > > >
