[ http://issues.apache.org/jira/browse/DERBY-989?page=comments#action_12419163 ]
Knut Anders Hatlen commented on DERBY-989: ------------------------------------------ Thank you Andreas and Bryan for reviewing the patch. Bryan wrote: > 1) I don't really see why unsubscribe needs to block if it discovers > that the service record being unsubscribed is currently performing > work. Is this a necessary element of the fix, or just a side effect > of the implementation? It is a necessary element of the fix. The test case that fails does the following (everything not related to the failure stripped away): 1 subscribe client 2 unsubscribe client 3 save the number of times the client has been serviced 4 do some other testing 5 check that the number of times the client has been serviced has not increased If unsubscribe() returns while the client is being serviced, parts of performWork() might be executed after 3 and before 5. Since performWork() is updating the number of times the client has been serviced, the check in 5 fails. The only way we can guarantee that performWork() is not running after 3 is by waiting in unsubscribe(). However, my fix assumes that the test is correct. I don't see anything in the code/comments saying that performWork() should never be run after unsubscribe() has returned. Actually, the javadoc in BasicDaemon says Furthermore, any Serviceable subscriptions, including onDemandOnly, must tolerate spurrious services. This could mean that the current behaviour is as expected/designed and the test should be changed. It should be as easy as changing point 5 to "check that the number of times the client has been serviced has not increased by more than one". Does anyone have opinions on whether we should fix the Derby code or the test? I'm fine with either one of them. > 2) I think that it is dangerous for unsubscribe to ignore interrupts > while waiting for the service record to complete its work. I think > that if unsubscribe gets an interrupt at this time, it should > abandon the wait and return, or it should throw an exception. I agree. We can't throw an exception because it would break the contract of the DaemonService interface, but I think it's reasonable to abandon the wait and return. > 3) Introducing the "client number" concept into the service record > seems awkward, as it doesn't seem like a natural part of the service > record implementation. Also, comparing client numbers as a way to > answer the question "is this service record already unsubscribed" > seems somewhat convoluted. Here are two alternate ideas that > occurred to me: > - perhaps we could refer to the new Service Record field as a > "unique id", and have an "equals" method to compare two > ServiceRecord objects, so that we don't so directly couple the > Client Number concept from Basic Daemon into the ServiceRecord > class. I don't think we need to have decoupling of ServiceRecord from BasicDaemon as a goal. ServiceRecord is a package protected class and its only purpose is to aid the implementation of BasicDaemon. It is purely an implementation detail of BasicDaemon, and it is not related to the DaemonService interface in any way. > - perhaps there should be an "subscribed" boolean flag in Service > Record, with methods "isSubscribed" and "setSubscribed" so that > when Basic Daemon wants to ask if a ServiceRecord is already > unsubscribed, it could do so directly rather then by fetching the > client number and testing it against -1 and so forth. I agree that this sounds cleaner. Thanks for the tip! > 4) It seems a little odd that BasicDaemon sometimes synchronizes on > the "this" object, and sometimes depends on the fact that > "subscription" is a Vector and thus is inherently synchronized. I'm > not saying that anything is necessarily wrong here, it was just a > red flag to me that there were two levels of synchronization being > used in the BasicDaemon methods. I'm not a fan of this either. However, in this case it seems like the vector synchronization and the "this" synchronization guard different parts of the state. Synchronization on "this" ensures that the state controlling the wait() loops is consistent, and the vector synchronization ensures that concurrent modifications of the vector don't interfere with each other. There is probably a cleaner way to implement BasicDaemon, but that would be outside the scope of this patch, I think. Thanks for all your comments! I will wait for comments on my question about whether this is a test issue or a code issue before submitting a new patch. > unit/daemonService.unit fails intermittently: 'ran out of time' > --------------------------------------------------------------- > > Key: DERBY-989 > URL: http://issues.apache.org/jira/browse/DERBY-989 > Project: Derby > Type: Test > Components: Regression Test Failure > Versions: 10.2.0.0 > Environment: OS: Solaris 10 3/05 s10_74L2a X86 - SunOS 5.10 Generic, JVM: > Sun Microsystems Inc. 1.5.0_04 > OS: Solaris 9 9/04 s9s_u7wos_09 SPARC - SunOS 5.9 Generic_118558-11, JVM: Sun > Microsystems Inc. 1.5.0_03 > OS: Red Hat Enterprise Linux AS release 3 (Taroon Update 4) - Linux > 2.4.21-27.ELsmp #1 SMP Wed Dec 1 21:50:31 EST 2004 GNU/Linux, JVM: Sun > Microsystems Inc. 1.5.0_03 > Reporter: Ole Solberg > Assignee: Knut Anders Hatlen > Priority: Minor > Attachments: 989-411220-derbyall_derbyall_unit_unit_derby.log, > derby-989-timebomb.diff, derby-989-timebomb.stat, derby-989-v1.diff, > derby-989-v1.stat > > "Signature": > ********* Diff file unit/unit/daemonService.diff > *** Start: daemonService jdk1.5.0_04 unit:unit 2006-02-14 20:46:42 *** > 2 del > < -- Unit Test T_DaemonService finished > 2 add > > ran out of time > > Shutting down due to unit test failure. > > Exit due to time bomb > Test Failed. > *** End: daemonService jdk1.5.0_04 unit:unit 2006-02-14 21:47:13 *** > http://www.multinet.no/~solberg/public/Apache/Derby/Limited/testSummary-377800.html > [SunOS-5.10 i86pc-i386] -- This message is automatically generated by JIRA. - If you think it was sent incorrectly contact one of the administrators: http://issues.apache.org/jira/secure/Administrators.jspa - For more information on JIRA, see: http://www.atlassian.com/software/jira