I have several notes about the concurrent tests:

1. threaded-idx-access and provoke-deadlock are essentially the same: both change slot values of zork instances. There are some differences:
   - threaded-idx-access runs multiple test batches, while provoke-deadlock works in one batch;
   - provoke-deadlock runs a much larger number of threads: 30, instead of the 5 in threaded-idx-access;
   - provoke-deadlock concentrates on the first object, while threaded-idx-access works on all of them.
   I do not think any of these differences are essential. Having multiple slightly different tests in the hope that one of them will spot errors is like "voodoo programming". I think we should delete provoke-deadlock, or rewrite it so that it is really different (the classic way to provoke a deadlock is to do criss-cross slot updates).

2. provoke-deadlock does not wait until its threads have finished before it wipes the classes and closes the controller. Obviously, this causes weird errors. Is this intentional behaviour?

3. threaded-idx-access joins all the threads it finds. I think that's not OK, because there might be threads spawned by SLIME etc. It's better to collect the threads we create and join only those, i.e.

     (mapcar #'sb-thread:join-thread
             (loop for i from 1 to 5
                   collect (let ((i i))
                             (bt:make-thread (lambda () ...)))))

   Also note the (let ((i i)) ...): it is essential, otherwise all closures will refer to the same binding of i.

4. join-thread seems to be SBCL-specific. Can't we find some thread-synchronization primitive that will work on all implementations? (I haven't looked into this yet.)

5. threaded-idx-access is not really an automatic test: if something breaks, it lands in the debugger. I think there should be something like handler-case inside each thread, and if an error happens inside a thread, it should be reported to the main thread, which will re-signal it, or something like that. It's also worth checking that the operations were correct: classes are accessible via the index, slot values are correct, etc.

Now, about the test runs. Indeed, the suite yields lots of _different_ deadlocks in "read committed" isolation mode. The most bizarre one I've seen was the result of an interaction of four(!) threads, but there were also ones with a mere two participants.

OTOH, the results of running this in serializable isolation mode are quite consistent:

WARNING: Error while executing prepared statement "TREE12DELETE-BOTH" (params: (0 433)).
retrying txn due to:Database error 40001: could not serialize access due to concurrent update

The conclusions are clear to me: use serializable isolation mode FTW. It's not worth trying to fix "read committed" mode, because there are no logical grounds for it to work fine when data can change at any time and is not consistent within a single transaction. I think it's better to ban "read committed" mode completely, but I can make this configurable for people with masochistic intentions :).

As I have ideas on how to "improve" the concurrent test suite, it would probably be better if I just take over the suite and implement them. However, I'm open to other ideas about the future of this test suite :).

I also have a question unrelated to the test suite: it tells me that thread-alive-p, used in reap-orphaned-connections in db-postmodern, is not defined. Where should this function come from, a newer version of bordeaux-threads (perhaps I have an outdated package)?
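
P.S. To make points 3 and 5 a bit more concrete, here is a rough sketch of what I have in mind. It is only an illustration, not code from the suite: run-workers and its arguments are names I made up, and it assumes that bordeaux-threads is loaded under the BT nickname and that the installed version provides bt:join-thread. Each thread wraps its work in handler-case, stores any error under a lock, and the main thread joins only the threads it created and then re-signals one of the collected errors.

```lisp
;; Sketch only. Assumes bordeaux-threads is already loaded (nickname BT).
;; RUN-WORKERS is a hypothetical helper, not part of the existing test suite.
(defun run-workers (n worker)
  "Start N threads, each calling WORKER with its index (1..N), join exactly
those threads, and re-signal in the caller one error raised by a worker."
  (let* ((errors '())
         (lock (bt:make-lock))
         (threads
           (loop for i from 1 to n
                 collect (let ((i i))   ; fresh binding for each closure
                           (bt:make-thread
                            (lambda ()
                              (handler-case (funcall worker i)
                                (error (e)
                                  ;; Record the error instead of landing
                                  ;; in the debugger inside the thread.
                                  (bt:with-lock-held (lock)
                                    (push e errors)))))
                            :name (format nil "worker-~D" i))))))
    ;; Join only the threads we created, not SLIME's or anyone else's.
    (mapc #'bt:join-thread threads)
    ;; Re-signal in the calling thread so the test run reports a failure.
    (when errors
      (error (first errors)))
    t))
```

A test body would then look something like (run-workers 5 (lambda (i) ...)), followed by checks in the main thread that the classes are still reachable via the index and the slot values are what we expect. Note that handler-case on ERROR does not catch serious conditions that are not errors, so this is a starting point rather than a complete solution.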