On Fri, Mar 19, 2010 at 8:41 AM, Robert Dionne <[email protected]> wrote:
>
>>
>> I got the error included below this morning, and when I ran it again, there
>> was no error.
>
>> /tmp/couchdb/0.11.0/test/etap/090-task-status.................ok
>> /tmp/couchdb/0.11.0/test/etap/100-ref-counter.................FAILED test 8
>
> I looked into this random fail a bit and it may be a real issue to consider.
> couch_ref_counter uses process_info in the count function to compute the
> number of pids referring to the given one, rather than interrogating the
> number of referrers being maintained in the state of the gen_servers. This
> diff could explain the apparent race condition that caused this fail (which
> doesn't reproduce on my box). From the who_calls trace it looks ok and I
> surmise the reason process_info is used is to handle the case where a Pid
> dies in the forest and no one accounts for it, throwing off the count,
> whereas process_info presumably never lies.
>
>

There was a possibility of a race condition in the test code. Seeing
as this is the first time I've heard of it being hit, I reckon it must
be a fairly hard one to trigger. Anyway, the new code should alleviate
the issue, and the error was in the test itself, not in the code under
test, so all is well.

http://svn.apache.org/viewvc?revision=925264&view=revision

And a pretty diagram of the issue:

http://twitpic.com/19jk7a

The solution is to just make the test process wait for a 'DOWN' message
that'd be triggered at the same time.

HTH,
Paul Davis
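
For illustration, here is a minimal sketch of the "wait for the down message" idea. It is not the code committed in r925264. The module name wait_for_down_sketch and the helpers start_referrer/0 and stop_and_wait/1 are made up for this example, and it does not use couch_ref_counter at all. It only shows the general pattern: monitor the process, ask it to exit, and block until the monitor message arrives before asserting anything that depends on its death.

```erlang
-module(wait_for_down_sketch).
-export([demo/0]).

%% Hypothetical stand-in for a referrer process: idles until told to stop.
start_referrer() ->
    spawn(fun() -> receive stop -> ok end end).

%% Ask the process to stop, then block until its 'DOWN' message arrives.
%% Returns ok once the process is known to be dead, or timeout otherwise.
stop_and_wait(Pid) ->
    Ref = erlang:monitor(process, Pid),
    Pid ! stop,
    receive
        {'DOWN', Ref, process, Pid, _Reason} ->
            ok
    after 5000 ->
        %% Assumed timeout of 5 seconds; drop the monitor before giving up.
        erlang:demonitor(Ref, [flush]),
        timeout
    end.

demo() ->
    Pid = start_referrer(),
    true = is_process_alive(Pid),
    ok = stop_and_wait(Pid),
    %% Without the wait above, this check could run while Pid is still alive.
    false = is_process_alive(Pid),
    true.
```

Calling wait_for_down_sketch:demo() from an Erlang shell should return true. Whether waiting in the test process is enough for a given assertion depends on the other observers of the exit having handled it too, so treat this as the shape of the fix and not a drop-in replacement.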
