Hi Andrej, On Sun, Jun 24, 2012 at 9:51 PM, Andrej Podzimek <[email protected]> wrote: >> Hi Andrej, >> >> Thank you very much for taking the time to look at my >> comments regarding your algorithm. >> >> I am afraid I still do not understand how the race can >> be prevented. I am probably stuck and -- sorry to say >> this -- you are most likely the only one who can get me >> unstuck ;-). I would be grateful if you had a look at this >> email as well and pointed out my logical error. >> > > Hi Adam, > > After checking your example much more carefully, including the source code, > I have the impression that you are perfectly right. I would like to > apologize for my statement of the form "if there was a bug of this > magnitude, we would have already observed it". The way I see it right now, > there *is* a serious bug. (And the fact that it hasn't been observed yet > doesn't matter at all.) > > I remember I got stuck by similar thoughts quite many times in the past, but > unfortunately, two years have elapsed and many thoughts (and also notes, > unfortunately) are now irreversibly lost. :-( Nevertheless, right now I am > convinced that the race condition you point out ("a callback enqueued > *after* GP detection and *before* the subsequent reclamation can be invoked > prematurely with respect to a concurrent reader") *can* really occur.
Thank you very much for looking at the issue in greater detail. > This leads to an obvious question: How is it possible that our tests and > benchmarks did not produce a flood of error messages or crash immediately? > Reading through the sources again with this particular scenario in mind > yielded a possible answer: The problem is so unlikely to occur with our > specific workload that it simply never occurred so far. > > Our most thorough test that swaps two buffers repeatedly only works with the > *blocking* API, which is unaffected by this problem. Although the callback > API is used heavily throughout our hash table test/benchmark combination, > the way the hash table keys are accessed (in a rather unpredictable fashion > from *preemptible* kernel threads) probably makes the race so unlikely that > it has *never* been revealed yet. > > Furthermore, the reclaimers run under a very high priority. (They may be > preempted from time to time due to their SYSDC scheduling class, but only > extremely long callback queues could make this happen.) This fact *hides* > the bug even deeper, making a scenario where a reclaimer is signaled by the > detector *and* does *not* wake up almost immediately (providing room for > callback B from your pseudocode to be enqueued and then moved to cur_cbs > prematurely) highly improbable. Yeah, race conditions are improbable. Unfortunately, they have a tendency to bite when you least expect it :-(. > The thoughts above might (hopefully) explain why we haven't seen the > pathological scenario so far. > > Consequently, unless someone comes up with serious reasoning about the > contrary, this problem will have to be fixed somehow: > > A) By swapping the waiting and signaling phases of the detector, as > you suggested... But I have a strong impression that this might fail to > exploit all the possible parallelism among the reclaimers and the detector. > > B) By using three lists of callbacks instead of two, with the obvious > intention of delaying the callbacks one GP boundary further and avoiding the > pathological scenario. This would be nothing new -- some RCU algorithms for > Linux had three (or even more) queues of callbacks for similar reasons. Such > a solution would make it possible to leave the blocking API implementation > unchanged and only make a small modification to the reclaimers. > > It would be nice if a third person (Martin?) could have a careful look at > this. I have just got too confused to think about it any further. > > Well done, Adam! Thanks a lot for your time and effort, Andrej. I really appreciate it. Adam _______________________________________________ HelenOS-devel mailing list [email protected] http://lists.modry.cz/cgi-bin/listinfo/helenos-devel
