Hi Andrej,

On Sun, Jun 24, 2012 at 9:51 PM, Andrej Podzimek
<[email protected]> wrote:
>> Hi Andrej,
>>
>> Thank you very much for taking the time to look at my
>> comments regarding your algorithm.
>>
>> I am afraid I still do not understand how the race can
>> be prevented. I am probably stuck and -- sorry to say
>> this -- you are most likely the only one who can get me
>> unstuck ;-). I would be grateful if you had a look at this
>> email as well and pointed out my logical error.
>>
>
> Hi Adam,
>
> After checking your example much more carefully, including the source code,
> I have the impression that you are perfectly right. I would like to
> apologize for my statement of the form "if there was a bug of this
> magnitude, we would have already observed it". The way I see it right now,
> there *is* a serious bug. (And the fact that it hasn't been observed yet
> doesn't matter at all.)
>
> I remember I got stuck by similar thoughts quite many times in the past, but
> unfortunately, two years have elapsed and many thoughts (and also notes,
> unfortunately) are now irreversibly lost. :-( Nevertheless, right now I am
> convinced that the race condition you point out ("a callback enqueued
> *after* GP detection and *before* the subsequent reclamation can be invoked
> prematurely with respect to a concurrent reader") *can* really occur.

Thank you very much for looking at the issue in greater
detail.

> This leads to an obvious question: How is it possible that our tests and
> benchmarks did not produce a flood of error messages or crash immediately?
> Reading through the sources again with this particular scenario in mind
> yielded a possible answer: The problem is so unlikely to occur with our
> specific workload that it simply never occurred so far.
>
> Our most thorough test that swaps two buffers repeatedly only works with the
> *blocking* API, which is unaffected by this problem. Although the callback
> API is used heavily throughout our hash table test/benchmark combination,
> the way the hash table keys are accessed (in a rather unpredictable fashion
> from *preemptible* kernel threads) probably makes the race so unlikely that
> it has *never* been revealed yet.
>
> Furthermore, the reclaimers run under a very high priority. (They may be
> preempted from time to time due to their SYSDC scheduling class, but only
> extremely long callback queues could make this happen.) This fact *hides*
> the bug even deeper, making a scenario where a reclaimer is signaled by the
> detector *and* does *not* wake up almost immediately (providing room for
> callback B from your pseudocode to be enqueued and then moved to cur_cbs
> prematurely) highly improbable.

Yeah, race conditions are improbable. Unfortunately, they
have a tendency to bite when you least expect it :-(.

> The thoughts above might (hopefully) explain why we haven't seen the
> pathological scenario so far.
>
> Consequently, unless someone comes up with serious reasoning about the
> contrary, this problem will have to be fixed somehow:
>
>        A) By swapping the waiting and signaling phases of the detector, as
> you suggested... But I have a strong impression that this might fail to
> exploit all the possible parallelism among the reclaimers and the detector.
>
>        B) By using three lists of callbacks instead of two, with the obvious
> intention of delaying the callbacks one GP boundary further and avoiding the
> pathological scenario. This would be nothing new -- some RCU algorithms for
> Linux had three (or even more) queues of callbacks for similar reasons. Such
> a solution would make it possible to leave the blocking API implementation
> unchanged and only make a small modification to the reclaimers.
>
> It would be nice if a third person (Martin?) could have a careful look at
> this. I have just got too confused to think about it any further.
>
> Well done, Adam!

Thanks a lot for your time and effort, Andrej. I really
appreciate it.

Adam

_______________________________________________
HelenOS-devel mailing list
[email protected]
http://lists.modry.cz/cgi-bin/listinfo/helenos-devel

Reply via email to