The code in queue.ml should be rewritten as:

  let add x q =
    if q.length = 0 then
      let rec cell = {
        content = x;
        next = cell
      } in
      q.length <- q.length + 1;
      q.tail <- cell
    else
      let tail = q.tail in
      let head = tail.next in
      let cell = {
        content = x;
        next = head
      } in
      q.length <- q.length + 1;
      tail.next <- cell;
      q.tail <- cell

so that the allocation is performed before the data structure is actually
mutated. Consequently, the signal handler (executed at the allocation points)
cannot break the invariants of the queue.

I reported the problem to the bug tracker this morning:
http://caml.inria.fr/mantis/view.php?id=5309

Fabrice
PS: while waiting for the next bug fix release, you should copy and modify
the queue module for yourself...

On 07/06/2011 10:17 AM, Mark Shinwell wrote:
> On Tue, Jul 05, 2011 at 11:26:52PM -0400, Khoo Yit Phang wrote:
>> The problem occurs in Queue.ml:
>>
>>> let create () = {
>>>   length = 0;
>>>   tail = Obj.magic None
>>> }
>>>
>>> let add x q =
>>>   q.length <- q.length + 1;
>>>   (* asynchronous exception occurs here *)
>>>   ...
> [snip]
>> We look forward to hearing what the official fix or guideline for handling
>> asynchronous exception will be, for the standard library, third-party
>> libraries, as well as applications.
>
> I'm presuming this is native code. Are you compiling with or without the
> Caml threads library?
>
> If using the threads library, reception of a signal will cause it to be noted
> down, and the user-defined signal handler written in Caml will be executed
> later. "Later" is difficult to pin down in words, but it should be the case
> that if you have a section of code which does not involve any allocation nor
> calls to any functions which drop the runtime lock, then it will never be
> interrupted by the execution of a signal handler. Indeed, it will also never
> be interrupted by a context switch between Caml threads, so you can get
> thread safety guarantees this way. As far as user code goes, I believe the
> things you have to think about in the two cases of raising an exception from
> a signal handler and experiencing a context switch between Caml threads are
> the same.
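
[To make the point about allocation-free sections concrete, here is a rough
sketch of a stand-alone queue written in that style. It is not the stdlib
Queue and not the official fix: the names (Empty, add, take, length) and the
use of an option for the tail instead of Obj.magic are my own assumptions for
illustration. It relies on the behaviour described above, namely that the
handler can only run at an allocation, so every allocation is done before the
first field update.]

```ocaml
(* Hypothetical stand-alone queue; NOT the stdlib Queue module. *)
exception Empty

type 'a cell = {
  content : 'a;
  mutable next : 'a cell;   (* cyclic list: tail.next is the head *)
}

type 'a t = {
  mutable length : int;
  mutable tail : 'a cell option;   (* assumption: option instead of Obj.magic *)
}

let create () = { length = 0; tail = None }

let add x q =
  match q.tail with
  | None ->
    (* Both allocations happen before any field of [q] is touched. *)
    let rec cell = { content = x; next = cell } in
    let new_tail = Some cell in
    (* From here on: field updates only, no allocation. *)
    q.length <- 1;
    q.tail <- new_tail
  | Some tail ->
    let cell = { content = x; next = tail.next } in
    let new_tail = Some cell in
    (* From here on: field updates only, no allocation. *)
    q.length <- q.length + 1;
    tail.next <- cell;
    q.tail <- new_tail

let take q =
  match q.tail with
  | None -> raise Empty
  | Some tail ->
    let head = tail.next in
    (* Unlink the head; no allocation is needed on this path. *)
    if head == tail then q.tail <- None
    else tail.next <- head.next;
    q.length <- q.length - 1;
    head.content

let length q = q.length
```

[If an exception raised from a handler interrupts add at one of its two
allocations, the queue is still in its previous, consistent state, because
nothing has been written to it yet.]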
>
> Establishing whether a particular section of code is atomic in this
> regard---let us just say "thread safe"---might involve reading the assembly
> code. (I've wondered in the past about a construct which could be used to
> indicate that a section is required to be atomic, and the compiler would check
> it.) I think it is clear that [Queue.add] is not thread safe. The reason is
> due to the two record allocations in the code, one of which is:
>
>   let add x q =
>     q.length <- q.length + 1;
>                          <------- signal happens here
>     if q.length = 1 then
>       let rec cell =
>                          <------- Caml signal handler executed here
>                                   since the record allocation might
>                                   trigger a garbage collection
>       {
>         content = x;
>         next = cell
>   ...
>
> It isn't clear to me that it is reasonable to assume all libraries are thread
> safe. Perhaps the documentation needs improving in this area. Specifically
> in the case of signal handlers, I would recommend restricting processing in
> them to an absolute minimum, and in particular not throwing exceptions. The
> code will be easier to think about that way. If you must throw an exception,
> you have to ensure that any data structure which you rely on in the exception
> handler is Caml thread safe, in order that it is not caught in an inconsistent
> state.
>
> If you're not using the threads library then I believe the signal handler
> could be executed immediately upon reception of the signal in certain cases
> (for example if you're calling a blocking syscall); and if you're running Caml
> code at the time of the signal, then it should be queued as above. As such, in
> this scenario, you will need to be even more careful about what you do in the
> signal handler. Further, you also have to take into account that if your
> handler is immediately executed then you are still inside the genuine
> operating system signal handler---which means you've got the C library in an
> arbitrary state.
> You need to restrict yourself to async signal safe glibc (or equivalent) calls
> in that scenario (cf. the signal(2) manual page on Linux) and make sure that
> the runtime doesn't call any other C library functions as a result of your
> signal handling code. I recommend avoiding that can of worms entirely by
> doing as little as possible in the handler.
>
> Mark
>

--
Caml-list mailing list. Subscription management and archives:
https://sympa-roc.inria.fr/wws/info/caml-list
Beginner's list: http://groups.yahoo.com/group/ocaml_beginners
Bug reports: http://caml.inria.fr/bin/caml-bugs
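
[As an illustration of "doing as little as possible in the handler", one
possible shape is a handler that only sets a flag, with the main loop polling
that flag at a point where its data structures are known to be consistent.
This is only a sketch under assumptions: the names interrupted,
install_handler, run_until_interrupted and the outcome type are hypothetical,
and SIGINT is picked purely as an example. Only Sys.set_signal, Sys.sigint and
Sys.Signal_handle come from the standard library.]

```ocaml
(* Hypothetical flag: the handler writes it, the main loop reads it. *)
let interrupted = ref false

(* The handler neither allocates nor raises; it only records the signal. *)
let install_handler () =
  Sys.set_signal Sys.sigint
    (Sys.Signal_handle (fun _signum -> interrupted := true))

type outcome = Finished | Interrupted

(* Hypothetical main loop. [step] performs one unit of work and returns
   [false] once there is nothing left to do. The flag is checked only
   between steps, so a step is never abandoned half-way by the handler. *)
let run_until_interrupted step =
  let rec loop () =
    if !interrupted then Interrupted
    else if step () then loop ()
    else Finished
  in
  loop ()
```

[With this shape no exception ever leaves the handler, so library code such
as Queue.add is not cut short in the middle of an update; the cost is that the
program reacts to the signal only at the next check of the flag.]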
