Hi all,
I recently discovered a race condition when asynchronous exceptions, i.e.,
exceptions raised by signal handlers, are used with the DelimCC library. When I
consulted Oleg Kiselyov, he realized that the problem with asynchronous
exceptions is far more pervasive than I had originally thought. For example,
the following program, using only the standard OCaml library, leads to a
segfault:
> let q = Queue.create ();;
>
> let () =
>   let en = ref false in
>   Sys.set_signal Sys.sigalrm
>     (Sys.Signal_handle (fun _ -> if !en then raise Exit));
>   ignore (Unix.setitimer Unix.ITIMER_REAL
>             { Unix.it_interval = 1e-6; Unix.it_value = 1e-3 });
>   while true do
>     try
>       en := true;
>       Queue.add "a" q;
>       en := false;
>       ignore (Queue.pop q)
>     with Exit ->
>       en := false;
>       if Queue.length q > 0 then prerr_endline "Non-empty";
>       assert (let len = Queue.length q in len = 0 || len = 1);
>       Queue.iter print_string q;
>       Queue.clear q;
>   done
The problem occurs in Queue.ml:
> let create () = {
>   length = 0;
>   tail = Obj.magic None
> }
>
> let add x q =
>   q.length <- q.length + 1;
>   (* asynchronous exception occurs here *)
>   ...
When Queue.add is interrupted by an exception immediately after the length is
updated, the length becomes inconsistent with the tail, and subsequent
operations such as Queue.iter will attempt to operate on Obj.magic None,
leading to the segfault.
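
To make the failure mode concrete without a segfault, here is a small sketch
of the same pattern. It is a toy stand-in, not the actual Queue.ml code: the
type toy_queue, the interrupt parameter, and the consistent check are all
hypothetical names, and the asynchronous exception is simulated by a
synchronous raise placed between the two updates.

```ocaml
(* Toy model only: two mutable fields that are supposed to stay in sync. *)
type 'a toy_queue = { mutable length : int; mutable items : 'a list }

let create () = { length = 0; items = [] }

(* [interrupt] simulates an asynchronous exception that arrives after the
   length has been bumped but before the element has been stored. *)
let add ?(interrupt = false) x q =
  q.length <- q.length + 1;
  if interrupt then raise Exit;
  q.items <- q.items @ [x]

(* The invariant that every other operation silently relies on. *)
let consistent q = q.length = List.length q.items

let () =
  let q = create () in
  add "a" q;
  assert (consistent q);
  (try add ~interrupt:true "b" q with Exit -> ());
  (* The length now claims two elements, but only one was stored. *)
  assert (not (consistent q))
```

In the real Queue the missing element is not an empty list but Obj.magic None,
which is why traversing it crashes rather than merely giving a wrong answer.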
Much of the standard library is probably susceptible to similar races,
particularly the parts that operate on mutable data. Most of these will have
less dramatic consequences than a segfault, but they still show up as
seemingly random errors such as corrupted data or violated invariants.
For DelimCC, my solution was to provide a pair of C functions, mask_signals and
unmask_signals, to bracket operations that are unsafe under asynchronous
exceptions, using sigprocmask to suppress signal handling between them.
However, since this requires access to OCaml's signal handling internals
(caml_process_pending_signals and caml_signals_are_pending) to flush pending
signals, it would be great to have mask_signals and unmask_signals or something
similar in the standard library, so as to make it easier to develop libraries
that are safe under asynchronous exceptions. (For reference, Haskell has run
into this and other issues with asynchronous exceptions and implemented a
similar solution; see http://hackage.haskell.org/trac/ghc/ticket/1036.)
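
As a rough illustration of the bracketing idea, here is a pure-OCaml sketch.
Note that it is not the DelimCC implementation and does not use sigprocmask:
it substitutes a software mask, i.e., a counter that the signal handler
consults before raising. All names (Timeout, mask_depth, pending, on_signal,
with_signals_masked) are hypothetical, and the sketch assumes that every
exception-raising handler in the program is written to cooperate in this way.

```ocaml
(* Hypothetical names throughout; none of this is in the standard library. *)
exception Timeout

let mask_depth = ref 0   (* > 0 while inside a protected region *)
let pending = ref false  (* a signal arrived while the mask was on *)

(* Meant to be installed with
   Sys.set_signal Sys.sigalrm (Sys.Signal_handle on_signal). *)
let on_signal (_ : int) =
  if !mask_depth > 0 then pending := true
  else raise Timeout

(* Run [f ()] with the handler's exception deferred until [f] is done.
   If a signal was deferred, Timeout is raised once the outermost
   protected region has been left, taking priority over [f]'s result. *)
let with_signals_masked f =
  incr mask_depth;
  let result = try Ok (f ()) with e -> Error e in
  decr mask_depth;
  if !mask_depth = 0 && !pending then begin
    pending := false;
    raise Timeout
  end;
  match result with
  | Ok v -> v
  | Error e -> raise e

(* Example: an add that cannot be cut off between its two updates. *)
let safe_add x q = with_signals_masked (fun () -> Queue.add x q)
```

This only protects against handlers that check mask_depth, and a signal that
lands between the decrement and the pending check can leave the flag stale, so
treat it as a sketch of the intended semantics rather than a replacement for
masking at the runtime level.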
Another option would be to simply warn against or disallow signal handlers that
raise exceptions, but that seems less useful, e.g., it would make it hard to
interrupt a long-running library function with a timeout.
We look forward to hearing what the official fix or guideline for handling
asynchronous exceptions will be, for the standard library, third-party
libraries, and applications.
Thank you,
Yit
July 5, 2011
--
Caml-list mailing list. Subscription management and archives:
https://sympa-roc.inria.fr/wws/info/caml-list
Beginner's list: http://groups.yahoo.com/group/ocaml_beginners
Bug reports: http://caml.inria.fr/bin/caml-bugs