Hi all,

I recently discovered a race condition when asynchronous exceptions, i.e., 
exceptions raised by signal handlers, are used with the DelimCC library. When I 
consulted Oleg Kiselyov, he realized that the problem with asynchronous 
exceptions is far more pervasive than I had originally thought. For example, 
the following program, using only the standard OCaml distribution (the 
standard library and Unix), leads to a segfault:

> let q = Queue.create ();;
> 
> let () =
>     let en = ref false in
>     Sys.set_signal Sys.sigalrm
>       (Sys.Signal_handle (fun _ -> if !en then raise Exit));
>     ignore (Unix.setitimer Unix.ITIMER_REAL
>               { Unix.it_interval = 1e-6; Unix.it_value = 1e-3 });
>     while true do
>         try
>             en := true;
>             Queue.add "a" q;
>             en := false;
>             ignore (Queue.pop q)
>         with Exit ->
>             en := false;
>             if Queue.length q > 0 then prerr_endline "Non-empty";
>             assert (let len = Queue.length q in len = 0 || len = 1);
>             Queue.iter print_string q;
>             Queue.clear q;
>     done

The problem occurs in the standard library's queue.ml:

> let create () = {
>   length = 0;
>   tail = Obj.magic None
> }
> 
> let add x q =
>   q.length <- q.length + 1;
>   (* asynchronous exception occurs here *)
>   ...

When Queue.add is interrupted by an exception immediately after the length is 
updated, the length becomes inconsistent with the tail, and subsequent 
operations such as Queue.iter will attempt to operate on Obj.magic None, 
leading to the segfault.
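
To make the failure mode concrete without depending on signal timing, here is 
a minimal sketch. The bag type and the ~interrupt argument are made up for 
illustration and are not the actual Queue code; the explicit raise stands in 
for a signal handler firing between the two updates.

```ocaml
(* Hypothetical two-field container, NOT the real Queue implementation.
   Invariant: length = List.length items. *)
type 'a bag = { mutable length : int; mutable items : 'a list }

let create () = { length = 0; items = [] }

(* [interrupt] simulates an asynchronous exception arriving between the
   two updates; a real signal would arrive at an unpredictable point. *)
let add ?(interrupt = false) x b =
  b.length <- b.length + 1;
  if interrupt then raise Exit;
  b.items <- x :: b.items

let invariant_holds b = b.length = List.length b.items

(* One normal add, then one interrupted add: the length is bumped but the
   element is never stored, so the invariant no longer holds. *)
let broken =
  let b = create () in
  add "a" b;
  (try add ~interrupt:true "b" b with Exit -> ());
  not (invariant_holds b)
```

Here the damage is only a wrong count. In Queue the stale length is what makes 
later operations dereference the placeholder tail.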

Much of the standard library is probably similarly susceptible to such races, 
particularly the parts that operate on mutable data. Most of these lead to 
less dramatic consequences than segfaults, but they still show up as seemingly 
random errors such as corrupted data or violated invariants.

For DelimCC, my solution was to provide a pair of C functions, mask_signals and 
unmask_signals, to bracket operations that are unsafe under asynchronous 
exceptions, using sigprocmask to suppress signal handling between them. 
However, since this requires access to OCaml's signal handling internals 
(caml_process_pending_signals and caml_signals_are_pending) to flush pending 
signals, it would be great to have mask_signals and unmask_signals or something 
similar in the standard library, so as to make it easier to develop libraries 
that are safe under asynchronous exceptions. (For reference, Haskell has seen 
this and other issues with asynchronous exceptions and implemented a similar 
solution; see http://hackage.haskell.org/trac/ghc/ticket/1036.)
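
As a rough illustration of the bracketing idea, here is a pure-OCaml sketch. 
It is not the sigprocmask-based C pair described above: it swaps in a simpler 
scheme where the handler itself checks a flag and defers the exception while 
masked. All names are hypothetical, and it assumes a single thread and no 
nested masking.

```ocaml
(* Hypothetical names throughout; a flag-based deferral scheme, not the
   sigprocmask-based C functions. *)
let masked = ref false
let pending = ref false

(* Install with: Sys.set_signal Sys.sigalrm (Sys.Signal_handle handler) *)
let handler (_ : int) =
  if !masked then pending := true   (* remember the signal, raise later *)
  else raise Exit

let mask_signals () = masked := true

let unmask_signals () =
  masked := false;
  if !pending then begin
    pending := false;
    raise Exit                      (* deliver the deferred exception *)
  end

(* Run [f ()] with delivery deferred. An exception deferred during [f] is
   raised only after [f] has completed its updates. If [f] itself raises,
   that exception wins and a deferred signal stays pending until the next
   call to [unmask_signals]. *)
let with_signals_masked f =
  mask_signals ();
  match f () with
  | result -> unmask_signals (); result
  | exception e -> masked := false; raise e
```

With this in place, an unsafe operation would be wrapped as 
with_signals_masked (fun () -> Queue.add "a" q), so the exception can only 
surface once the queue is consistent again.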

Another option would be to simply warn against or disallow signal handlers that 
raise exceptions, but that seems less useful, e.g., it would make it hard to 
interrupt a long-running library function with a timeout.
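
For comparison, this is roughly what the non-raising style looks like: the 
handler only records the signal, and the code polls at points where its data 
is consistent. The names are hypothetical. The sketch also shows the 
limitation mentioned above, since only code that calls check_timeout can be 
interrupted.

```ocaml
(* Hypothetical names; the handler never raises, it only sets a flag. *)
let timed_out = ref false

(* Install with: Sys.set_signal Sys.sigalrm (Sys.Signal_handle on_alarm) *)
let on_alarm (_ : int) = timed_out := true

exception Timeout

(* A safe point: call this only where the data is in a consistent state. *)
let check_timeout () =
  if !timed_out then begin
    timed_out := false;
    raise Timeout
  end

(* A long-running function has to cooperate by polling between steps.
   Returns the sum of the elements processed before the timeout. *)
let sum_until_timeout xs =
  let total = ref 0 in
  (try List.iter (fun x -> check_timeout (); total := !total + x) xs
   with Timeout -> ());
  !total
```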

We look forward to hearing what the official fix or guideline for handling 
asynchronous exceptions will be, for the standard library, third-party 
libraries, as well as applications.


Thank you,

Yit
July 5, 2011
