This short program creates a polymorphic protocol that provides a
unified interface for reading mutable objects and for applying a
function to them, extends it to refs and atoms, and creates a Cell
type that amounts to a volatile mutable variable. (look thingie)
returns the current value in all three cases, and (mutate! thingie f
args) does something analogous to (swap! an-atom f args) in all three
cases.

Then it builds a function rand-xfers-conc! that creates X mutable
objects of type Y, then starts Z threads that each perform W random
transactions on random pairs of the mutables, then waits for the
threads to finish and reports statistics on the outcome.

In the process, it demonstrates protocols and protocol extension for
abstraction, a simple mutable deftype, some concurrency tricks for
making and awaiting large thread pools, gambler's ruin, AND the
differing thread-safety and performance characteristics of volatile
mutables, atoms, and refs.

The mutables are created with integer values and the transactions
consist of "withdrawing" a random amount from one "account" and
"depositing" it in the other, with overdrawing disallowed, a
per-transaction transfer maximum of 10000, and a short delay between
determining the amount and performing the transfer (to make it fairly
likely that another transaction involving one of the same cells
overlaps in time, but not so likely as to make the refs case take
forever).

Here are the results from one run with each type of mutable:

=> (rand-xfers-conc! #(ref 100) 100 100 1000)
init-sum = 10000
final-sum = 10000
final-minimum = 0
nil
=> (rand-xfers-conc! #(atom 100) 100 100 1000)
init-sum = 10000
final-sum = 10000
final-minimum = -330128
nil
=> (rand-xfers-conc! #(Cell. 100) 100 100 1000)
init-sum = 10000
final-sum = 14632
final-minimum = -211252
nil

Each run created 100 "accounts", each starting with a "balance" of
100, and then ran 1000 transactions in each of 100 simultaneous
threads.

With refs, the initial and final sum agree and no account becomes
overdrawn. Transactional integrity is perfect.

With atoms, the initial and final sum still agree -- each transaction
adds an amount to one account and removes the same amount from
another, and atom's swap! guarantees that there won't be any data
races to updates to any individual atom, so none of the separate
withdrawals and deposits can "go missing", and since every withdrawal
is paired with a matching deposit the total is invariant after all
transactions have completed.

However, accounts end up overdrawn, with the worst ending up over
three hundred grand in the hole! What happened here? Atoms only
protect against data races involving concurrent updates of the same
atom. Nothing stops rand-xfer! from seeing an account with $130 as
able to lose up to $130 and deciding to withdraw $70 from it, then
another thread withdrawing $90 from that same account to leave $40,
and then rand-xfer! yanking out its $70 to leave the account overdrawn
by thirty bucks, for instance.
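
This is a classic check-then-act race: the balance check and the later
swap! are two separate steps with nothing tying them together. A
minimal sketch of the same failure in isolation (names made up for
illustration, not part of the program below):

(def acct (atom 130))

(defn careless-withdraw! [amt]
  (when (<= amt @acct)   ; check: looks affordable right now
    (Thread/sleep 5)     ; window for another thread to act
    (swap! acct - amt))) ; act: atomic, but based on a stale check

;; (let [a (future (careless-withdraw! 70))
;;       b (future (careless-withdraw! 90))]
;;   @a @b
;;   @acct)              ; often negative, e.g. -30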

With Cells, the results are worse by far: the final sum shows almost
five thousand bucks created out of thin air, as *well* as accounts
overdrawn by as much as two hundred grand. If banks had this poor
transactional integrity in their computers, we'd have skyrocketing
inflation.

The cause here is the blatant data race between two updates to the
same volatile: x is 40, thread a reads it as 40, thread b reads it as
40, thread a adds 30 to get 70, thread b subtracts 20 to get 20,
thread a writes 70 to x, and thread b writes 20 to x. Thread a's
addition of 30 to x "went missing" as a result of this race condition,
but may still have been withdrawn successfully from some other
account. Withdrawals can go missing too, and in this instance some
must have since the net change in the total balance was positive.

With atoms, the additions of 30 and -20 are done atomically (hence the
name "atom") so the read, addition, and subsequent write cannot get
interleaved with a concurrent swap! in another thread, but Cells offer
no such protection.
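
The lost-update behavior is easy to reproduce on its own. Here's a
minimal sketch (again made-up names, not part of the program below)
that hammers one mutable with concurrent increments via the Xfer
protocol; an atom always comes out exact, a Cell usually short:

(defn hammer! [x]
  ;; 50 threads each add 1 to the same mutable 1000 times.
  (let [threads (doall (repeatedly 50
                         #(Thread. (fn []
                                     (dotimes [_ 1000]
                                       (mutate! x + 1))))))]
    (doseq [^Thread t threads] (.start t))
    (doseq [^Thread t threads] (.join t))
    (look x)))

;; (hammer! (atom 0))  ;=> 50000 every time
;; (hammer! (Cell. 0)) ;=> typically less than 50000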

Performance-wise, the Cells case takes a few seconds to execute on my
hardware, the atoms case maybe a smidgen longer, and the refs case
about half a minute. Refs perform significantly more slowly with the
high level of contention artificially created by putting that 10ms
delay in rand-xfer! because there end up being a lot of transaction
retries -- one for every case where Cells and atoms generate an
overdrawn account, and then some. However, slow and correct beats fast
and wrong any day, so for transactions like these refs are absolutely
the best choice -- nay, the only correct choice (bar explicit locking
-- but see below).

Without the 10ms delay in rand-xfer!, the atoms case usually produces
a nonnegative final minimum balance, but the data race still exists;
the error rate is just too low for overdrafts to occur very often. But
in a real-world system, even data races allowing invariants like
(not-any? neg? balances) to be violated only at the low rate of one
error a day are terrible, evil things.

As an added bonus, gambler's ruin is demonstrated. The refs outcome
almost *always* has the minimum account balance end up zero, even
though the odds are not biased in favor of any account over any other.
The thing is, once you run out of money, that's it -- game over.
Unless the house is using atoms or Cells to keep track of the chips,
in which case you can end up losing your house in the bargain. :)

Modifying rand-xfer! to replace (dosync ...) with (locking x1 (locking
x2 ...)) and rerunning the Cells case will wedge your repl,
demonstrating the evils of relying on fine-grained locking instead of
STM: deadlock. By contrast, transactions cannot even livelock; every
time a transaction fails it's because another transaction succeeded
somewhere else to make it fail, so some progress must have been made.
In an open-ended run such as a live banking system, the subtler risk
of a particular account's transaction stream being starved (always
failing at the expense of other transactions involving the other
accounts involved in transfers to that account) may exist, but if
threads have equal priority it's statistically expected for all of the
threads to make progress at roughly equal rates, and thus for accounts
to be updated in roughly as timely a fashion as one another.
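
For concreteness, the deadlock-prone modification described above
would look roughly like this (a sketch only; the name is mine, and
running it with many threads is exactly how you wedge the repl):

(defn rand-xfer-locking! [x1 x2]
  (locking x1
    (locking x2
      (let [rng (rem (abs (min (look x1) (look x2))) (int 1e4))
            amt (- (rand-int (+ 1 rng rng)) rng)]
        (Thread/sleep 10)
        (mutate! x1 + amt)
        (mutate! x2 - amt)))))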


(defprotocol Xfer
  (look [this])
  (mutate [this f args]))

(defn mutate! [x f & args]
  ;; Variadic front end for the protocol's mutate, mirroring swap!'s shape.
  (mutate x f args))

(extend-type clojure.lang.Ref
  Xfer
  (look [this] @this)
  (mutate [this f args]
    (apply alter this f args)))

(extend-type clojure.lang.Atom
  Xfer
  (look [this] @this)
  (mutate [this f args]
    (apply swap! this f args)))

(deftype Cell [^{:volatile-mutable true} x]
  Xfer
  (look [this] x)
  (mutate [this f args]
    (set! x (apply f x args))))

(defn abs [x]
  (if (neg? x) (- x) x))

(defn rand-xfer! [x1 x2]
  (dosync
    ;; rng bounds the transfer: the smaller of the two balances (run
    ;; through abs in case it's already negative), kept under the 10000
    ;; per-transaction maximum by the rem. amt is then uniform in
    ;; [-rng, rng], so the transfer can go in either direction.
    (let [rng (rem (abs (min (look x1) (look x2))) (int 1e4))
          amt (- (rand-int (+ 1 rng rng)) rng)]
      (Thread/sleep 10) ; widen the window for contention between threads
      (mutate! x1 + amt)
      (mutate! x2 - amt))))

(defn rand-xfers! [xs n]
  (dotimes [_ n]
    (let [x1 (rand-nth xs)
          x2 (rand-nth xs)]
      (rand-xfer! x1 x2))))

(defn rand-xfers-thread [xs n]
  (Thread. #(rand-xfers! xs n)))

(defn sum-xs [xs]
  (apply + (map look xs)))

(defn done? [thread]
  ;; Thread$State is a nested class; fully qualify it so no import is needed.
  (= java.lang.Thread$State/TERMINATED (.getState thread)))

(defn make-xs [create-fn n]
  (take n (repeatedly create-fn)))

(defn rand-xfers-conc!
  [create-fn nxs nthreads nxfers-per-thread]
  (let [xs (make-xs create-fn nxs)
        init-sum (sum-xs xs)
        threads (make-xs
                  #(rand-xfers-thread xs nxfers-per-thread)
                  nthreads)]
    (doseq [thread threads] (.start thread))
    (loop []
      (when-not (every? done? threads)
        (Thread/sleep 10) ; don't hog the CPU here
        (recur)))
    (println "init-sum =" init-sum)
    (println "final-sum =" (sum-xs xs))
    (println "final-minimum =" (apply min (map look xs)))))



A note on locking:

Reducing deadlocks due to locking is possible with a clever little trick.

(let [switch? (<
                (System/identityHashCode x1)
                (System/identityHashCode x2))
      [x1 x2] (if switch? [x2 x1] [x1 x2])]
  (locking x1
    (locking x2
      ...)))

This locks any given two resources in descending order of identity
hash code. The identity hash code is not guaranteed to be unique per
live object but collisions seem to be exceedingly rare on most JVMs.
The risk of deadlock is greatly reduced because locks on the same two
objects are almost always acquired in the same order.

A somewhat safe, general multi-object locking macro can be built on this base:

(defn- locking-wrap [n obs body]
  (if (zero? n)
    body
    [`(locking (nth ~obs ~(dec n))
       ~@(locking-wrap (dec n) obs body))]))

(defmacro locking-all [obs & body]
  (let [ooo (gensym)]
    `(let [~ooo (sort-by
                    (fn [ob#] (System/identityHashCode ob#))
                    ~obs)]
       ~@(locking-wrap (count obs) ooo body))))
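
For reference, (locking-all [xxx yyy zzz] ...) expands to roughly the
following (gensym and namespace qualification elided):

(let [sorted (sort-by (fn [ob] (System/identityHashCode ob))
                      [xxx yyy zzz])]
  (locking (nth sorted 2)
    (locking (nth sorted 1)
      (locking (nth sorted 0)
        ...))))

so the locks are acquired in descending identity-hash order, just like
the two-object version of the trick.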

Testing:

=> (def xxx 42)
#'user/xxx
=> (def yyy 192)
#'user/yyy
=> (def zzz 1024)
#'user/zzz
=> (locking-all [xxx yyy zzz]
     (doall (map #(Thread/holdsLock %) [xxx yyy zzz])))
(true true true) ; It does acquire all three locks.
=> (do
     (doto
       (Thread. #(locking-all [xxx yyy zzz]
                   (Thread/sleep 5000)))
       (.start))
     (Thread/sleep 100)
     (locking-all [yyy zzz xxx] (println "boo!")))
; 5 second pause
boo! ; Look, ma! No deadlock! Even with the items rearranged!
nil

The wild card here is that locking-all has a small but positive chance
to cause deadlock, particularly on 64-bit JVMs. It would require an
identity hash collision between two resources that were being locked
by different threads at the same time, and with the resources given in
opposite orders to locking-all (assuming that sort-by performs a
stable sort). The odds are literally trillions to one against,
assuming a hash space of size 2^32 (the maximum possible given that
System.identityHashCode() returns int) and fairly frequent locking of
fairly many resources, but it would happen eventually. The real hash
space is potentially smaller: if the hash is just the least
significant 32 bits of the object's address, and objects are aligned
to 8-byte boundaries on a 64-bit JVM, the low three bits are always
zero, leaving only 2^29 distinct values. Another factor that might allow
rare collisions even on 32-bit JVMs is that the object may eventually
get relocated by the GC if it's long-lived enough. The identity hash
code has to be constant, so if it's the object's address, it's the
object's address when *first created*. Subsequently the object may be
moved by the GC, and then a new object could be created at the same
initial address, yielding a hash collision. The odds of these
particular two objects ever being locked together at all, let alone
simultaneously by different threads in opposite orders, are
microscopic, but add up enough object pairs over a long enough JVM
session lifespan and eventually kaboom!

So locking-all, even in the best case, can only (drastically!) reduce
the frequency of deadlocks; it cannot eliminate deadlocks entirely,
even on 32-bit JVMs. It also suffers another problem, namely it
doesn't cope with the situation where the nested critical sections are
not all in the same function. Suppose foo and bar each lock an x,
while baz and quux each lock a y; bar takes an x and a y, locks the x,
and calls quux on the y, while baz takes an x and a y, locks the y,
and calls foo on the x. If one thread calls bar with a particular x
and y at the same time as another thread calls baz with the same x and
the same y, you can have deadlock. Rearranging things so that xs are
always locked before (or always locked after) ys fixes this, but may
require putting the program logic through contortions away from the
natural expression of the domain problem. And it only gets worse as
more locks are held simultaneously (say zs enter the mix), while
locking-all won't help you one tiny bit here.
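
To make that scenario concrete, here's a minimal sketch (foo, bar,
baz, and quux are the hypothetical functions from the paragraph above;
the sleeps just widen the window):

(defn foo [x]
  (locking x
    :worked-on-x))

(defn quux [y]
  (locking y
    :worked-on-y))

(defn bar [x y]
  (locking x          ; bar grabs x's lock first ...
    (Thread/sleep 50) ; ... lingers a bit ...
    (quux y)))        ; ... then tries for y's

(defn baz [x y]
  (locking y          ; baz grabs y's lock first ...
    (Thread/sleep 50) ; ... lingers a bit ...
    (foo x)))         ; ... then tries for x's

;; Run bar and baz concurrently on the same x and y and each thread can
;; end up holding one lock while waiting forever for the other:
;; (let [x (Object.) y (Object.)]
;;   (future (bar x y))
;;   (future (baz x y)))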

Transactions are therefore preferable to the locking-all macro given
above if you don't want to have to reboot the heavily-used server (or
whatever) every few days, or have customers complaining that their
fancy new graphics app hangs now and again, only roughly as often as
Friday the 13ths roll around but sometimes when they have lots of
unsaved work.

Another reason to prefer transactions, of course, is that they support
commute and a form of read/write locking that permits many readers
(even in transactions, when not ensured). These may make transactions
actually more performant in some cases, as well as absolutely
deadlock-proof in every case.
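
For example (a sketch, not part of the program above): a running
transfer count kept in a ref can be bumped with commute, which never
forces a retry on its own because increments commute with one another,
while plain readers can simply deref the ref outside any transaction
(or ensure it inside one if they need it held steady):

(def xfer-count (ref 0))

(defn counted-rand-xfer! [x1 x2]
  (dosync
    (commute xfer-count inc) ; commutative update: no conflict, no retry
    (rand-xfer! x1 x2)))     ; the nested dosync joins this transaction

;; @xfer-count                        ; readers need no transaction at all
;; (dosync (ensure xfer-count) ...)   ; or pin it for a whole transaction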

-- 
Protege: What is this seething mass of parentheses?!
Master: Your father's Lisp REPL. This is the language of a true
hacker. Not as clumsy or random as C++; a language for a more
civilized age.
