Re: STM and useful concurrency

Howard Lewis Ship Tue, 24 Mar 2009 15:40:31 -0700

A relevant question is: what is the relative cost of locking and
blocking (in the pure Java approach) vs. the cost of retrying (in the
Clojure/STM approach)?


I don't want to go out on a limb, having not looked at the Clojure STM
implementation. However, I would bet that the costs are roughly equal.
Even if Clojure were 50% slower, or 100% slower, the knowledge that you
can spin up a large number of threads and not worry about deadlocks is
ultimately more valuable.
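
To make the comparison concrete, here is a minimal Java sketch. It is not
Clojure's STM: a compare-and-set retry loop stands in for "redo the work when
the commit fails", and a synchronized block stands in for the blocking
approach. The class and method names (RetryVsLockDemo, runOptimistic,
runLocked) are hypothetical and exist only for this illustration. The retry
counter is the kind of number you would want when deciding whether discarded
work is actually a problem.

```java
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical demo class; an analogy for retry vs. lock, not Clojure's STM.
final class RetryVsLockDemo {

    // Outcome of one run: final counter value and number of discarded attempts.
    static final class Result {
        final long total;
        final long retries;
        Result(long total, long retries) { this.total = total; this.retries = retries; }
    }

    // Optimistic style: read a snapshot, compute the new value, try to commit.
    // A failed compareAndSet means another thread committed first, so the
    // attempt is thrown away and repeated.
    static Result runOptimistic(int threads, int incrementsPerThread) {
        AtomicLong value = new AtomicLong();
        AtomicLong retries = new AtomicLong();
        Runnable work = () -> {
            for (int i = 0; i < incrementsPerThread; i++) {
                while (true) {
                    long snapshot = value.get();
                    if (value.compareAndSet(snapshot, snapshot + 1)) {
                        break;
                    }
                    retries.incrementAndGet(); // wasted attempt
                }
            }
        };
        runAll(threads, work);
        return new Result(value.get(), retries.get());
    }

    // Blocking style: threads wait for the lock, so no work is ever discarded.
    static Result runLocked(int threads, int incrementsPerThread) {
        long[] total = {0};
        Object lock = new Object();
        Runnable work = () -> {
            for (int i = 0; i < incrementsPerThread; i++) {
                synchronized (lock) {
                    total[0]++;
                }
            }
        };
        runAll(threads, work);
        return new Result(total[0], 0);
    }

    // Starts the given number of threads on the same task and waits for all of them.
    static void runAll(int threads, Runnable work) {
        Thread[] pool = new Thread[threads];
        for (int i = 0; i < threads; i++) {
            pool[i] = new Thread(work);
            pool[i].start();
        }
        try {
            for (Thread t : pool) {
                t.join();
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting for workers", e);
        }
    }

    public static void main(String[] args) {
        Result optimistic = runOptimistic(4, 100_000);
        Result locked = runLocked(4, 100_000);
        System.out.println("optimistic total=" + optimistic.total
                + " retries=" + optimistic.retries);
        System.out.println("locked     total=" + locked.total);
    }
}
```

Both versions end with the same total. The retry count varies from run to run
and from machine to machine, so timing both under a realistic level of
contention is the only way to answer the cost question for a given workload.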

On Mon, Mar 23, 2009 at 12:36 PM, Mark Volkmann
<r.mark.volkm...@gmail.com> wrote:
>
> On Mon, Mar 23, 2009 at 11:19 AM, Cosmin Stejerean <cstejer...@gmail.com> 
> wrote:
>>
>> On Sun, Mar 22, 2009 at 9:12 PM, Mark Volkmann <r.mark.volkm...@gmail.com>
>> wrote:
>>>
>>> I'm trying to understand the degree to which Clojure's STM provides
>>> more concurrency than Java's blocking approach. I know it's difficult
>>> to make generalizations and that specific applications need to be
>>> measured, but I'll give it a go anyway.
>>>
>>> Clearly using STM (dosync with Refs) makes code easier to write than
>>> using Java synchronization because you don't have to determine up
>>> front which objects need to be locked. In the Clojure approach,
>>> nothing is locked. Changes in the transaction happen to in-transaction
>>> values and there is only a small amount of blocking that occurs at the
>>> end of the transaction when changes are being committed. Score one for
>>> Clojure!
>>>
>>> What concerns me though is how often the work done in two transactions
>>> running in separate threads turns out to be useful work. It seems that
>>> it will typically be the case that when two concurrent transactions
>>> access the same Refs, one of them will commit and the other will
>>> retry. The retry will discard the in-transaction changes that were
>>> made to the Refs, essentially rendering the work it did of no value.
>>> So there was increased concurrency, but not useful concurrency.
>>
>> In the case where two transactions need to modify the same Ref they
>> definitely need to be serialized, either by explicitly using locks in Java, or by
>> letting Clojure automatically retry one of them. In either case about the
>> same thing happens. Transaction A starts and finishes, then Transaction B
>> starts and finishes.
>
> I don't think the same thing happens. In the case of Clojure, both
> transaction A and B start. Suppose A finishes first and commits. Then
> transaction B retries, finishes and commits. That's what I was
> referring to as non-useful work. I'm not saying it's the wrong
> approach, but it is different.
>
>>> Of course there is a chance that the transaction contains some
>>> conditional logic that makes it so the Refs to be accessed aren't
>>> always the same, but my speculation is that that's a rare
>>> occurrence. It's probably more typical that a transaction always
>>> accesses the same set of Refs every time it executes.
>>
>> Which Refs your transactions modify will depend heavily on the
>> specific application you are working on. For example, I can imagine that in an
>> application dealing with bank accounts and transferring money between them,
>> the probability of two transactions concurrently hitting the same account is
>> pretty low. In other applications where a lot of transactions modify the
>> same global state the chances of conflicts are much higher.
>
> Agreed.
>
>>> This makes it seem that Java's locking approach isn't so bad. Well,
>>> it's bad that I have to identify the objects to lock, but it's good
>>> that it doesn't waste cycles doing work that will just be thrown away.
>>
>> There's a reason concurrent programming is notoriously hard in most
>> languages: it takes a lot of effort and skill to get right. Between
>> having to correctly identify which objects need to be locked and trying to
>> avoid deadlocks, dealing with explicit locks can be pretty messy and
>> dangerous. That doesn't mean Java's approach is bad; after all, the internals
>> of Clojure are implemented using Java locks. But explicit management of
>> locks is often too low level and unnecessarily complex, and Clojure provides
>> a higher level way of dealing with concurrency that makes it easier and
>> safer to work with most of the time.
>
> I agree that Clojure makes the programming much easier, but is there a
> downside? Maybe the downside is performance. If I found out that a
> particular transaction was commonly being retried many times, is that
> a sign that I need to write the code differently? How would I find out
> that was happening? I know I could insert my own code to track that,
> but it seems like this may be a commonly needed tool for Clojure to
> detect excessive conflicts/retries in transactions. Maybe we could set
> a special variable like *track-retries* that would cause Clojure to
> produce a text file that describes all the transaction retries that
> occurred in a particular run of an application. If such a tool isn't
> needed or wouldn't be useful, I'd like to understand why.
>
> --
> R. Mark Volkmann
> Object Computing, Inc.
>


-- 
Howard M. Lewis Ship

Creator Apache Tapestry and Apache HiveMind

