On Oct 25, 2006, at 11:51 AM, Bryan Sant wrote:

I always just go with threads.  But then again, I do a lot of desktop
software, where interaction between components is frequent and shared
memory is more efficient, reliable, and convenient than message
passing via pipes or some other IPC mechanism.  I'm not saying that
Levi's points aren't valid, on the contrary, they are.  Memory space
protection provided by a process is valuable...  Valuable if you're
using C, or some other language that can stomp on or leak memory.  If
you're using a language with memory management (Perl, C#, Java, Lisp),
then the protection provided by processes has little value and some
down sides.

You're conflating two different problems here. First, there is the problem of memory safety. C and C++ make it fairly easy to write into memory that you should never touch; memory-safe languages don't let you do it at all. Memory safety requires that all primitive memory allocation, and especially deallocation, be managed by the language runtime, typically via a garbage collector.
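As a small illustration of that distinction, here is a sketch in Python (a memory-managed language in the same family as the Perl/C#/Java/Lisp examples above): an out-of-bounds write is rejected by the runtime instead of silently corrupting whatever happens to live next to the buffer, as it could in C.

```python
# A memory-safe runtime bounds-checks every access; where C might
# silently stomp on adjacent memory, Python refuses the write outright.
buf = [0] * 4

try:
    buf[10] = 0xFF  # out-of-bounds store
    outcome = "corrupted"
except IndexError:
    outcome = "rejected"  # the runtime caught it; buf is untouched
```

The point is that this protection comes from the language, not from process boundaries.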

The second problem is the nondeterministic interleaving of execution that exists in the shared-memory concurrency model. Every heap variable is, by default, shared by all threads. Since the scheduler can switch between threads at arbitrary times, a program that uses heap variables naively will almost certainly behave unpredictably and not do what you want. Enter locks. They allow you to re-serialize the execution of your program in certain areas, so that only one thread at a time can execute a given critical section. This solves one problem, but creates a few more.
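Here is a minimal Python sketch of the interleaving hazard and the lock that cures it. A real scheduler produces the unlucky thread switch only occasionally; the Barrier below forces it deterministically, so the lost update is guaranteed to happen.

```python
import threading

counter = 0
barrier = threading.Barrier(2)

def unsafe_increment():
    global counter
    value = counter      # read the shared variable...
    barrier.wait()       # ...force a thread switch right here...
    counter = value + 1  # ...and write back a now-stale value

threads = [threading.Thread(target=unsafe_increment) for _ in range(2)]
for t in threads: t.start()
for t in threads: t.join()
# Both threads read 0, so one update is lost: counter is 1, not 2.

lock = threading.Lock()
counter2 = 0

def safe_increment():
    global counter2
    with lock:           # the whole read-modify-write is now one critical section
        counter2 += 1

threads = [threading.Thread(target=safe_increment) for _ in range(2)]
for t in threads: t.start()
for t in threads: t.join()
# counter2 is 2, as intended.
```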

First of all, you must remember to put locks in all the right places. Some higher level languages help out quite a bit with this, but if you're doing raw pthreads in C, it's pretty easy to screw up and create a race condition, where nondeterminism creeps into your program again. And in any language higher-level than assembly, it's entirely possible that an operation that looks atomic on the surface (i.e., can't be broken down any further in that language) actually consists of many machine operations, so the scheduler could switch to a different thread /in the middle/ of that operation. Doing shared-memory concurrency safely in a high-level language requires a lot of information about the implementation of that language, which kind of defeats the purpose.
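You can make the "looks atomic, isn't atomic" point concrete with CPython's bytecode inspector: the single source line `counter += 1` expands into several interpreter instructions (separate load, add, and store steps), and a thread switch can land between any two of them. (The exact opcode names vary by Python version; this sketch only relies on there being more than one step.)

```python
import dis

def bump(counter):
    counter += 1  # one line of source...
    return counter

# ...but several interpreter instructions: load, add, store, etc.
ops = [ins.opname for ins in dis.get_instructions(bump)]
n_steps = len(ops)
```

Seeing this requires knowing how the CPython interpreter compiles the statement, which is exactly the kind of implementation knowledge the paragraph above is complaining about.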

Second, you are hampered in your ability to create new abstractions. When multiple shared resources are involved, you must be careful to obtain and release the locks in the correct order. This is a pain, it creates concerns that cross abstraction barriers, and is generally an impediment to good software design practices.
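The classic workaround is a global lock-ordering discipline, sketched here in Python with a hypothetical bank-transfer example: every code path must agree to acquire account locks in a fixed order (here, by id), which is precisely the kind of rule that leaks across abstraction boundaries, since `transfer` must know how every other piece of code orders its locks.

```python
import threading

class Account:
    _next_id = 0
    def __init__(self, balance):
        self.id = Account._next_id
        Account._next_id += 1
        self.balance = balance
        self.lock = threading.Lock()

def transfer(src, dst, amount):
    # Always lock the lower-id account first, so two opposing transfers
    # can never deadlock by grabbing the same pair of locks in opposite orders.
    first, second = (src, dst) if src.id < dst.id else (dst, src)
    with first.lock, second.lock:
        src.balance -= amount
        dst.balance += amount

a, b = Account(100), Account(100)

def run(src, dst):
    for _ in range(1000):
        transfer(src, dst, 1)

t1 = threading.Thread(target=run, args=(a, b))
t2 = threading.Thread(target=run, args=(b, a))
t1.start(); t2.start(); t1.join(); t2.join()
total = a.balance + b.balance  # money is conserved and nothing deadlocked
```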

Finally, locks can create performance issues. The purpose of a lock is to serialize part of your program, and if there are too many of them, your parallelism drops through the floor and you end up with an effectively serial program. In the worst case, you can deadlock and bring the program to a halt. Getting good performance out of locks while eliminating 100% of race conditions and deadlocks is very hard. As the amount of concurrency goes up, the performance penalty of locks and the chance of hitting a lurking race condition go up, too.
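The deadlock case is easy to stage deliberately. In this Python sketch, two threads each grab one lock and then wait for the other's; the barriers force the fatal interleaving, and the only reason the program escapes is the acquire timeout, which both threads hit.

```python
import threading

lock_a, lock_b = threading.Lock(), threading.Lock()
barrier = threading.Barrier(2)
results = {}

def worker(name, first, second):
    with first:
        barrier.wait()  # guarantee that each thread already holds one lock
        # Each thread now wants the lock the other one is holding: a
        # textbook deadlock, escaped here only because we use a timeout.
        results[name] = second.acquire(timeout=0.5)
        barrier.wait()  # neither thread releases until both have given up
        if results[name]:
            second.release()

t1 = threading.Thread(target=worker, args=("t1", lock_a, lock_b))
t2 = threading.Thread(target=worker, args=("t2", lock_b, lock_a))
t1.start(); t2.start(); t1.join(); t2.join()
# Both acquires time out: without the timeout, the program would hang forever.
```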

So, I hope that made the distinction between the problems caused by lack of memory safety and the problems caused by shared-state concurrency clear. Regardless of the problems, both are still sometimes the right solution. They just shouldn't be the DEFAULT solution for a programmer who wants to write correct code, in general. Some particular high-level languages and programming environments make every other concurrency paradigm at least as difficult as shared-state threading; programmers in such environments are simply screwed, and should demand better tools.

You can achieve a much more natural programming model by
using threads and semaphores, than processes and marshaled messages.

What feels natural to do is largely defined by the language you are using, so that is only true for a subset of languages. I would argue that languages that make shared-state concurrency the most natural way to approach a problem ought to be redesigned so that shared-state concurrency is well-supported when necessary, but the alternatives feel just as natural, if not more so.

You have also left out one important option from your list, though: threads that by default share nothing, but can explicitly ask for regions of memory to be shared. Combine that with software transactional memory (aka optimistic or lock-free concurrency), message passing, and deterministic concurrency wherever they are appropriate, and you can use the tool that suits your problem and eliminate the possibility of large classes of programming errors, just as memory protection eliminates another large class of programming errors.
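The share-nothing, message-passing style can be sketched even in thread-happy Python using the standard library's thread-safe queues: one thread exclusively owns the mutable state, and every other thread talks to it only by sending messages, so no locks ever appear in user code. (This is a hand-rolled illustration of the style, not STM or a real actor library.)

```python
import queue
import threading

# The counter is owned by exactly one thread; clients never touch it
# directly, so there is nothing to lock and no interleaving to worry about.
inbox = queue.Queue()
replies = queue.Queue()

def counter_actor():
    count = 0
    while True:
        msg = inbox.get()
        if msg == "stop":
            replies.put(count)  # report the final tally and exit
            return
        count += msg            # only this thread ever mutates count

actor = threading.Thread(target=counter_actor)
actor.start()

for _ in range(100):
    inbox.put(1)   # clients send increments as messages, not writes
inbox.put("stop")
actor.join()
final = replies.get()
```

Erlang builds its whole concurrency model on this idea; the point here is just that the paradigm is available even where shared memory is the default.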

                --Levi

/*
PLUG: http://plug.org, #utah on irc.freenode.net
Unsubscribe: http://plug.org/mailman/options/plug
Don't fear the penguin.
*/
