On Wed, Apr 4, 2012 at 9:00 AM, Miles Fidelman
<mfidel...@meetinghouse.net> wrote:

> The whole point of architecture is to generate the overall outline of a
> system, to address a particular problem space within the constraints at
> hand. The KISS principle applies (along with "seek simplicity and distrust
> it"). If there isn't a degree of simplicity and elegance in an
> architecture, the architect hasn't done a particularly good job.
>

I agree.


>
> In the past, limitations of hardware, languages, and run-time environments
> have militated against taking parallel (or more accurately, concurrent)
> approaches to problems, even when massive concurrency is the best mapping
> onto the problem domain - resulting in very ugly code.
>

I'd say the bigger problem is that no very good concurrency model has ever
reached the mainstream. The choices have been threads and locks, processes,
and maybe a transactional database.

Outside of mainstream, there are a lot more options. Lightweight time warp.
Synchronous reactive. Temporal logic. Event calculus. Concurrent
constraint. Temporal concurrent constraint. Functional reactive
programming.

Not all models of concurrency are well suited for parallelism, but many can
be made so with just a few tweaks (e.g. multi-clock synchronous reactive). I
understand parallelism to be an implementation detail, and concurrency to
be a domain or semantic detail.


>
> Yes, there are additional problems introduced by modeling a problem as
> massively concurrent


Well, not inherently. I'd note that your example of 5 loops with 2000 tanks
at 20Hz is essentially an implementation of a step-clocked concurrency
model. It just happens to be represented in a sequential programming
language, so you get a bunch of semantic noise (i.e. as an outside observer
you know that all 2000 computations of line-of-sight are independent of one
another, but that isn't obvious in the language).
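To make that observation concrete, here is a minimal sketch (the names and
geometry are hypothetical, not taken from the simulator under discussion) of
the step-clocked structure: every line-of-sight check reads the same frozen
snapshot and writes nothing shared, so the 2000 computations are an
embarrassingly parallel map.

```python
# Hypothetical positions: 2000 tanks on a plane (illustrative only).
tanks = [(float(i), float(i % 50)) for i in range(2000)]

def line_of_sight(i, snapshot):
    """Indices of tanks that tank i can 'see' (within range 10),
    reading only the frozen snapshot from the previous step."""
    xi, yi = snapshot[i]
    return [j for j, (xj, yj) in enumerate(snapshot)
            if j != i and (xj - xi) ** 2 + (yj - yi) ** 2 < 100.0]

def step(snapshot):
    # Every call reads the same snapshot and writes nothing shared,
    # so this is a pure parallel map -- it could be handed to a
    # process pool with no change in meaning.
    return [line_of_sight(i, snapshot) for i in range(len(snapshot))]

visibility = step(tanks)  # one 20Hz tick's worth of checks
```

In a sequential language the independence is invisible; in this form it is a
property of the `step` function itself.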

But your particular choice for massive concurrency - asynchronous processes
or actors - does introduce many additional problems.


> - control latency, support replay, testing, maintenance, verification:
> these are nothing new at the systems level (think about either all the
> different things that run on a common server, or about all the things that
> go on in a distributed system such as the federated collection of SMTP
> servers that we're relying on right now)
>

Yes, these are old problems at the systems level. Mostly unsolved. I've
seen e-mails take 5-6 days to get through. We have almost no ability to
reason about system invariants. We need constant human administration to
keep our systems up and running. I'm quite familiar with it.


>
> - consistency: is not your message "Avoid the Concurrency Trap by
> Embracing Non-Determinism?" -- is not a key question: what does it mean to
> "embrace non-determinism" and how to design systems in an inherently
> indeterminate environment? (more below)
>

Uh, no. But I see below you confused me with another David. Perhaps see my
Mar 28 comment in this thread. I reject Ungar's position.


>
>> The old sequential model, or even the pipeline technique I suggest, does
>> not contradict the known, working structure for consistency.
>>
>
> But is consistency the issue at hand?
>

Yes. Of course, it wasn't an issue until you discarded it in pursuit of
your `simple` concurrency model.


>
> This line of conversation goes back to a comment that the limits to
> exploiting parallelism come down to people thinking sequentially, and
> inherent complexity of designing parallel algorithms. I argue that quite a
> few problems are more easily viewed through the lens of concurrency - using
> network protocols and military simulation as examples that I'm personally
> familiar with.
>

I have not argued that people think sequentially or that parallel
algorithms are inherently complex. I agree that many problems are well
viewed through the lens of concurrency.

But your proposed approach to concurrency is not easier, not once you
account for important problems solved by the original simulator that you
chose to ignore.


>
> You seem to be making the case for sequential techniques that maintain
> consistency.


Pipelines are a fine sequential technique, of course, and I think we should
use them often. But more generally I'd say what we need is effective
support for synchronous concurrent behavior - i.e. to model two or more
things happening at the same time.


> "If we cannot skirt Amdahl’s Law, the last 900 cores will do us no good
> whatsoever. What does this mean? We cannot afford even tiny amounts of
> serialization."
>
> "Avoid the Concurrency Trap by Embracing Non-Determinism?" (actually not
> from the post, but from the Project Renaissance home page)
>
> In this, I think we're in violent agreement - the key to taking advantage
> of parallelism is to "embrace non-determinism."
>

I disagree with Ungar.

I don't disagree with Amdahl, but Parkinson's law serves as a partial
counter.


>
> In this context, I've been enjoying Carl Hewitt's recent writings about
> indeterminacy in computing. If I might paraphrase a bit, isn't the point
> that 'complex computing systems are inherently and always indeterminate,
> let's just accept this, not try to force consistency where it can't be
> forced, and get on with finding ways to solve problems in ways that work in
> an indeterminate environment.'
>

I also disagree with Hewitt (most recently at
http://lambda-the-ultimate.org/node/4453). Ungar and Hewitt both argue "we
need indeterminism, so let's embrace it". But they forget that every
medicine is a poison if overdosed or misapplied, and it doesn't take much
indeterminism to become poisonous.

To accept and tolerate indeterminism where necessary does not mean to
embrace it. It should be controlled, applied carefully and explicitly.


>
>  a) you selectively conceptualize only part of the system - an idealized
>> happy path. It is much more difficult to conceptualize your whole system -
>> i.e. all those sad paths you created but ignored. Many simulators have
>> collision detection, soft real-time latency constraints, and consistency
>> requirements. It is not easy to conceptualize how your system achieves
>> these.
>>
>
> In this one, I write primarily from personal experience and observation.
> There is a huge class of systems that are inherently concurrent, and
> inherently not serializable - pretty much any distributed system; email
> and transaction processing come to mind. I happen to think
> that simulators fall into this class - and in this regard there's an
> existence proof:
>

In general, the asynchronous semantics you get with processes and actors
are also a poor map to simulation problems. In my experience, the best
approaches to simulation involve synchronous programming of some sort -
event calculus, step clock, synchronous reactive, temporal logic,
functional reactive, reactive demand programming, etc.

The basic reason for this is that you (a) want to model lots of things
happening at once (not `asynchronously` but truly `at the same time`), (b)
don't want any participant to gain a special advantage from ordering within
a turn, (c) want consistency, freedom from glitches and anomalies, and the
ability to debug and regression-test your model, and (d) want precise
real-time reactions - i.e. as opposed to delaying messages indefinitely.
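A toy sketch of points (b) and (c), under assumptions of my own (the names
and the update rule are hypothetical): double-buffer the world so that every
entity reads the frozen previous step and writes the next. This removes any
turn-ordering advantage and makes each step deterministic and replayable.

```python
def synchronous_step(state, update):
    """Advance every entity one logical tick.

    All updates read the frozen previous state and write into a
    fresh next state, so the result is independent of iteration
    order -- no entity ever sees another's mid-step write.
    """
    frozen = dict(state)  # immutable-by-convention snapshot
    return {name: update(name, frozen) for name in frozen}

# Hypothetical toy rule: each counter adds the other's value.
def update(name, world):
    other = "b" if name == "a" else "a"
    return world[name] + world[other]

world = {"a": 1, "b": 2}
world = synchronous_step(world, update)
# Deterministic: both entities read the old values, so a -> 3, b -> 3.
# With in-place (asynchronous) updates, b would see a's new value and
# the answer would depend on scheduling order.
```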

It seems you do not have much experience developing or working with
simulators built on these stepped models.

Unnecessarily applying asynchronous, indeterministic concurrency semantics
to simulations is just as `ugly` as applying unnecessary sequential
semantics, when what we really want is a middle ground: synchronous
concurrency semantics.


>  b) parallelism is not concurrency; it does not suggest actor-like
>> approaches. Pipeline and data parallelism are well proven alternatives used
>> in real practice. There are many others, which I have mentioned before.
>>
>
> Fair point. If we limit ourselves to a discussion of pipelines and data
> parallelism, I'll concede that they do not necessarily lead to cleaner
> conceptual mappings between problems and systems architectures.


Indeed not. Fortunately, pipelines and data-parallelism work very nicely
with synchronous concurrency models, which can lead to a cleaner mapping
between problem and semantics.
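A minimal sketch of that combination (the stage functions are hypothetical
placeholders): pure stages chained as a pipeline, each stage a data-parallel
map, the whole thing one deterministic synchronous step.

```python
# Hypothetical stage functions -- placeholders for the sense/decide/act
# phases of a simulation step.
def sense(world):
    return [x + 1 for x in world]       # stage 1: a data-parallel map

def decide(readings):
    return [r * 2 for r in readings]    # stage 2: another pure map

def act(decisions):
    return [d - 1 for d in decisions]   # stage 3

def step(world):
    # Stages compose as a pipeline; each map can be parallelized
    # over the data, and the whole step stays deterministic.
    return act(decide(sense(world)))

out = step([0, 1, 2])  # -> [1, 3, 5]
```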



> In the case of email, I can't even begin to think about applying
> synchronous parallelism to messages flowing through a federation of mail
> servers.
>

That's a rather facetious case, of course. E-mail is defined by an
asynchronous protocol. But I can easily model asynchronous communication in
a synchronous system by use of intermediate shared state (a database, for
example).
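A small sketch of that idea (the mailbox shapes are my own assumption, not a
model of SMTP): sender and receiver never interact directly; messages land
in shared state and are drained on a later synchronous tick, which makes
delivery latency explicit.

```python
from collections import deque

# Hypothetical mailboxes: shared state stands between sender and
# receiver, so asynchronous delivery is modeled inside an otherwise
# deterministic step loop.
mailboxes = {"alice": deque(), "bob": deque()}

def send(to, msg):
    mailboxes[to].append(msg)  # enqueue; delivery happens on a later tick

def step():
    """One synchronous tick: every agent drains what arrived earlier.
    Delivery latency is explicit -- at least one full step."""
    delivered = {name: list(box) for name, box in mailboxes.items()}
    for box in mailboxes.values():
        box.clear()
    return delivered

send("bob", "hello")
inbox = step()  # bob receives 'hello' on the tick after it was sent
```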


> On the other hand, if we look at the larger question of "skirting Amdahl's
> law" in an environment with lots of processing cores - certainly within
> some definitions of "parallelism" - then actor-like massive concurrency
> approaches are certainly in bounds, and the availability of more cores
> certainly allows for running more actors without running into resource
> conflicts.


It doesn't actually help us skirt Amdahl's law. If the tank-actors share
any references (e.g. to actors representing their environment) then you
will still serialize that portion of the problem.
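The arithmetic behind the quoted worry about the "last 900 cores" is easy to
check with Amdahl's standard formula (this calculation is mine, not from the
original post):

```python
def amdahl_speedup(serial_fraction, cores):
    """Amdahl's law: speedup = 1 / (s + (1 - s) / n)."""
    return 1.0 / (serial_fraction + (1.0 - serial_fraction) / cores)

# Even 1% serialization caps 1000 cores at roughly 91x, not 1000x --
# which is exactly why shared references among tank-actors matter.
cap = amdahl_speedup(0.01, 1000)
```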

Saying we can find work for the cores is just Parkinson's law. And it
applies regardless of the method used, so it is something of a cop-out.

Regards,

David BARBOUR


-- 
bringing s-words to a pen fight
_______________________________________________
fonc mailing list
fonc@vpri.org
http://vpri.org/mailman/listinfo/fonc
