On Fri, 14 May 2010 17:35:20 +0100, B. Estrade - estr...@gmail.com <+nntp+browseruk+c4c81fb0fa.estrabd#gmail....@spamgourmet.com> wrote:

> The future is indeed multicore - or, rather, *many*-core. What this
> means is that however the hardware jockeys have to strap them together
> on a single node, we'll be looking at the ability to invoke hundreds
> (or thousands) of threads on a single SMP machine.

There are very few algorithms that actually benefit from using even low hundreds of threads, let alone thousands. The ability of Erlang (and Go, and Io, and many others) to spawn 100,000 threads makes an impressive demo for the uninitiated, but finding practical uses for that ability is very hard.
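
To see why the demo is so easy to stage, here is a minimal Go sketch (my own illustration, not taken from any of the systems named above) that spawns 100,000 goroutines doing trivial work. Spawning is the cheap part; finding 100,000 pieces of useful, independent work is the hard part:

    // spawn.go - a minimal sketch of the "100,000 threads" demo.
    package main

    import (
        "fmt"
        "sync"
    )

    func main() {
        const n = 100000
        var wg sync.WaitGroup
        results := make(chan int, n)

        for i := 0; i < n; i++ {
            wg.Add(1)
            go func(id int) { // one goroutine per "task"
                defer wg.Done()
                results <- id * id // trivial work: the demo impresses, the payoff doesn't
            }(i)
        }

        wg.Wait()
        close(results)

        sum := 0
        for r := range results {
            sum += r
        }
        fmt.Println("goroutines run:", n, "checksum:", sum)
    }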

One example cited is that of gaming software that runs each sprite in a separate "thread". The claim is that this simplifies the code, because each sprite only has to respond to situations directly applicable to it, rather than some common sprite handler having to select which sprite to operate upon. But all it does is move the goal posts: you either have to select which sprite to send a message to, or send a message to the sprite handler and have it select the sprite to operate upon.

A third technique is to send the message to all the sprites and have them decide whether it is applicable to them. But that still requires a loop, and you then pay the communications overhead × 100,000 plus the context switch costs × 100,000. The numbers do not add up.
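
Here is a rough Go sketch of the point (the Sprite/Event names are mine and purely illustrative): whether the sender picks the target channel directly, or every sprite filters a broadcast, the selection work does not disappear - it just moves:

    // dispatch.go - sketch: direct-addressed send vs broadcast-and-filter.
    package main

    import "fmt"

    type Event struct {
        Target int // -1 means "broadcast to everyone"
        Msg    string
    }

    func sprite(id int, in <-chan Event, done chan<- bool) {
        for ev := range in {
            // Broadcast style: every sprite is woken and must filter for itself.
            if ev.Target == id || ev.Target == -1 {
                _ = fmt.Sprintf("sprite %d handles %q", id, ev.Msg)
            }
        }
        done <- true
    }

    func main() {
        const n = 1000
        chans := make([]chan Event, n)
        done := make(chan bool)
        for i := range chans {
            chans[i] = make(chan Event, 1)
            go sprite(i, chans[i], done)
        }

        // Direct style: the sender selects the sprite - the selection
        // hasn't disappeared, it has just moved to the sender.
        chans[42] <- Event{Target: 42, Msg: "collide"}

        // Broadcast style: one event, n sends, n wake-ups, n filter checks.
        for _, c := range chans {
            c <- Event{Target: -1, Msg: "tick"}
        }

        for _, c := range chans {
            close(c)
        }
        for i := 0; i < n; i++ {
            <-done
        }
    }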

> Then, inevitably,
> someone will want to strap these together into a cluster, thus making
> message passing an attractive way to glue related threads together
> over a network.  Getting back to the availability of many threads on a
> single SMP box, issues of data locality and affinity and thread
> binding will become of critical importance.

Perhaps surprisingly, these are not the issues they once were. Whilst cache misses are horribly expensive, the multi-layered caching in modern CPUs combines with deep pipelines, branch prediction, register renaming and other features in ways that are beyond the ability of the human mind to reason about.

For a whirlwind introduction to the complexities, see the short video here:

http://www.infoq.com/presentations/click-crash-course-modern-hardware

The only way to test the effects is to profile, and most of the research into the effects of cache locality tends to be done in isolation from real-world application mixes. Very few machines, even servers of various types, run a single application these days. This is even truer as server virtualisation becomes ubiquitous. Mix in a soupçon of virtual-server load-balancing, and trying to code for cache locality becomes almost impossible.
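
By way of illustrating "profile, don't guess", here is a small Go micro-benchmark (again my own sketch) that sums the same 128 MiB array with different access strides. Same arithmetic, same element count; only the access pattern - and hence the cache behaviour - differs, and the only way to know what that costs on *your* box is to run it there:

    // cache_bench.go - sketch: measure, rather than reason about, cache behaviour.
    package main

    import (
        "fmt"
        "time"
    )

    func sum(data []int64, stride int) int64 {
        var s int64
        for start := 0; start < stride; start++ {
            for i := start; i < len(data); i += stride {
                s += data[i]
            }
        }
        return s
    }

    func main() {
        data := make([]int64, 1<<24) // 128 MiB, well beyond L3 on most machines
        for i := range data {
            data[i] = int64(i)
        }

        for _, stride := range []int{1, 16, 4096} {
            t0 := time.Now()
            s := sum(data, stride)
            // Identical work per run; only the memory access pattern changes.
            fmt.Printf("stride %5d: %v (checksum %d)\n", stride, time.Since(t0), s)
        }
    }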

> These issues are closely
> related to the operating system's capabilities and paging policies, but
> eventually (hopefully) current, provably beneficial strategies will be
> available on most platforms.
>
> Brett
