Hi Andrew,

On Wed, Aug 17, 2011 at 1:50 PM, Andrew P. Black <[email protected]> wrote:

> I was looking at some old slides of Joe Armstrong on concurrency-oriented
> programming. He set the following challenge:
>
>   Put N processes in a ring:
>     Send a simple message round the ring M times.
>     Increase N until the system crashes.
>     How long did it take to start the ring?
>     How long did it take to send a message?
>     When did it crash?
>
> He gave graphs comparing Erlang, Java and C#. I decided to compare Pharo.
> Here is what I got; the creation times are PER PROCESS and the messaging
> times are PER MESSAGE.
>
> first run
>
>   procs   creation/µs   msg/µs
>     200     0             7.0
>     500     0             9.7
>    1000     2            15.4
>    2000     1            21.6
>    5000    13            31.5
>   10000    19.9          40.7
>   20000    46.5          55.4
>   50000   130.9          98.0
>
> second run
>
>   procs   creation/µs   msg/µs
>     200     0.0           7.0
>     500     0.0          10.12
>    1000     0.0          16.53
>    2000     1.5          24.26
>    5000    12.8          32.15
>   10000    28.1          39.497
>   20000    58.15         52.0295
>   50000    75.1          95.581
>
> third run
>
>   procs   creation/µs   msg/µs
>     200     0.0           7.0
>     500     0.0           8.6
>    1000     2.0          11.0
>    2000     1.0          16.55
>    5000    10.2          21.76
>   10000    12.0          49.57
>   20000    52.35         65.035
>   50000    91.76        117.1
>
> Each process is a Pharo object (an instance of ERringElement) that contains
> a counter, a reference to the next ERringElement, and an "ErlangProcess"
> that is a Process holding a reference to an instance of SharedQueue (its
> "mailbox").
>
> The good news is that up to 50k processes, it didn't crash. But it did run
> with increasing sloth.
>
> I can imagine that the increasing process-creation time is due to beating
> on the memory manager. But why does the message-sending time also grow with
> the number of processes? (Recall that exactly one process is runnable at
> any given time.) I'm wondering if the scheduler is somehow getting
> overwhelmed by all of the non-runnable processes that are blocked on
> Semaphores in SharedQueue. Any ideas?

If you're using Cog, then one reason performance falls off with the number
of processes is context-to-stack mapping; see "Under Cover Contexts and the
Big Frame-Up":

    http://www.mirandabanda.org/cogblog/2009/01/14/under-cover-contexts-and-the-big-frame-up/

Once there are more processes than stack pages, every process switch faults
out a(t least one) frame to a heap context and faults in a heap context to a
frame. You can experiment by changing the number of stack pages (see
vmAttributeAt:put:), but you can't have thousands of stack pages; that would
use too much C stack memory. I think the default is 64 pages, and Teleplace
uses ~112. Each stack page can hold roughly 50 activations.
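Something along these lines should let you play with the page count (an
untested sketch; I'm assuming the accessor is spelled #vmParameterAt:put: in
your image, and that, if memory serves, parameter 42 reports the stack pages
in use while parameter 43 holds the desired count, which is stored in the
image header and only takes effect at the next startup):

    "Report how many stack pages the running VM is using,
     then request a larger allotment for the next session."
    Transcript
        show: 'stack pages in use: ';
        show: (Smalltalk vmParameterAt: 42) printString;
        cr.
    Smalltalk vmParameterAt: 43 put: 128.  "e.g. double the 64-page default"
    "Save the image and restart for the new page count to take effect."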
But to be sure what the cause of the slowdown is, one could use my
VMProfiler. Has anyone ported it to Pharo yet?

> (My code is on SqueakSource in project Erlang. But be warned that there is
> a simulation of the Erlang "universal server" in there too. To run this
> code, look for class ErlangRingTest.)

--
best,
Eliot
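P.S. For anyone who wants to poke at this shape of workload without loading
the package, here is a minimal SharedQueue-based ring in the spirit Andrew
describes (a sketch with invented names, not the actual ERringElement code).
Each element blocks on its own mailbox and forwards the token, a hop
countdown, to the next element's mailbox:

    | n m queues procs done millis |
    n := 1000.    "processes in the ring"
    m := 10.      "trips round the ring"
    done := Semaphore new.
    queues := (1 to: n) collect: [:i | SharedQueue new].
    procs := (1 to: n) collect: [:i |
        | inbox outbox |
        inbox := queues at: i.
        outbox := queues at: i \\ n + 1.
        [[ | hops |
           hops := inbox next.
           hops = 0
               ifTrue: [done signal]
               ifFalse: [outbox nextPut: hops - 1]] repeat] newProcess].
    procs do: [:p | p resume].
    millis := Time millisecondsToRun:
        [(queues at: 1) nextPut: n * m.
         done wait].
    procs do: [:p | p terminate].
    Transcript
        show: (millis * 1000 / (n * m)) asFloat printString, ' µs/message';
        cr.

If context-to-stack mapping is indeed the culprit, the per-message figure
should jump once n exceeds the number of stack pages.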
