On 18 August 2011 00:02, Eliot Miranda <[email protected]> wrote:
> Hi Andrew,
>
> On Wed, Aug 17, 2011 at 1:50 PM, Andrew P. Black <[email protected]> wrote:
>>
>> I was looking at some old slides of Joe Armstrong on concurrency-oriented
>> programming. He set the following challenge:
>>
>>   Put N processes in a ring.
>>   Send a simple message round the ring M times.
>>   Increase N until the system crashes.
>>   How long did it take to start the ring?
>>   How long did it take to send a message?
>>   When did it crash?
>>
>> He gave graphs comparing Erlang, Java and C#. I decided to compare Pharo.
>> Here is what I got; the creation times are PER PROCESS and the messaging
>> times are PER MESSAGE.
>>
>> first run
>>
>> procs    creation/µs    msg/µs
>>   200      0              7.0
>>   500      0              9.7
>>  1000      2             15.4
>>  2000      1             21.6
>>  5000     13             31.5
>> 10000     19.9           40.7
>> 20000     46.5           55.4
>> 50000    130.9           98.0
>>
>> second run
>>
>> procs    creation/µs    msg/µs
>>   200      0.0            7.0
>>   500      0.0           10.12
>>  1000      0.0           16.53
>>  2000      1.5           24.26
>>  5000     12.8           32.15
>> 10000     28.1           39.497
>> 20000     58.15          52.0295
>> 50000     75.1           95.581
>>
>> third run
>>
>> procs    creation/µs    msg/µs
>>   200      0.0            7.0
>>   500      0.0            8.6
>>  1000      2.0           11.0
>>  2000      1.0           16.55
>>  5000     10.2           21.76
>> 10000     12.0           49.57
>> 20000     52.35          65.035
>> 50000     91.76         117.1
>>
>> Each process is a Pharo object (an instance of ERringElement) that
>> contains a counter, a reference to the next ERringElement, and an
>> "ErlangProcess" that is a Process that contains a reference to an instance
>> of SharedQueue (its "mailbox").
>>
>> The good news is that up to 50k processes, it didn't crash. But it did
>> run with increasing sloth.
>>
>> I can imagine that the increasing process-creation time is due to beating
>> on the memory manager. But why does the message-sending time increase as
>> the number of processes increases? (Recall that exactly one process is
>> runnable at any given time.) I'm wondering if the scheduler is somehow
>> getting overwhelmed by all of the non-runnable processes that are blocked
>> on Semaphores in SharedQueue. Any ideas?
>
> If you're using Cog, then one reason performance falls off with the number
> of processes is context-to-stack mapping; see "Under Cover Contexts and the
> Big Frame-Up". Once there are more processes than stack pages, every process
> switch faults out at least one frame to a heap context and faults in a heap
> context to a frame. You can experiment by changing the number of stack pages
> (see vmAttributeAt:put:), but you can't have thousands of stack pages; that
> uses too much C stack memory. I think the default is 64 pages, and Teleplace
> uses ~112. Each stack page can hold up to approximately 50 activations.
>
> But to be sure what the cause of the slowdown is, one could use my
> VMProfiler. Has anyone ported this to Pharo yet?
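For anyone who wants to try Eliot's stack-page experiment: the selector is, as
far as I know, vmParameterAt:put: rather than vmAttributeAt:put:, and on Cog
the stack-page counts live in the low-40s parameter slots. The exact indices
below are from memory, so check them against your VM before trusting them. A
minimal sketch:

    | img |
    img := SmalltalkImage current.
    "Parameter 42: number of stack pages the running VM allocated (assumed index)."
    Transcript show: 'stack pages: ', (img vmParameterAt: 42) printString; cr.
    "Parameter 43: desired number of stack pages (assumed index); the new value
     is supposed to take effect after the image is saved and restarted."
    img vmParameterAt: 43 put: 160.

Raising the page count should push out the point at which the per-message time
starts climbing, which would be a cheap way to confirm (or refute) the
context-faulting explanation.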
Hmm, to me that doesn't explain why the per-message time degrades roughly
linearly with the number of processes. Once we hit a certain limit (all stack
pages are full), the per-message time should not degrade any further. Given
your explanation, I would expect something like the following:

10000 - 49.57
... somewhere here we hit the stack-page limit ...
20000 - 65.035
30000 - 65.035
40000 - 65.035
50000 - 65.035

>>
>> (My code is on SqueakSource in project Erlang. But be warned that there
>> is a simulation of the Erlang "universal server" in there too. To run this
>> code, look for class ErlangRingTest.)
>
>
>
> --
> best,
> Eliot
>

--
Best regards,
Igor Stasenko AKA sig.
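For readers who don't want to dig the SqueakSource project out, the ring
element Andrew describes can be sketched roughly as follows. The class name,
the SharedQueue mailbox and the process-per-element are from his description;
the selectors and the wiring are my own guesses, not the actual code:

    Object subclass: #ERringElement
        instanceVariableNames: 'counter next mailbox process'
        classVariableNames: ''
        poolDictionaries: ''
        category: 'Erlang'

    ERringElement >> next: anElement
        "Wire this element to its successor in the ring."
        next := anElement

    ERringElement >> send: aMessage
        "Deliver a message to this element's mailbox; the queue's
         internal semaphore wakes the element's process."
        mailbox nextPut: aMessage

    ERringElement >> startUp
        "Each element runs its own process, blocked on its mailbox,
         and forwards whatever it receives to the next element."
        counter := 0.
        mailbox := SharedQueue new.
        process := [ | msg |
            [ msg := mailbox next.
              counter := counter + 1.
              next send: msg ] repeat ] fork

With N such elements exactly one process is runnable at any moment, so the
growing per-message cost points at some per-switch overhead (Eliot's context
faulting, or Andrew's suspicion about the scheduler and the SharedQueue
semaphores) rather than at contention between runnable processes.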
