On 2009-12-15 19:49:43 -0500, dsimcha <[email protected]> said:

== Quote from Simen kjaeraas ([email protected])'s article
Tested this on a Core 2 Duo, same options. OS is Windows 7, 64bit. It
scales roughly inverse linearly with number of threads:
163ms for 1,
364ms for 2,
886ms for 4
This is quite different from your numbers, though.

Yea, forgot to mention my numbers were on Win XP.  Maybe Windows 7 critical
sections are better implemented or something.   Can a few other people with a
variety of OS's run this benchmark and post their numbers?

Core 2 Duo / Mac OS X 10.6 / 4 threads:

        Crystal:~ mifo$ ./test
        Set affinity, then press enter.

        Bus error

Runs for about 18 seconds, then crashes. At first glance, it looks as if the Thread class is broken and for some reason I get a null dereference when a thread finishes. Great!

Anyway, I've done some sampling on the program while it runs. Soon after the program starts, each of the worker threads spends about 85% of its time inside _d_monitorenter and 11% in _d_monitorleave; shortly before it finishes, those figures become 88% and 7% respectively.

The funny thing is that if I just bypass the GC like this:

        import core.stdc.stdlib : realloc;

        void doAppending() {
                uint* arr = null;
                foreach(i; 0..1_000_000) {
                        // grow by one element; size is per uint, not per pointer
                        arr = cast(uint*)realloc(arr, uint.sizeof * (i+1));
                        arr[i] = i;
                }
                // leak arr
        }

it finishes (I mean crashes) in less than half a second. So it looks like realloc does a much better job at locking its data structure than the GC.
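
For anyone who wants to reproduce the comparison side by side, here is a rough sketch. The original test program is not reproduced in this message, so this assumes each worker simply appends with ~= in a loop; the names gcAppend, cAppend, timeThreads and the 1/2/4 thread counts are only illustrative, and the module names are those of current druntime/Phobos rather than the 2009 releases.

```d
import core.stdc.stdlib : free, realloc;
import core.thread : Thread;
import std.datetime.stopwatch : AutoStart, StopWatch;
import std.stdio : writefln;

enum uint N = 1_000_000;

// GC-backed appending: ~= goes through the runtime's array-append path.
uint gcAppend() {
        uint[] arr;
        foreach (uint i; 0 .. N)
                arr ~= i;
        return arr[$ - 1];
}

// C-heap appending, same shape as the realloc loop above (but freed, not leaked).
uint cAppend() {
        uint* arr = null;
        foreach (uint i; 0 .. N) {
                arr = cast(uint*) realloc(arr, uint.sizeof * (i + 1));
                arr[i] = i;
        }
        immutable last = arr[N - 1];
        free(arr);
        return last;
}

// Thread entry points must return void, so wrap the two loops.
void gcWorker() { gcAppend(); }
void cWorker() { cAppend(); }

// Run `work` on nThreads threads and return the wall-clock time in milliseconds.
long timeThreads(void function() work, size_t nThreads) {
        auto sw = StopWatch(AutoStart.yes);
        Thread[] threads;
        foreach (_; 0 .. nThreads) {
                auto t = new Thread(work);
                t.start();
                threads ~= t;
        }
        foreach (t; threads)
                t.join();
        return sw.peek.total!"msecs";
}

void main() {
        foreach (n; [1, 2, 4])
                writefln("%s thread(s): GC %s ms, realloc %s ms", n,
                         timeThreads(&gcWorker, n), timeThreads(&cWorker, n));
}
```

If the GC lock really is the bottleneck, the GC column should grow with the thread count much faster than the realloc column, but the exact numbers will depend on the OS and runtime version.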

--
Michel Fortin
[email protected]
http://michelf.com/
