I couldn't find an existing string perf test, so I wrote a quick and dirty one that repeatedly makes many copies of the same string to exercise the atomic increment/decrement. The results show a performance penalty of roughly 3% (between 2.6% and 3.2%, depending on thread count) when using the newer atomic functions. This test was run with an 8d configuration, so the atomic functions were compiled into the stdcxx dll. The test hardware is a Lenovo T60p [Intel Core 2 T7600 2.33GHz CPU, 2GB RAM].
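For anyone who wants to try something similar, here is a minimal sketch of the kind of copy loop described above. It is not the actual test source: `SharedBody`, `Handle`, `copy_loop` and `run_threads` are hypothetical names, and a small reference-counted handle built on `std::atomic` stands in for the library's string, on the assumption that each string copy costs one atomic increment and one atomic decrement.

```cpp
#include <atomic>
#include <cassert>
#include <chrono>
#include <functional>
#include <thread>
#include <vector>

// Hypothetical stand-in for a reference-counted string body.
struct SharedBody {
    std::atomic<long> refs{0};
};

// Hypothetical handle: every construction increments the shared count,
// every destruction decrements it, like copying a ref-counted string.
class Handle {
public:
    explicit Handle(SharedBody& body) : body_(&body) { acquire(); }
    Handle(const Handle& other) : body_(other.body_) { acquire(); }
    Handle& operator=(const Handle&) = delete;
    ~Handle() { body_->refs.fetch_sub(1, std::memory_order_acq_rel); }

private:
    void acquire() { body_->refs.fetch_add(1, std::memory_order_relaxed); }
    SharedBody* body_;
};

// Makes `copies` short-lived copies of the same handle.
void copy_loop(const Handle& source, long copies) {
    for (long i = 0; i < copies; ++i) {
        Handle copy(source);  // one atomic increment, then one decrement
        (void)copy;
    }
}

// Runs copy_loop on `threads` threads and returns the elapsed time in ms.
double run_threads(SharedBody& body, unsigned threads, long copies) {
    Handle source(body);
    const auto start = std::chrono::steady_clock::now();

    std::vector<std::thread> pool;
    for (unsigned t = 0; t < threads; ++t)
        pool.emplace_back(copy_loop, std::cref(source), copies);
    for (auto& th : pool)
        th.join();

    const auto stop = std::chrono::steady_clock::now();
    assert(body.refs.load() == 1);  // only `source` is still alive here
    return std::chrono::duration<double, std::milli>(stop - start).count();
}
```

Calling `run_threads(body, n, copies)` for n = 1, 2, 4, 8 and dividing the elapsed time by the total number of copies gives ms and ms/op figures of the same shape as the ones reported here, though the absolute numbers will not be comparable.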
Old                          New [patched]
------ 1 threads             ------ 1 threads
ms      714                  ms      737
ms/op   0.00004256           ms/op   0.00004393
------ 2 threads             ------ 2 threads
ms      3911                 ms      4024
ms/op   0.00023311           ms/op   0.00023985
------ 4 threads             ------ 4 threads
ms      7660                 ms      7865
ms/op   0.00045657           ms/op   0.00046879
------ 8 threads             ------ 8 threads
ms      15192                ms      15585
ms/op   0.00090551           ms/op   0.00092894

I wonder whether using inline assembly for the __rw_atomic_* functions would reduce the cost. We could also evaluate the intrinsic pragma that is available on MSVC.

Travis

>-----Original Message-----
>
>I will do a quick run using the string performance test after lunch.
>I'll report the results on that later. I've pasted the source for the
>bulk of my test below. If someone wants the entire thing, let me know
>and I'll provide everything.
>
>Travis
>
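As a rough illustration of the MSVC intrinsic option mentioned above, the sketch below shows what an intrinsic-based increment/decrement could look like. The wrapper names `atomic_preincrement` and `atomic_predecrement` are hypothetical and are not the library's __rw_atomic_* functions. The non-MSVC branch uses the GCC `__sync` builtins only so that the sketch compiles elsewhere. Whether this actually recovers the 3% would still need to be measured.

```cpp
#if defined(_MSC_VER)
#  include <intrin.h>
// Ask the compiler to expand these inline instead of emitting a call.
#  pragma intrinsic(_InterlockedIncrement, _InterlockedDecrement)
#endif

// Hypothetical wrapper: atomically increments `value`, returns the new value.
inline long atomic_preincrement(volatile long& value) {
#if defined(_MSC_VER)
    return _InterlockedIncrement(&value);
#else
    return __sync_add_and_fetch(&value, 1L);  // GCC/Clang builtin fallback
#endif
}

// Hypothetical wrapper: atomically decrements `value`, returns the new value.
inline long atomic_predecrement(volatile long& value) {
#if defined(_MSC_VER)
    return _InterlockedDecrement(&value);
#else
    return __sync_sub_and_fetch(&value, 1L);  // GCC/Clang builtin fallback
#endif
}
```

Because the wrappers are inline and the intrinsics expand in place, a string copy would no longer pay for a call into the dll on each reference-count update, which is the cost the 8d build is suspected of adding.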