I started this thread surprised by how little overhead there was, but that 
turned out to be true only for the most modern CPUs. An overhead of 10% on 
some, if not most, CPUs is quite high.

However! What a modern CPU does today is what most other CPUs will do tomorrow, 
so `atomicArc` seems to have a bright future. On the other hand, the RISC-V 
people will ruin everything with their "worse is better, you don't need integer 
arithmetic checking or sane addressing modes" mentality. (Sorry to digress, but 
I really don't like RISC-V.)

But maybe `arc` vs `atomicArc` is currently the wrong question. Just turning on 
`--threads` seems to be a performance killer. I used to blame MinGW's 
thread-local storage implementation, but plenty of OS/CPU combinations seem to 
be affected. What is going on? Access to thread-local storage should not be 
this slow; it's supposed to be really fast...
