http://gcc.gnu.org/bugzilla/show_bug.cgi?id=56538
Bug #: 56538
Summary: No option to disable slow 'lock' instr. one does not need
Classification: Unclassified
Product: gcc
Version: 4.8.0
Status: UNCONFIRMED
Severity: normal
Priority: P3
Component: target
AssignedTo: unassig...@gcc.gnu.org
ReportedBy: jan.kratoch...@redhat.com
Target: x86_64-unknown-linux-gnu

When no threads or shared memory are in use, the 'lock' prefix has no benefit, but it still carries a heavy performance hit.

    #include <memory>
    #include <unistd.h>
    #include <sys/wait.h>

    int main() {
      const int jobs=8;
      pid_t pid[jobs];
      std::shared_ptr<int> p(new int());
      for (int job=0;job<jobs;job++)
        if ((pid[job]=fork())==0) {
          for (int i=0;i<1000000000;i++)
            std::shared_ptr<int> q(p);
          _exit(0);
        }
      for (int job=0;job<jobs;job++)
        waitpid(pid[job],NULL,0);
    }

Built normally:

    g++ -o bench1 bench1.C -Wall -g -std=gnu++11 -O3
    7.816s

Built with 'lock' defined away while preprocessing the generated assembly:

    g++ bench1.C -Wall -g -std=gnu++11 -O3 -S -o -|g++ -Dlock= -o bench1 -x assembler-with-cpp - -lstdc++
    7.150s = 9.3% improvement

The fork() calls have to be there to see an improvement; the Intel i7-920 probably ignores 'lock' when only a single core is running.

gcc-4.8.0-0.14.fc19.x86_64
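
The cost described above can also be looked at in isolation, without shared_ptr. The following is only a rough sketch and not part of the original report: the names Timing, time_atomic_increments and time_plain_increments are hypothetical, and it assumes that a relaxed std::atomic fetch_add compiles to a 'lock'-prefixed instruction on x86_64, which is what g++ typically emits. The plain counter is declared volatile only so that the loop is not optimized away.

```cpp
#include <atomic>
#include <chrono>
#include <cstdint>

// Hypothetical result type: final counter value and elapsed wall-clock time.
struct Timing {
  std::uint64_t count;
  double seconds;
};

// Increment an atomic counter `iters` times. On x86_64 this is assumed to
// use a 'lock'-prefixed read-modify-write instruction.
Timing time_atomic_increments(std::uint64_t iters) {
  std::atomic<std::uint64_t> counter{0};
  auto start = std::chrono::steady_clock::now();
  for (std::uint64_t i = 0; i < iters; ++i)
    counter.fetch_add(1, std::memory_order_relaxed);
  auto stop = std::chrono::steady_clock::now();
  return {counter.load(), std::chrono::duration<double>(stop - start).count()};
}

// Increment a plain counter `iters` times, with no 'lock' prefix.
// volatile forces a real load and store on every iteration.
Timing time_plain_increments(std::uint64_t iters) {
  volatile std::uint64_t counter = 0;
  auto start = std::chrono::steady_clock::now();
  for (std::uint64_t i = 0; i < iters; ++i)
    counter = counter + 1;
  auto stop = std::chrono::steady_clock::now();
  return {counter, std::chrono::duration<double>(stop - start).count()};
}
```

Calling both functions with the same iteration count and comparing the seconds fields gives a rough idea of the per-operation overhead on a given machine. As noted in the report, the difference may only show up when several cores are busy at once, so running the comparison in several processes in parallel is likely to be more telling than a single run.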