Thanks. We'll look into it and see if we can fix it.

Bill.

On 14 August 2017 at 16:51, <csm...@bristol.ac.uk> wrote:
> Hi,
>
> I've dug a bit deeper, and it seems that there is an alignment issue
> within addmul_1. I've created two marginally different programs, of which
> one is much faster:
>
> $ gcc -O3 addmul_1.s -o addmul_1.o -c; for i in a b; do gcc -O3 $i.cpp -o $i.out addmul_1.o; ./$i.out ; done
> Time: 0.490647
> Time: 0.681671
>
> addmul_1.s is the Sandy Bridge-optimized assembly. When I change the
> alignment there a bit as in addmul_1.opt.s, the difference disappears:
>
> $ gcc -O3 addmul_1.opt.s -o addmul_1.o -c; for i in a b; do gcc -O3 $i.cpp -o $i.out addmul_1.o; ./$i.out ; objdump -CSD $i.out > $i.dis ;done
> Time: 0.505714
> Time: 0.495640
>
> Best regards,
> Marcel
>
>
> On Friday, August 11, 2017 at 7:00:10 PM UTC+1, Bill Hart wrote:
>>
>> We've noticed similar sorts of things. One possibility is that the loop
>> in your test code is not aligned as well in one version. Or perhaps your
>> stack is hitting the same location modulo 4096, which is a known issue on
>> some modern processors. There might be SSE code in the linker and AVX code
>> in the addmul_1 function. The kernel might pin the process to a different
>> CPU which is slightly slower or faster, when the pthreads library is used.
>> You might also hit some frequency scaling in the CPU due to the pthreads
>> library taking longer to link in. There's so many possibilities on a modern
>> CPU, it hardly bears thinking about.
>>
>> Also, in your code, you don't seem to set y anywhere and I wasn't aware
>> you could use 1e8 as an int constant.
>>
>> Bill.
>>
>> On 11 August 2017 at 18:10, Marcel Keller <m.ke...@bristol.ac.uk> wrote:
>>
>>> Hi,
>>>
>>> I've noticed that the performance of mpn_addmul_1 can depend
>>> considerably on whether I link against libpthread, which strikes me as very
>>> weird:
>>>
>>> $ g++ -O3 Time-addmul_1.cpp ~/src/mpir-3.0.0-ivybridge/mpn/addmul_1.o -o a.out
>>>
>>> $ g++ -O3 Time-addmul_1.cpp ~/src/mpir-3.0.0-ivybridge/mpn/addmul_1.o -o b.out -lpthread
>>>
>>> $ ./a.out
>>> mpn_addmul_1: 0.506279
>>>
>>> $ ./b.out
>>> mpn_addmul_1: 0.682086
>>>
>>> Disassembling the binaries shows that the mpn function in
>>> Time-addmul_1.cpp is compiled exactly the same way.
>>>
>>> I'm running CentOS 7 and GCC 6.2. The source as well as the outputs are
>>> attached.
>>>
>>> Does anyone have any idea why this could be?
>>>
>>> Best regards,
>>> Marcel
>>>
>>> --
>>> You received this message because you are subscribed to the Google
>>> Groups "mpir-devel" group.
>>> To unsubscribe from this group and stop receiving emails from it, send
>>> an email to mpir-devel+...@googlegroups.com.
>>> To post to this group, send email to mpir-...@googlegroups.com.
>>> Visit this group at https://groups.google.com/group/mpir-devel.
>>> For more options, visit https://groups.google.com/d/optout.
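
One way to narrow down the possibilities raised in this thread is to have the test program report where its buffers and stack land relative to 64-byte cache lines and 4096-byte pages before it starts timing. The following is only a sketch under stated assumptions, not the Time-addmul_1.cpp attached to the original message: ref_addmul_1, offset_in and time_addmul are hypothetical names, ref_addmul_1 is a portable stand-in with the same contract as mpn_addmul_1 rather than the MPIR assembly, and unsigned __int128 assumes GCC or Clang on a 64-bit target. It reports data and stack placement only; the position of the loop inside addmul_1 itself still has to be read from the disassembly.

```cpp
#include <cassert>
#include <chrono>
#include <cstddef>
#include <cstdint>
#include <cstdio>
#include <vector>

using limb_t = std::uint64_t;

// Hypothetical portable stand-in with the same contract as mpn_addmul_1:
// rp[0..n) += up[0..n) * v, returning the carry-out limb.
limb_t ref_addmul_1(limb_t* rp, const limb_t* up, std::size_t n, limb_t v) {
    limb_t carry = 0;
    for (std::size_t i = 0; i < n; ++i) {
        // up[i]*v + rp[i] + carry always fits in 128 bits.
        unsigned __int128 t = (unsigned __int128)up[i] * v + rp[i] + carry;
        rp[i] = (limb_t)t;
        carry = (limb_t)(t >> 64);
    }
    return carry;
}

// Offset of an address within a block whose size is a power of two.
std::size_t offset_in(const void* p, std::size_t block) {
    return reinterpret_cast<std::uintptr_t>(p) & (block - 1);
}

// Prints buffer and stack placement, then times reps calls on n limbs.
// Returns the elapsed time in seconds.
double time_addmul(std::size_t n, long reps) {
    assert(n > 0 && reps > 0);
    std::vector<limb_t> r(n, 1), u(n, 2);  // every input is initialized
    const limb_t v = 3;
    int stack_marker = 0;

    std::printf("r:     %zu mod 64, %zu mod 4096\n",
                offset_in(r.data(), 64), offset_in(r.data(), 4096));
    std::printf("u:     %zu mod 64, %zu mod 4096\n",
                offset_in(u.data(), 64), offset_in(u.data(), 4096));
    std::printf("stack: %zu mod 4096\n", offset_in(&stack_marker, 4096));

    limb_t sink = 0;  // accumulated carries keep the loop from being removed
    auto t0 = std::chrono::steady_clock::now();
    for (long i = 0; i < reps; ++i)
        sink += ref_addmul_1(r.data(), u.data(), n, v);
    auto t1 = std::chrono::steady_clock::now();

    std::printf("carry checksum: %llu\n", (unsigned long long)sink);
    return std::chrono::duration<double>(t1 - t0).count();
}
```

Calling time_addmul(16, 100000000) from main in both the a.out and b.out builds and comparing the printed offsets shows whether linking libpthread moved the data or the stack. On the 1e8 remark: in C++, `int n = 1e8;` does compile because the double literal is converted implicitly, but writing 100000000 keeps the repetition count an integer throughout.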