I tried replacing fabs with a couple of different manual implementations, to no effect.
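
For anyone following along, here is a minimal sketch of the kind of substitution I mean. It is not the code from the gist: the names abs_manual and newton_sqrt, the starting guess, the relative tolerance, and the iteration cap are all assumptions made for illustration.

```c
#include <assert.h>
#include <math.h>

/* Hypothetical hand-written stand-in for fabs, so the compiler sees a
   plain branch/select instead of a possible call into libm. */
static inline double abs_manual(double x) {
    return x < 0.0 ? -x : x;
}

/* Newton's method for sqrt(a): repeat x <- (x + a/x) / 2 until the
   step is small relative to the current iterate. */
static double newton_sqrt(double a, double tol) {
    assert(a >= 0.0);              /* only defined for non-negative input */
    if (a == 0.0) return 0.0;      /* avoid dividing by zero below */
    double x = a > 1.0 ? a : 1.0;  /* initial guess, always >= sqrt(a) */
    for (int i = 0; i < 2000; ++i) {   /* cap is an arbitrary safety limit */
        double next = 0.5 * (x + a / x);
        if (abs_manual(next - x) <= tol * next) return next;
        x = next;
    }
    return x;                      /* best estimate if the cap is reached */
}
```

Calling newton_sqrt(2.0, 1e-12) should agree with sqrt(2.0) from math.h to well within 1e-9. Swapping abs_manual for fabs in the convergence test is the only change needed to compare the two variants.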

On Tue, Jan 27, 2015 at 6:50 PM, Stefan Karpinski <[email protected]> wrote:

> Is it possible that C can't/isn't inlining fabs?
>
> > On Jan 27, 2015, at 5:39 PM, Miles Lubin <[email protected]> wrote:
> >
> > I'm working on a microbenchmark and found a surprising result (maybe
> > not?) that Julia is about 2x faster than the same algorithm hand-coded
> > in C. I wanted to check if I'm doing anything obviously wrong here
> > before reporting these results. The timings reproduce across different
> > systems and compiler options (clang/gcc -O2/-O3).
> >
> > The test is just to compute square root using newton's method. The
> > relevant code is in this gist:
> > https://gist.github.com/mlubin/4994c65c7a2fa90a3c7e.
> >
> > On Julia 0.3.5, each function call takes 8.85*10^-8 seconds. The best
> > timing I've seen from C is 1.61*10^-7 using gcc -O2 -march=native.
> >
> > I did my best to check for common mistakes:
> > - Julia and C use the exact same timing routine with 10,000 repetitions
> > - Both give the correct answer, and the important code isn't being
> >   optimized away.
> >
> > Any ideas as to why Julia is faster on this very simple code? I know
> > that performance comparisons with runtimes on the order of nanoseconds
> > are probably not too meaningful, but people still like absolute
> > numbers, and it's a bit surprising that I can't match the performance
> > of Julia from C.
> >
> > Thanks,
> > Miles
