Re: [RFC] diff-optimizations-bytes branch: avoiding function call overhead (?)

Peter Samuelson Tue, 14 Dec 2010 15:33:05 -0800

[Johan Corveleyn]
> As an experiment I changed these functions into macros, eliminating
> the function calls. This makes the diff algorithm another 10% - 15%
> faster (granted, this was measured with my "extreme" testcase of a
> 1.5 MB file (60000 lines), of which most lines are identical
> prefix/suffix).


Seems worth it.  That test case doesn't seem terribly uncommon.

> Some considerations:
> - Maybe I can use APR_INLINE, with similar results?
> 
> - Maybe I can put just the "critical section" into a macro (the curp++
> / curp-- part), and call a function when a chunk boundary is
> encountered (~ once every 131072 iterations (chunks are 128 Kb
> large)), to read in the new chunk, advancing variables, ...

The second option seems best to me, possibly combined with the first.
You don't want really big inline functions.  You probably can't have
them anyway: the compiler will decide to un-inline them behind your
back.  (And that's usually the right decision, for optimal use of the
CPU's L1 instruction cache.)
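
A minimal sketch of that split, under stated assumptions: the names
(struct cursor, load_next_chunk, ADVANCE, count_bytes) are hypothetical
and not taken from the branch, the "file" is an in-memory buffer, and
the chunk size is tiny so the boundary case is easy to see.  Only the
pointer bump and the boundary test sit in the macro; the refill is an
ordinary function.

```c
#include <stddef.h>

/* Tiny chunk size for illustration only; the thread mentions 128 Kb. */
#define CHUNK_SIZE ((size_t)4)

/* Hypothetical cursor over an input that is consumed chunk by chunk. */
struct cursor
{
  const char *data;   /* whole input; stands in for the file */
  size_t size;        /* total input size in bytes */
  size_t next_off;    /* offset of the first byte after the current chunk */
  const char *curp;   /* current position inside the chunk */
  const char *endp;   /* one past the last byte of the chunk */
  int at_eof;         /* set when no bytes are left */
  int refills;        /* how often the slow path ran */
};

/* Slow path: a real function, called once per chunk boundary. */
void load_next_chunk(struct cursor *c)
{
  size_t left = c->size - c->next_off;
  size_t n = left < CHUNK_SIZE ? left : CHUNK_SIZE;

  c->curp = c->data + c->next_off;
  c->endp = c->curp + n;
  c->next_off += n;
  c->at_eof = (n == 0);
  c->refills++;
}

/* Fast path: only the increment and the boundary test are expanded
 * inline at each call site. */
#define ADVANCE(c)                      \
  do {                                  \
    if (++(c)->curp == (c)->endp)       \
      load_next_chunk(c);               \
  } while (0)

/* Walks the whole input one byte at a time; returns the byte count
 * and reports how many times the slow path was taken. */
size_t count_bytes(const char *data, size_t size, int *refills)
{
  struct cursor c = { data, size, 0, NULL, NULL, 0, 0 };
  size_t n = 0;

  load_next_chunk(&c);
  while (!c.at_eof)
    {
      n++;
      ADVANCE(&c);
    }
  *refills = c.refills;
  return n;
}
```

The per-byte cost is then an increment and a compare, and the function
call is paid only at chunk boundaries.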

Also, if you're going to define a { } block in a macro, I think you
want to do the trick suggested by the gcc docs: wrap your { } with
do...while(0).  There's no semantic change to the macro itself, but
consider cases like this:

   if (foo)
     increment_pointers(file, len, pool);
   else
     ...

Think about what the expansion looks like with and without the "do { }
while (0)": without it, the ";" after the macro call ends the if
statement, and the else no longer has an if to attach to.
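
To make that concrete, here is a small sketch.  SKIP_BYTE and
count_leading are hypothetical names made up for illustration, not the
macro from the branch.  The bare-brace variant appears only in the
comment, because it would not compile in the if/else below.

```c
#include <stddef.h>

/* Hypothetical macro for illustration.  A bare-brace version,
 *     #define SKIP_BYTE(p, n)  { (p)++; (n)--; }
 * would turn "if (x) SKIP_BYTE(p, n); else ..." into
 *     if (x) { (p)++; (n)--; }; else ...
 * where the stray ';' ends the if statement and orphans the else,
 * which is a compile error.  The do/while (0) form absorbs that ';'
 * so the macro call behaves like a single statement. */
#define SKIP_BYTE(p, n)  do { (p)++; (n)--; } while (0)

/* Counts how many leading bytes of buf are equal to c, using the
 * macro in an unbraced if/else. */
size_t count_leading(const char *buf, size_t len, char c)
{
  size_t skipped = 0;

  while (len > 0)
    {
      if (*buf == c)
        SKIP_BYTE(buf, len);
      else
        break;
      skipped++;
    }
  return skipped;
}
```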

-- 
Peter Samuelson | org-tld!p12n!peter | http://p12n.org/
