--- Comment #1 from svfuerst at gmail dot com 2010-05-10 06:36 ---
A common technique is to benchmark a function by calling it many times, i.e.
void foo(void)
{
/* foo's implementation */
}
int main(void)
{
int i;
for (i = 0; i < LARGE_NUM; i++) foo();
return 0;
}
The
--- Comment #2 from steven at gcc dot gnu dot org 2010-05-10 06:55 ---
Re. comment #1:
(1) For this, there is the noinline attribute, as you already knew.
(2) See the noclone attribute
(3) See the regparm attribute
(4) You could use volatile and things like that, or put the unit in a
--- Comment #3 from rguenth at gcc dot gnu dot org 2010-05-10 09:13 ---
4) is already fine with noclone,noinline
for 3) you can add artificial side-effects by an empty asm();
--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=44053
--- Comment #4 from steven at gcc dot gnu dot org 2010-05-10 11:00 ---
In other words: not an issue.
--
steven at gcc dot gnu dot org changed:
What|Removed |Added
--- Comment #5 from svfuerst at gmail dot com 2010-05-10 14:53 ---
The problem is that the list of these workarounds tends to grow with each
release of gcc (e.g. noclone was added in gcc 4.5). It would be nice if there
was a single attribute to use that would work with all future
--- Comment #6 from pinskia at gcc dot gnu dot org 2010-05-10 22:04 ---
Also, I think having this kind of attribute is a bad idea. If your
benchmark can be optimized away, that is better for newer versions of the
compiler.
--- Comment #7 from svfuerst at gmail dot com 2010-05-10 22:44 ---
Perhaps an example usage helps:
The __float128 version of isnan() is rather slow. Trying different
implementations to see which is faster required some benchmarking. However,
implementing the benchmark code requires an
--- Comment #8 from pinskia at gcc dot gnu dot org 2010-05-10 22:49 ---
Anyway, the result of much benchmarking shows that:
Is it? It definitely moves from the x87 registers to the SSE registers which
can be slow. Micro benchmarks are not always true benchmarks. Also there are
--- Comment #9 from svfuerst at gmail dot com 2010-05-10 23:27 ---
Remember that isnan() is a weird type-dependent macro. The special case I was
testing is the __float128 version. __float128 values are passed in SSE registers,
so using SSE instructions to manipulate them can be a win. (No