I have completely changed the gegl/buffer/gegl-sampler-yafr.c code.
Before I put together the patch for the new yafr, I would like to see whether I
can make the code even faster by using C99/gcc built-in math intrinsics. I have
not tried this yet.
The method used by the updated code differs from that of the first-generation
yafr: it is at once softer and more pervasive. (Yes, I know this is vague. What
I mean is that the nonlinear correction is "on" throughout more of the image,
but its effect is never as extreme.)
The code also runs even faster than before: on my current vintage laptop,
yafr scales up about 10% slower than gegl-sampler-linear, and about 10% faster
Regarding further speed-up: using fabs, fmin and copysign, I could make my code
branch-free (assuming, of course, that the compiler translates these operations
to built-ins on the machine on which the code is compiled).
That is: the code, which right now contains no "if," no "for," no "do" and no
"while," would then also contain no "?:". I suspect that using arithmetic
branching could make my code run noticeably faster.
I noticed that fabs, fmin and copysign, or similar C99/gcc built-ins, are not
anywhere in the gegl source.
Is there a preferred/tolerated way of using such math functions in gegl?
Can I assume that gfloats are floats?
Can I assume that gdoubles are doubles?
Must I program with the possibility that gfloats be doubles?
Must I program with the possibility that gdoubles be floats?
Could gfloats or gdoubles be anything else than floats or doubles?
It may be that I can use the type-generic fabs, fmin and copysign on gfloats
without a speed hit. Hopefully, gcc can use the correct one based on the fact
that it is applied to gfloats. If not, it may be that using the double versions
on gfloats is still faster than the alternatives.
If I KNEW for a fact that gfloat = float, I could simply use fabsf, fminf, and
copysignf. Otherwise, I could do the necessary parts of the computation with
doubles (or gdoubles) and use the double versions. Hopefully, this will not slow
down gegl when run on hardware which is faster on floats than doubles (like some
GPUs).
Is there a smarter way, which picks the right one?
Change compilation flags to include C99 built-ins?
Do you have another idea?
Or should I just stick to C90 gcc built-ins?
Laurentian University/Universite Laurentienne
Gimp-developer mailing list