On 7/19/07, Richard Guenther <[EMAIL PROTECTED]> wrote:
> Of course, if any then the array indexing variant is fixed. It would be nice
> to see a complete testcase with a pessimization, maybe you can file
> a bugreport about this?
There are many issues with all of the alternatives, and I'm not
qualified to pinpoint them further.
I've taken http://ompf.org/ray/sphereflake/, which is already used as a
benchmark at http://www.suse.de/~gcctest/c++bench/raytracer/, because
it's small, self-contained and built around just such a basic
3-component class that's used all over.
It doesn't use any kind of array access operator, but it's good enough
to show the price one has to pay before even thinking of providing
one. It has been adjusted to use floats and to access members through
accessors (to allow a more direct comparison of all cases).
Variation 0 is the reference, a mere struct { float x,y,z; ... };
it performs as well as the original, but doesn't allow any 'valid'
indexing.
Variation 1 is struct { float f[3]; ... }.
Variations 2, 3, 4 and 5 each try some form of union.
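For reference, the kind of 'array access' operator under discussion could look like this on top of variation 1's layout. This is only a sketch: the name vec3_array and the operator[] overloads are hypothetical and are not part of the benchmark, which deliberately provides no such operator.

```cpp
#include <cstddef>

// Hypothetical illustration, not part of sphereflake: variation 1's layout
// (float f[3]) with the named accessors plus an indexing operator.
struct vec3_array {
    float f[3];

    vec3_array(float a, float b, float c) { f[0] = a; f[1] = b; f[2] = c; }

    // Named accessors, same shape as in the benchmark.
    float  x() const { return f[0]; }
    float &x()       { return f[0]; }
    float  y() const { return f[1]; }
    float &y()       { return f[1]; }
    float  z() const { return f[2]; }
    float &z()       { return f[2]; }

    // Indexing is well defined here because the storage really is an array.
    // No bounds checking: i is assumed to be 0, 1 or 2.
    float  operator[](std::size_t i) const { return f[i]; }
    float &operator[](std::size_t i)       { return f[i]; }
};
```

With this, v[1] and v.y() refer to the same float, which is the property the other variations try to get by less direct means.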
# /usr/local/gcc-4.3-20070720/bin/g++ -v
Using built-in specs.
Target: x86_64-unknown-linux-gnu
Configured with: ../configure --prefix=/usr/local/gcc-4.3-20070720
--enable-languages=c,c++ --enable-threads=posix --disable-checking
--disable-nls --disable-shared --disable-win32-registry
--with-system-zlib --disable-multilib --verbose --with-gcc=gcc-4.2
--with-gnu-ld --with-gnu-as --enable-checking=none --disable-bootstrap
Thread model: posix
gcc version 4.3.0 20070720 (experimental)
# make bench
[snip]
sf.v0
real    0m3.963s
user    0m3.812s
sys     0m0.152s
sf.v1
real    0m3.972s
user    0m3.864s
sys     0m0.104s
sf.v2
real    0m10.384s
user    0m10.261s
sys     0m0.120s
sf.v3
real    0m10.390s
user    0m10.289s
sys     0m0.104s
sf.v4
real    0m10.388s
user    0m10.265s
sys     0m0.124s
sf.v5
real    0m10.399s
user    0m10.281s
sys     0m0.116s
There are some inlining differences between the union variations and
the first two, but at roughly 10.4s against 4.0s (about 2.6 times
slower) the union variations are clearly in a league of their own
anyway. So only the first two can be seriously considered.
Variation #0 would require either invalid C++ (pointer arithmetic
abuse, which is not an option anymore) or giving up on an array access
operator and going with set/get + memcpy; otherwise it's pretty much
optimal.
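A rough sketch of what the set/get + memcpy route could look like for variation #0's layout. The name vec3_fields and the get/set members are hypothetical, not taken from the benchmark, and the static_assert assumes a C++11 compiler.

```cpp
#include <cstring>

// Hypothetical illustration, not part of sphereflake: variation 0's layout
// with bulk get/set through memcpy instead of an indexing operator.
struct vec3_fields {
    float _x, _y, _z;

    vec3_fields(float a, float b, float c) : _x(a), _y(b), _z(c) {}

    // Copy all three components out to a plain array.
    void get(float (&out)[3]) const {
        std::memcpy(out, static_cast<const void *>(this), sizeof out);
    }

    // Copy all three components in from a plain array.
    void set(const float (&in)[3]) {
        std::memcpy(static_cast<void *>(this), in, sizeof in);
    }
};

// The copies above only make sense if the struct is exactly three packed floats.
static_assert(sizeof(vec3_fields) == 3 * sizeof(float),
              "unexpected padding in vec3_fields");
```

The caller indexes the temporary array rather than the object itself, so no pointer arithmetic across members is needed; the cost is an extra copy that the compiler may or may not optimise away.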
Variation #1 (straight array) is quite annoying in C++ (no initializer
list, all accesses need to be reformulated, etc.) and already shows a
slight pessimization, though it's not easy to track down. Apparently
g++ has gotten a bit better lately in this regard, or the problem is
only blatant with larger data or more complex cases.
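To illustrate the initializer list complaint: with an array member, the constructor has to assign element by element, as variation #1 does. The sketch below shows the alternative spelling that a C++11 compiler accepts, on the assumption that such a compiler is available; the name vec3_init is hypothetical and nothing here is part of the benchmark.

```cpp
// Hypothetical illustration, not part of sphereflake. Assumes C++11, where an
// array member can be brace-initialized in the constructor's initializer list.
struct vec3_init {
    float f[3];

    // In C++98/03 this has to be written as { f[0] = a; f[1] = b; f[2] = c; }.
    vec3_init(float a, float b, float c) : f{a, b, c} {}
};
```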
I hope this shows how problematic it is for the end user.
// sphere flake bvh raytracer (c) 2005, thierry berger-perrin <[EMAIL PROTECTED]>
// this code is released under the GNU Public License.
// see http://ompf.org/ray/sphereflake/
// compile with ie g++ -O2 -ffast-math sphereflake.cc
// usage: ./sphereflake [lvl=6] >pix.ppm
#include
#include
#include
#include
#define GIMME_SHADOWS
enum { childs = 9, ss= 2, ss_sqr = ss*ss }; /* not really tweakable anymore */
static const float infinity = std::numeric_limits<float>::infinity(), epsilon = 1e-4f;
#if VARIATION == 5
union v_t {
// straight union; array left unharmed; just as horrible as the others.
struct { float _x, _y, _z; };
float f[3];
v_t(const float a, const float b, const float c) : _x(a), _y(b), _z(c) {}
float x() const { return _x; }
float &x() { return _x; }
float y() const { return _y; }
float &y() { return _y; }
float z() const { return _z; }
float &z() { return _z; }
#else
struct v_t {
#endif
#if VARIATION == 0
// best of the breed, but doesn't give way for an 'array access' operator.
float _x, _y, _z;
v_t(const float a, const float b, const float c) : _x(a), _y(b), _z(c) {}
float x() const { return _x; }
float &x() { return _x; }
float y() const { return _y; }
float &y() { return _y; }
float z() const { return _z; }
float &z() { return _z; }
#elif VARIATION == 1
// not as good, obvious 'array access' but forbids initializer lists
float f[3];
v_t(const float a, const float b, const float c) { f[0] = a; f[1] = b; f[2] = c; }
float x() const { return f[0]; }
float &x() { return f[0]; }
float y() const { return f[1]; }
float &y() { return f[1]; }
float z() const { return f[2]; }
float &z() { return f[2]; }
#elif VARIATION == 2
// Richard Guenther's suggestion, worst of the worst.
union {
struct { float x, y, z; } a;
float b[3];
} u;
v_t(const float i, const float j, const float k) { u.a.x = i; u.a.y = j; u.a.z = k; }
float x() const { return u.a.x; }
float &x() { return u.a.x; }
float y() const { return u.a.y; }
float &y() { return u.a.y; }
float z() const { return u.a.z; }
float &z() { return u.a.z; }
#elif VARIATION == 3
// slightly better than variation #2, but still terrible.
union {
struct { float _x, _y, _z; };
float f[3];
};
v_t(const float a, const float b, const float c) : _x(a), _y(b), _z(c) {}
float x() c