On Sun, 2007-08-05 at 16:58 -0400, Tim Prince wrote: > [EMAIL PROTECTED] wrote:
[snip] > > The loop unfortunately can't always be written as in -DOLD as the > > implementation of an iterator adapter might use ?: to special case the > > first element of a sequence and when used in a generic algorithm which > > just has the simple loop of -DNEW it ought to be optimised like -DOLD if > > inlining occurs. > > > I don't see why you special case the first iteration of a loop with ? > inside the loop. Simply write the first iteration separately, and begin > the loop with the next iteration. It should be a lot clearer both to us > and to the compiler what is your intention. > Doesn't this belong on gcc-help? Better peeling optimization needs more > justification than this. As above. When using C++ as it's supposed to be used this isn't possible. If I've got a sequence of values and I want to sum them there is a generic sum algorithm that I'm supposed to use (and which I *should* use to avoid unmaintainable spaghetti code which soon turns up when every thing has to be hand coded to be fast - I've read "numerical recipes in C" and nearly killed myself by the end of it). If I've got a 1000000 element sequence and want (on one thread) to compare what would happen in my generic algorithm (such as std::sum) if the first element is doubled but (on a second thread) with the sequence as it is, I'm not supposed to be required to write two versions of a large piece of code - I'm supposed to be able to just write an iterator that returns a different value for the first element. In the case of returning a different value for the first element, I'd use something in operator* that after inlining would end up as equivalent to d?d:1 and the compiler should optimise that if it can and if it would reduce runtime by a massive 33% and text size by quite a bit. Both of those are true in this case. Basically, writing two versions of the loop is not an option in real life because developing on spaghetti C code is costly while elegant C++ code is cheap. For example, my real (big) case is that I've got an iterator that moves around an image or video thusly (or 4,5,6 dimensional equivalent): it = it[1] + 6; // move 6 rows along the 2nd dimension When I move in all dimensions at once I quite sensibly have a loop from dimension zero to the top dimension adding (amount-to-move*stride-of-dimension) to the pointer that the iterator is implemented with for each one. The strides of the dimensions are previously recorded in an array that the iterator holds a reference to - except the first (because when you have a one dimensional image - a signal - nothing needs to be stored) which is always 1. So everything gets reduced down correctly by g++, except that this special case is not moved out of the loop. So this is not a programming problem that I need help with (I know how to micro-optimise with C-style ugly-stuff). I'm just reporting a significant missed optimisation opportunity that will help C++ developers even if not C or fortran developers. The only reason I didn't reflect the C++ simple-vs-spaghetti code issue in my example is because I wanted to keep it simple to target one of the problems very specifically. -- Tristan Wibberley Any opinion expressed is mine (or else I'm playing devils advocate for the sake of a good argument). My employer had nothing to do with this communication.