Stefan Fuhrmann wrote:
> Julian Foad wrote:
> > I saw in a later part of the patch that this is so you can make the
> > function 'decode_instruction()' rely on those values.  As you know,
> > adding the explicit "=0", "=1", "=2" on the enum definition does not
> > make any difference to the compiler, it just provides a hint to the
> > human developers that these values might be significant.
> >
> > We need a stronger hint than this.  Please add code comments in both
> > the enum definition and the function 'decode_instruction()', saying
> > "The enum values must match the instruction encoding values."
>
> Oh! I wasn't aware of that - being a C++ guy.

You may have misunderstood me.  The rules are the same in C and C++.
What I meant is that the compiler will assign the values { 0, 1, 2 }
regardless of whether they are specified explicitly.  The explicit
numbers only make a difference to the next human maintainer who wants
to insert another value in the list: seeing them, he or she thinks,
"Hmm, I should make sure I don't re-number the old values when I
insert this new one."
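For example, the kind of coupling and comment I have in mind looks
something like this (a sketch only, with invented identifier names; in
svndiff the two high bits of the instruction byte select the action):

    /* The enum values must match the instruction encoding values:
       the action is stored in the two high bits of the instruction
       byte (0x00 = source, 0x40 = target, 0x80 = new). */
    enum action_code
    {
      action_source = 0,
      action_target = 1,
      action_new    = 2
    };

    /* Relies on the identity above: the decoded two-bit field is
       used directly as an enum value, with no mapping table.
       (The enum values must match the instruction encoding values.) */
    static enum action_code
    decode_action(unsigned char instruction_byte)
    {
      return (enum action_code)((instruction_byte >> 6) & 0x3);
    }

With the comment sitting next to both definitions, anyone re-numbering
the enum or changing the decoder is likely to notice the dependency.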
> But as long as comments can "fix" that, so be it ... ;)

[...]

> >> (copy_source_ops): dito; optimize a common case
> >
> > This is optimizing use of the C 'modulo' operator ('A % B') by
> > hand-coding B=1 and B=2 to use "& (B-1)" instead.  Are you sure
> > your C compiler doesn't already do this?  My intuition expects the
> > compiler to do it.
>
> How would the compiler know that ptn_length is often
> 1 or 2? In the general case, such conditional code would
> be a gross pessimization.

My intuition may well be wrong.  My reasoning was that testing for 1
or 2 is very quick, whereas a general-case modulo calculation is very
slow, so adding such a test to every modulo calculation cannot be a
gross pessimization, only a small overhead.  But if typical compilers
don't do it, then your manual optimization is useful.

> >> * subversion/libsvn_delta/svndiff.c
> >>   (decode_file_offset, decode_size): use slightly more
> >>   efficient formulations
> >
> > In these two functions you are expanding two lines of expressions
> > into many simpler lines, and introducing a temporary variable to
> > eliminate multiple loads and stores.
> >
> > I am suspicious of this change.  It is the compiler's job to
> > transform an expression into a longer stream of simple statements,
> > and often the compiler will do a better job than a human.  (Yes, I
> > know hand-coded optimisations can sometimes be useful.)
>
> Again, the compiler can't do much here: unless it summons
> the daemons of global and feedback optimization (or the
> user tells him not to care), there is no way to be *sure* that
> p and val point to disjoint memory regions. Thus, it cannot
> optimize the *val = f(*val) code.
>
> My code saves 2 out of 3 memory accesses at the expense
> of an additional write at the end.

OK, so the temporary variable is useful.  Is splitting the expression
up into many small statements also useful?

> > Also, these changes (and the 'modulo' change above) look like the
> > sort of change that could generate faster code on one compiler but
> > slower code on another.
>
> I'm very conscious about the impact on other compilers
> and architectures. Therefore, I restrict my extensive arsenal
> of "coding tricks" to what should be beneficial on most
> CPUs and is hard for compilers to "screw up".

Thanks.

> > If you could show the numbers, that would help me/us to understand
> > the trade-off.
>
> These changes were added in the later stages of my
> optimization session. At that point, the major parts had
> already gotten streamlined, leaving other sections where
> runtime seeped through cracks all over the place.
>
> Each of the changes "putties" one of the cracks I came
> across. So, individual savings are in the .5 .. 2% range
> but the total effect is somewhere between 3 and 10%
> (depending on platform, compiler, etc.).

Thanks.  My problem is I find it hard to believe that some of these
changes [...]

- Julian
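P.S. For anyone following along, the modulo special-casing we are
discussing looks roughly like this (a sketch with invented names, not
the actual patch):

    #include <stddef.h>

    /* Computes a % b, special-casing the common pattern lengths.
       When b is a power of two, "a % b" equals "a & (b - 1)", which
       avoids the (comparatively slow) hardware division. */
    static size_t
    pattern_mod(size_t a, size_t b)
    {
      if (b == 1)
        return 0;           /* a % 1 is always 0 */
      if (b == 2)
        return a & 1;       /* a % 2 == a & (2 - 1) */
      return a % b;         /* general case: division instruction */
    }

The question above is whether the two added compare-and-branch
operations cost less than the division they avoid when b happens to
be 1 or 2.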
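P.P.S. And the decode_size()/decode_file_offset() point about
aliasing, again as an invented sketch rather than the real code: in
the first variant the compiler cannot prove that 'val' and 'p' refer
to disjoint memory, so it must re-load and re-store *val on every
iteration; the second accumulates in a local variable and writes back
once.

    #include <stddef.h>

    /* Variant A: works directly through the pointer.  Each loop
       iteration loads and stores *val, because 'val' and 'p' might
       alias. */
    static const unsigned char *
    decode_varint_ptr(size_t *val, const unsigned char *p,
                      const unsigned char *end)
    {
      *val = 0;
      while (p < end)
        {
          *val = (*val << 7) | (*p & 0x7f);
          if ((*p++ & 0x80) == 0)
            return p;       /* high bit clear: last byte */
        }
      return NULL;          /* ran off the end of the buffer */
    }

    /* Variant B: accumulates in a local, which the compiler can keep
       in a register, and touches *val only once at the end -- the
       "2 out of 3 memory accesses" saved in Stefan's description. */
    static const unsigned char *
    decode_varint_local(size_t *val, const unsigned char *p,
                        const unsigned char *end)
    {
      size_t temp = 0;
      while (p < end)
        {
          unsigned char c = *p++;
          temp = (temp << 7) | (c & 0x7f);
          if ((c & 0x80) == 0)
            {
              *val = temp;
              return p;     /* high bit clear: last byte */
            }
        }
      return NULL;          /* ran off the end of the buffer */
    }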