https://gcc.gnu.org/bugzilla/show_bug.cgi?id=66584

            Bug ID: 66584
           Summary: gcc differs in static, branch-prediction cost from icc
                    in switch.
           Product: gcc
           Version: unknown
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: c++
          Assignee: unassigned at gcc dot gnu.org
          Reporter: jmcguiness at liquidcapital dot com
  Target Milestone: ---

The difference named in the summary means that code optimised for gcc is
sub-optimal for icc, and the reverse is also true. I feel that this behaviour
should be more clearly documented in the web pages & man pages.

For the following code:

extern void bar1();
extern void bar2();
extern void bar3();

void foo(int i) {
  switch(i) {
  case 1:
    bar1(); // gcc: least likely | icc: most likely
    break;
  case 2:
    bar2(); // gcc: less likely | icc: less likely
    break;
  default:
    bar3(); // gcc: most likely | icc: least likely
  }
}

gcc v4.8.2 & v5.10 for -O2 & -O3 produce:

foo(int):
        cmpl    $1, %edi
        je      .L3
        cmpl    $2, %edi
        jne     .L8
        jmp     bar2()
.L8:
        jmp     bar3()
.L3:
        jmp     bar1()

From which the static probabilities are quoted, above.

Conversely icc produces:

foo(int):
        cmpl      $1, %edi                                      #9.10
        jne       ..B1.3        # Prob 67%                      #9.10
        jmp       bar1()                                      #11.2
..B1.3:                         # Preds ..B1.1
        cmpl      $2, %edi                                      #9.10
        jne       ..B1.5        # Prob 50%                      #9.10
        jmp       bar2()                                      #14.5
..B1.5:                         # Preds ..B1.3
        jmp       bar3()                                      #17.5

From which the static probabilities are quoted, above.

Please note: I feel this is *only* a bug in the documentation!

It would be nice if code that I have optimised for speed were treated the same
way on both platforms, so that I wouldn't have to optimise it separately for
each compiler, but that is only my taste.
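
One possible way to avoid relying on either compiler's static ordering is to
state the expected branch explicitly. The following is only a minimal sketch,
not something taken from either compiler's documentation of switch handling.
It assumes that both compilers accept __builtin_expect (gcc documents it;
support in icc is assumed here from its gcc compatibility). The LIKELY and
UNLIKELY macros and the last_called variable are hypothetical names introduced
for illustration, and bar1() to bar3() are given stand-in bodies so that the
example is self-contained.

```cpp
// Hypothetical helper macros (not part of the original report).
// Assumption: gcc and icc both provide __builtin_expect.
#if defined(__GNUC__) || defined(__INTEL_COMPILER)
#  define LIKELY(x)   __builtin_expect(!!(x), 1)
#  define UNLIKELY(x) __builtin_expect(!!(x), 0)
#else
#  define LIKELY(x)   (x)
#  define UNLIKELY(x) (x)
#endif

// Stand-ins for the extern functions in the report; each one records
// which function ran so the behaviour can be checked.
static int last_called = 0;
static void bar1() { last_called = 1; }
static void bar2() { last_called = 2; }
static void bar3() { last_called = 3; }

// Same dispatch as the switch in the report, rewritten as an if/else chain
// so that each comparison can carry an explicit hint.
void foo(int i) {
  if (LIKELY(i == 1)) {
    bar1();  // hinted as the expected case
  } else if (UNLIKELY(i == 2)) {
    bar2();  // hinted as unexpected
  } else {
    bar3();  // remaining values
  }
}
```

Whether the hints change the generated layout would need to be checked by
comparing the assembly output (for example with -S) from each compiler, as
was done for the switch above.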
