https://gcc.gnu.org/bugzilla/show_bug.cgi?id=124684

            Bug ID: 124684
           Summary: Pathological timing for -O0 -fpie when an automatic
                    structure points to automatic data.
           Product: gcc
           Version: 16.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: middle-end
          Assignee: unassigned at gcc dot gnu.org
          Reporter: rdubner at gcc dot gnu.org
  Target Milestone: ---

Created attachment 64070
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=64070&action=edit
Python script that creates the C++ program showing the pathological behavior.

Compiling code with "-ggdb -O0 -fpie" where one automatic object stores
the address of another automatic object triggers pathological compile
time in the "thread pro- & epilogue" pass. Passing -fno-pie or -O2
restores linear behavior.

I am aware that in C/C++, functions with tens of thousands of lines of
code are rare.  They are routine in COBOL source code, which is why I
encountered this behavior.

The compiler used was GCC-16, configured with

--enable-languages=cobol --disable-multilib --disable-bootstrap

The timings below are in seconds.  The computer was an x86_64 machine
running at 3 to 5 GHz.

compiled with -ggdb -O0 -fpie
                          struct     struct
      N    no struct     no data  with data
 10,000          1.4        1.3         7.9
 20,000          3.1        3.1        56.2
 40,000          7.4        7.4       309.0
 80,000         16.4       16.6
160,000         40.3       40.8
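
As a rough check on how fast the "struct with data" column grows, the
sketch below estimates the exponent k in time ~ N**k from each pair of
consecutive rows.  It uses only the three timings from the table above;
the helper name scaling_exponents is made up for illustration.

```python
import math

# Wall-clock seconds from the "-ggdb -O0 -fpie" table,
# "struct with data" column (only three rows were measured).
timings = {10_000: 7.9, 20_000: 56.2, 40_000: 309.0}

def scaling_exponents(samples):
    """Return k such that time ~ N**k for each consecutive pair of sizes."""
    sizes = sorted(samples)
    return [
        math.log(samples[b] / samples[a]) / math.log(b / a)
        for a, b in zip(sizes, sizes[1:])
    ]

exponents = scaling_exponents(timings)
for k in exponents:
    print(f"{k:.2f}")
```

Both exponents come out above 2 (roughly 2.8 and 2.5), which is what
"worse than quadratic" refers to.  With only three data points this is
an indication, not a fit.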

The significant lines from -ftime-report-details on the 56-second compilation:

Time variable                                  wall           GGC
 phase parsing                      :   0.36 (  1%)   139M ( 45%)
 phase opt and generate             :  55.62 ( 99%)   167M ( 54%)
 thread pro- & epilogue             :  53.08 ( 95%)  1728  (  0%)
 `- CFG verifier                    :   0.03 (  0%)     0  (  0%)
 `- verify RTL sharing              :   0.01 (  0%)     0  (  0%)
 `- cfg cleanup                     :   0.01 (  0%)     0  (  0%)
 `- df live regs                    :   0.01 (  0%)     0  (  0%)
 TOTAL                              :  56.00          308M
Extra diagnostic checks enabled; compiler may run slowly.

Here are the linear timings for -fno-pie and -O2 compilations:

compiled with -ggdb -O0 -fno-pie
                          struct     struct
      N    no struct     no data  with data
 10,000         1.4        1.4         1.4
 20,000         3.2        3.2         3.1
 40,000         7.1        7.3         7.3
 80,000        16.5       16.7        16.7
160,000        40.2       40.7        41.4

compiled with -O2 -fpie
                          struct     struct
      N    no struct     no data  with data
 10,000           0.4        0.4        0.5
 20,000           1.0        1.0        1.0
 40,000           2.3        2.3        2.3
 80,000           5.7        5.7        5.7
160,000          15.3       15.3       15.4

compiled with -O2 -fno-pie
                          struct     struct
      N    no struct     no data  with data
 10,000         0.5        0.5         0.5
 20,000         1.0        1.0         1.0
 40,000         2.3        2.4         2.4
 80,000         5.7        5.7         5.6
160,000        15.3       15.3        15.4

This code template demonstrates the pathological behavior:

~~~~~~~~~~~~~~~~~~~~~~
typedef struct cblc_field_t
  {
  unsigned char *data;
  const char    *name;
  } cblc_field_t;

int
main()
  {
  // This is linear with -fno-pie, but worse than quadratic with -fpie
  unsigned char funky_data0[16] = {};
  cblc_field_t funky0 = {funky_data0, "funky_name"};

  int x,y,z;

  // The following is repeated N times
  x=0;if(x&1)y=x; else z=x;
  x=1;if(x&1)y=x; else z=x;
  ...
  x=N-1;if(x&1)y=x; else z=x;
  }
~~~~~~~~~~~~~~~~~~~~~~

The Python script I used to generate the C/C++ source code used
in creating those timing tables is attached.
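
For readers without access to the attachment, here is a minimal sketch
of a generator for the "struct with data" case.  It is not the attached
script; it only expands the template above for a given N, and the
function name generate_source is an assumption made for illustration.

```python
def generate_source(n):
    """Return C++ source that repeats the template's statement n times.

    This mirrors the "struct with data" template from the report; it is
    an illustrative sketch, not the script attached to the bug.
    """
    lines = [
        "typedef struct cblc_field_t",
        "  {",
        "  unsigned char *data;",
        "  const char    *name;",
        "  } cblc_field_t;",
        "",
        "int",
        "main()",
        "  {",
        # One automatic object holding the address of another one.
        "  unsigned char funky_data0[16] = {};",
        '  cblc_field_t funky0 = {funky_data0, "funky_name"};',
        "",
        "  int x,y,z;",
        "",
    ]
    # The repeated body: one assignment and one branch per value of x.
    for i in range(n):
        lines.append(f"  x={i};if(x&1)y=x; else z=x;")
    lines.append("  }")
    return "\n".join(lines) + "\n"

# Small demonstration; the timing tables use N from 10,000 to 160,000.
print(generate_source(3))
```

Writing the result for a large N to a file and compiling it with
"-ggdb -O0 -fpie" should exercise the same code shape as the template.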
