https://gcc.gnu.org/bugzilla/show_bug.cgi?id=124684
Bug ID: 124684
Summary: Pathological timing for -O0 -fpie when an automatic
structure points to automatic data.
Product: gcc
Version: 16.0
Status: UNCONFIRMED
Severity: normal
Priority: P3
Component: middle-end
Assignee: unassigned at gcc dot gnu.org
Reporter: rdubner at gcc dot gnu.org
Target Milestone: ---
Created attachment 64070
--> https://gcc.gnu.org/bugzilla/attachment.cgi?id=64070&action=edit
Python script that creates the C++ program showing the pathological behavior.
Compiling code with "-ggdb -O0 -fpie" where one automatic object stores
the address of another automatic object triggers pathological compile
time, nearly all of it in the "thread pro- & epilogue" pass. Passing
-fno-pie or -O2 restores linear behavior.
I am aware that in C/C++, functions with tens of thousands of lines of
code are rare. They are routine in COBOL source code, which is why I
encountered this behavior.
The compiler used was GCC-16, configured with
--enable-languages=cobol --disable-multilib --disable-bootstrap
The timings below are in seconds. The machine was an x86_64 running
at 3 to 5 GHz.
compiled with -ggdb -O0 -fpie
                        struct     struct
       N  no struct    no data  with data
  10,000        1.4        1.3        7.9
  20,000        3.1        3.1       56.2
  40,000        7.4        7.4      309.0
  80,000       16.4       16.6
 160,000       40.3       40.8
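
As a rough check on the growth rate, the sketch below estimates the exponent k in t ~ N^k from each pair of successive rows in the -O0 -fpie table. The helper names (`growth_exponent`, `exponents`) are illustrative and not part of the attached script. For the "struct with data" column the exponents come out at roughly 2.8 and 2.5, while the "no struct" column stays between about 1.1 and 1.3.

```python
import math


def growth_exponent(n1, t1, n2, t2):
    """Estimate k in t ~ N**k from two (N, seconds) measurements."""
    return math.log(t2 / t1) / math.log(n2 / n1)


def exponents(samples):
    """Return the estimated exponent for each pair of successive samples."""
    return [
        growth_exponent(n1, t1, n2, t2)
        for (n1, t1), (n2, t2) in zip(samples, samples[1:])
    ]


# Timings copied from the -ggdb -O0 -fpie table in this report.
pie_with_data = [(10_000, 7.9), (20_000, 56.2), (40_000, 309.0)]
pie_no_struct = [
    (10_000, 1.4),
    (20_000, 3.1),
    (40_000, 7.4),
    (80_000, 16.4),
    (160_000, 40.3),
]

for label, samples in (
    ("struct with data", pie_with_data),
    ("no struct", pie_no_struct),
):
    print(label, [round(k, 2) for k in exponents(samples)])
```

An exponent above 2 over each doubling of N is consistent with the "worse than quadratic" comment in the code template.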
The significant lines from -ftime-report-details on the 56-second compilation:
Time variable                          wall           GGC
 phase parsing              :   0.36 (  1%)   139M ( 45%)
 phase opt and generate     :  55.62 ( 99%)   167M ( 54%)
 thread pro- & epilogue     :  53.08 ( 95%)   1728 (  0%)
 `- CFG verifier            :   0.03 (  0%)      0 (  0%)
 `- verify RTL sharing      :   0.01 (  0%)      0 (  0%)
 `- cfg cleanup             :   0.01 (  0%)      0 (  0%)
 `- df live regs            :   0.01 (  0%)      0 (  0%)
 TOTAL                      :  56.00          308M
Extra diagnostic checks enabled; compiler may run slowly.
Here are the linear timings for -fno-pie and -O2 compilations:
compiled with -ggdb -O0 -fno-pie
                        struct     struct
       N  no struct    no data  with data
  10,000        1.4        1.4        1.4
  20,000        3.2        3.2        3.1
  40,000        7.1        7.3        7.3
  80,000       16.5       16.7       16.7
 160,000       40.2       40.7       41.4
compiled with -O2 -fpie
                        struct     struct
       N  no struct    no data  with data
  10,000        0.4        0.4        0.5
  20,000        1.0        1.0        1.0
  40,000        2.3        2.3        2.3
  80,000        5.7        5.7        5.7
 160,000       15.3       15.3       15.4
compiled with -O2 -fno-pie
                        struct     struct
       N  no struct    no data  with data
  10,000        0.5        0.5        0.5
  20,000        1.0        1.0        1.0
  40,000        2.3        2.4        2.4
  80,000        5.7        5.7        5.6
 160,000       15.3       15.3       15.4
This code template demonstrates the pathological behavior:
~~~~~~~~~~~~~~~~~~~~~~
typedef struct cblc_field_t
{
  unsigned char *data;
  const char *name;
} cblc_field_t;

int
main()
{
  // This is linear with -fno-pie, but worse than quadratic with -fpie
  unsigned char funky_data0[16] = {};
  cblc_field_t funky0 = {funky_data0, "funky_name"};
  int x,y,z;
  // The following is repeated N times
  x=0;if(x&1)y=x; else z=x;
  x=1;if(x&1)y=x; else z=x;
  ...
  x=N-1;if(x&1)y=x; else z=x;
}
~~~~~~~~~~~~~~~~~~~~~~
The Python script I used to generate the C/C++ source code used
in creating those timing tables is attached.
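
For readers without access to the attachment, the following is a minimal sketch of a generator that expands the template above for a given N. It is not the attached script. The function name `generate_source`, the variant names, and in particular the shape of the "no struct" and "struct no data" variants are assumptions made for illustration; the report shows only the "struct with data" form.

```python
def generate_source(n, variant="struct_with_data"):
    """Return C++ source text that expands the report's template for n lines.

    variant is one of "no_struct", "struct_no_data", "struct_with_data".
    Only "struct_with_data" is taken from the report; the other two are
    guesses at what the corresponding table columns measured.
    """
    lines = [
        "typedef struct cblc_field_t",
        "{",
        "  unsigned char *data;",
        "  const char *name;",
        "} cblc_field_t;",
        "",
        "int",
        "main()",
        "{",
    ]
    if variant == "struct_with_data":
        # As in the report: an automatic struct holding the address of
        # automatic data.
        lines.append("  unsigned char funky_data0[16] = {};")
        lines.append('  cblc_field_t funky0 = {funky_data0, "funky_name"};')
    elif variant == "struct_no_data":
        # Assumption: the struct exists but does not point at automatic data.
        lines.append('  cblc_field_t funky0 = {nullptr, "funky_name"};')
    elif variant != "no_struct":
        raise ValueError(f"unknown variant: {variant}")
    lines.append("  int x,y,z;")
    # The repeated statement from the template, once per value 0..n-1.
    for i in range(n):
        lines.append(f"  x={i};if(x&1)y=x; else z=x;")
    lines.append("}")
    return "\n".join(lines) + "\n"
```

Writing the result of `generate_source(20000)` to a file and compiling it with the options listed in the tables should give a comparable test case, assuming the variants above resemble those used for the measurements.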