https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112615

            Bug ID: 112615
           Summary: gcc incorrectly assumes char *x[2]={"str1", "str2"} has
                    16-byte minimum alignment and generates SSE instructions
                    (e.g. movaps) when accessing this data
           Product: gcc
           Version: 13.2.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: c
          Assignee: unassigned at gcc dot gnu.org
          Reporter: gandalf at winds dot org
  Target Milestone: ---

I ran into the following problem while trying to get Oracle 21c to run on Gentoo OS under glibc-2.38 and GCC 13.2 on x86-64 with SSE instructions enabled.

glibc-2.38's time/tzset.c file (see https://github.com/bminor/glibc/blob/master/time/tzset.c) has the following non-static declaration on line 31:

  char *__tzname[2] = { (char *) "GMT", (char *) "GMT" };

GCC wrongly assumes that this variable __tzname has a minimum alignment of 16 bytes instead of 8. GCC thus generates the following assembly instructions for this portion of __tzset_parse_tz() when compiling with -march=x86-64:

327       /* Get the standard time zone abbreviations.  */
328       if (parse_tzname (&tz, 0) && parse_offset (&tz, 0))
   0x0000000000000a6f <+63>:    movups %xmm0,0x0(%rip)        # 0xa76 <__tzset_parse_tz+70>
   0x0000000000000a76 <+70>:    movups %xmm0,0x0(%rip)        # 0xa7d <__tzset_parse_tz+77>
   0x0000000000000a7d <+77>:    movups %xmm0,0x0(%rip)        # 0xa84 <__tzset_parse_tz+84>
   0x0000000000000a84 <+84>:    movups %xmm0,0x0(%rip)        # 0xa8b <__tzset_parse_tz+91>
   0x0000000000000a8b <+91>:    call   0x440 <parse_tzname>
   0x0000000000000a90 <+96>:    test   %al,%al
   0x0000000000000a92 <+98>:    jne    0xac8 <__tzset_parse_tz+152>
   0x0000000000000a94 <+100>:   movq   0x0(%rip),%xmm0        # 0xa9c <__tzset_parse_tz+108>

132       __tzname[1] = (char *) tz_rules[1].name;
   0x0000000000000a9c <+108>:   xor    %eax,%eax
   0x0000000000000a9e <+110>:   xor    %edx,%edx
   0x0000000000000aa0 <+112>:   pinsrq $0x1,0x0(%rip),%xmm0        # 0xaab <__tzset_parse_tz+123>

129       __daylight = tz_rules[0].offset != tz_rules[1].offset;
   0x0000000000000aab <+123>:   mov    %edx,0x0(%rip)        # 0xab1 <__tzset_parse_tz+129>

130       __timezone = -tz_rules[0].offset;
   0x0000000000000ab1 <+129>:   mov    %rax,0x0(%rip)        # 0xab8 <__tzset_parse_tz+136>

131       __tzname[0] = (char *) tz_rules[0].name;
132       __tzname[1] = (char *) tz_rules[1].name;
   0x0000000000000ab8 <+136>:   movaps %xmm0,0x0(%rip)        # 0xabf <__tzset_parse_tz+143>

In the above, lines 131 and 132 are combined into a single "movaps" instruction, which requires its memory operand to be 16-byte aligned.

The movaps instruction causes a segmentation fault when all of the following hold:

  - A C program defines its own variable called __tzname that is not 16-byte aligned (char * only requires 8-byte alignment).
  - The program is linked against glibc, so the locally defined __tzname overrides the one declared in glibc.
  - The if (parse_tzname ()) check on line 328 fails because of an invalid TZ environment variable setting (as is the case when using Oracle 21c on Gentoo).

Here is an example test.c C program:

  #include <stdio.h>
  #include <time.h>

  /* Specifically align __tzname to a non-16-byte boundary */
  __attribute__((aligned(8))) char *__tzname[2]={"GMT", "GMT"};
  char *x="xx";  // This is here to take up the first 8 bytes in .data

  int main()
  {
    struct tm tm={};
    printf("%ld\n", mktime(&tm));
    return 0;
  }

  $ gcc -O3 -march=x86-64 test.c -o test -Wall -ggdb3
  $ nm test | grep __tzname
  0000000000404028 D __tzname
  $ ./test
  -2209057200
  $ TZ=xx ./test
  Segmentation fault (core dumped)

Removing the __attribute__((aligned(8))) from the test.c program, as follows, changes the result:

  #include <stdio.h>
  #include <time.h>

  /* GCC now aligns __tzname to 16 bytes */
  char *__tzname[2]={"GMT", "GMT"};
  char *x="xx";

  int main()
  {
    struct tm tm={};
    printf("%ld\n", mktime(&tm));
    return 0;
  }

  $ gcc -O3 -march=x86-64 test.c -o test -Wall -ggdb3
  $ nm test | grep __tzname
  0000000000404030 D __tzname
  $ ./test
  -2209057200
  $ TZ=xx ./test
  -2209057200

In the examples above, the "x" variable is used to consume 8 bytes in .data, so that the next available address for "__tzname" is 0x404028.

Assuming a minimum alignment of 16 for __tzname only makes sense when GCC is compiling the whole program or when __tzname is static. GCC should not make this assumption when the variable is non-static (as is the case when tzset.o is compiled inside the glibc source package), because another object file may provide the definition that is actually used at run time.

I should clarify that placing the variable at a 16-byte-aligned address where it is defined can be beneficial for data layout relative to cache-line boundaries, so that optimization should likely remain. But the instructions that access this data must assume an 8-byte minimum alignment here, not 16.