[Bug ipa/116333] unused result of pure function is not optimized out because of inlining
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=116333 --- Comment #3 from Pali Rohár --- Hello Andrew, have you checked whether this is really a duplicate?
[Bug ipa/116333] unused result of pure function is not optimized out because of inlining
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=116333 --- Comment #2 from Pali Rohár --- Hello Andrew, you have wrote that this function is not optimized out because of inlining. I tried to mark _winshowcmd() function with __attribute__((noinline)) but result is similar. Function _winshowcmd() is present in the final executable but it is not called at all. It is really a duplicate? $ objdump -d pure.exe pure.exe: file format pei-i386 Disassembly of section .text: 00401000 <__winshowcmd>: 401000: 55 push %ebp 401001: 31 c0 xor%eax,%eax 401003: b9 11 00 00 00 mov$0x11,%ecx 401008: 89 e5 mov%esp,%ebp 40100a: 57 push %edi 40100b: 8d 7d b4lea-0x4c(%ebp),%edi 40100e: 83 ec 64sub$0x64,%esp 401011: f3 ab rep stos %eax,%es:(%edi) 401013: 8d 45 b4lea-0x4c(%ebp),%eax 401016: 89 04 24mov%eax,(%esp) 401019: ff 15 4c 30 40 00 call *0x40304c 40101f: 50 push %eax 401020: b8 0a 00 00 00 mov$0xa,%eax 401025: f6 45 e0 01 testb $0x1,-0x20(%ebp) 401029: 74 04 je 40102f <__winshowcmd+0x2f> 40102b: 0f b7 45 e4 movzwl -0x1c(%ebp),%eax 40102f: 8b 7d fcmov-0x4(%ebp),%edi 401032: c9 leave 401033: c3 ret 00401034 <___main>: 401034: c3 ret 00401035 <_WinMainCRTStartup>: 401035: 55 push %ebp 401036: b8 01 00 00 00 mov$0x1,%eax 40103b: 89 e5 mov%esp,%ebp 40103d: 83 ec 18sub$0x18,%esp 401040: 8d 55 f0lea-0x10(%ebp),%edx 401043: c7 45 f0 0e 20 40 00movl $0x40200e,-0x10(%ebp) 40104a: c7 45 f4 00 00 00 00movl $0x0,-0xc(%ebp) 401051: e8 12 00 00 00 call 401068 <_main> 401056: c9 leave 401057: c3 ret 00401058 <_GetStartupInfoA@4>: 401058: ff 25 4c 30 40 00 jmp*0x40304c 40105e: 90 nop 40105f: 90 nop 00401060 <_MessageBoxA@16>: 401060: ff 25 54 30 40 00 jmp*0x403054 401066: 90 nop 401067: 90 nop 00401068 <_main>: 401068: 57 push %edi 401069: 8d 7c 24 08 lea0x8(%esp),%edi 40106d: 83 e4 f0and$0xfff0,%esp 401070: ff 77 fcpushl -0x4(%edi) 401073: 55 push %ebp 401074: 89 e5 mov%esp,%ebp 401076: 57 push %edi 401077: 83 ec 14sub$0x14,%esp 40107a: e8 b5 ff ff ff call 401034 <___main> 40107f: c7 44 24 0c 00 00 
00movl $0x0,0xc(%esp) 401086: 00 401087: c7 44 24 08 00 20 40movl $0x402000,0x8(%esp) 40108e: 00 40108f: c7 44 24 04 06 20 40movl $0x402006,0x4(%esp) 401096: 00 401097: c7 04 24 00 00 00 00movl $0x0,(%esp) 40109e: ff 15 54 30 40 00 call *0x403054 4010a4: 8b 7d fcmov-0x4(%ebp),%edi 4010a7: 31 c0 xor%eax,%eax 4010a9: 83 ec 10sub$0x10,%esp 4010ac: c9 leave 4010ad: 8d 67 f8lea-0x8(%edi),%esp 4010b0: 5f pop%edi 4010b1: c3 ret 4010b2: 90 nop 4010b3: 90 nop 004010b4 <__CTOR_LIST__>: 4010b4: ff (bad) 4010b5: ff (bad) 4010b6: ff (bad) 4010b7: ff 00 incl (%eax) 4010b9: 00 00 add%al,(%eax) ... 004010bc <__DTOR_LIST__>: 4010bc: ff (bad) 4010bd: ff (bad) 4010be: ff (bad) 4010bf: ff 00 incl (%eax) 4010c1: 00 00 add%al,(%eax) ...
[Bug lto/116334] New: LTO dllimport generates ureferenced symbol and unused code
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=116334 Bug ID: 116334 Summary: LTO dllimport generates ureferenced symbol and unused code Product: gcc Version: 12.2.0 Status: UNCONFIRMED Severity: normal Priority: P3 Component: lto Assignee: unassigned at gcc dot gnu.org Reporter: pali at kernel dot org Target Milestone: --- When compiling single source application with calls external dllimport function then LTO generates unreferenced symbol and unused dead code in the final binary. When same source is compiled without LTO then there is no dead code in binary. Take following example: __declspec(dllimport) extern int __stdcall MessageBoxA(void *, const char *, const char *, unsigned int); __attribute__((used)) int WinMainCRTStartup(void) { MessageBoxA((void *)0, "Message", "Title", 0); return 0; } And compile it without LTO: $ i686-w64-mingw32-gcc unreferenced.c -o unreferenced.exe -mwindows -nostartfiles -nostdlib -luser32 -Wl,--disable-runtime-pseudo-reloc -W -Wall -Os Compiled binary under objdump looks like: $ objdump -d unreferenced.exe unreferenced.exe: file format pei-i386 Disassembly of section .text: 00401000 <_WinMainCRTStartup>: 401000: 55 push %ebp 401001: 31 c0 xor%eax,%eax 401003: 31 d2 xor%edx,%edx 401005: 89 e5 mov%esp,%ebp 401007: 83 ec 18sub$0x18,%esp 40100a: 89 44 24 0c mov%eax,0xc(%esp) 40100e: c7 44 24 08 00 20 40movl $0x402000,0x8(%esp) 401015: 00 401016: c7 44 24 04 06 20 40movl $0x402006,0x4(%esp) 40101d: 00 40101e: 89 14 24mov%edx,(%esp) 401021: ff 15 30 40 40 00 call *0x404030 401027: 31 c0 xor%eax,%eax 401029: 83 ec 10sub$0x10,%esp 40102c: c9 leave 40102d: c3 ret 40102e: 90 nop 40102f: 90 nop 00401030 <__CTOR_LIST__>: 401030: ff (bad) 401031: ff (bad) 401032: ff (bad) 401033: ff 00 incl (%eax) 401035: 00 00 add%al,(%eax) ... 00401038 <__DTOR_LIST__>: 401038: ff (bad) 401039: ff (bad) 40103a: ff (bad) 40103b: ff 00 incl (%eax) 40103d: 00 00 add%al,(%eax) ... Now compile it again but with enabled LTO with additional -flto switch. 
objdump shows: $ objdump -d unreferenced.exe unreferenced.exe: file format pei-i386 Disassembly of section .text: 00401000 <_WinMainCRTStartup>: 401000: 55 push %ebp 401001: 31 c0 xor%eax,%eax 401003: 31 d2 xor%edx,%edx 401005: 89 e5 mov%esp,%ebp 401007: 83 ec 18sub$0x18,%esp 40100a: 89 44 24 0c mov%eax,0xc(%esp) 40100e: c7 44 24 08 00 20 40movl $0x402000,0x8(%esp) 401015: 00 401016: c7 44 24 04 06 20 40movl $0x402006,0x4(%esp) 40101d: 00 40101e: 89 14 24mov%edx,(%esp) 401021: ff 15 30 40 40 00 call *0x404030 401027: 31 c0 xor%eax,%eax 401029: 83 ec 10sub$0x10,%esp 40102c: c9 leave 40102d: c3 ret 40102e: 90 nop 40102f: 90 nop 00401030 <_MessageBoxA@16>: 401030: ff 25 30 40 40 00 jmp*0x404030 401036: 90 nop 401037: 90 nop 00401038 <__CTOR_LIST__>: 401038: ff (bad) 401039: ff (bad) 40103a: ff (bad) 40103b: ff 00 incl (%eax) 40103d: 00 00 add%al,(%eax) ... 00401040 <__DTOR_LIST__>: 401040: ff (bad) 401041: ff (bad) 401042: ff (bad) 401043: ff 00 incl (%eax) 401045: 00 00 add%al,(%eax) ... At address 0x401030 there is dead code (nothing references address 0x401030) and also unreferenced symbol _MessageBoxA@16.
[Bug lto/116333] New: unused result of pure function is not optimized out
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=116333 Bug ID: 116333 Summary: unused result of pure function is not optimized out Product: gcc Version: 12.2.0 Status: UNCONFIRMED Severity: normal Priority: P3 Component: lto Assignee: unassigned at gcc dot gnu.org Reporter: pali at kernel dot org Target Milestone: --- If some function is marked with __attribute__((pure)) and return value of this function call is not used at all then gcc could optimize out and completely drop calling this function. This kind of optimization does not happen for example below which uses 3 source files with LTO enabled compilation (so gcc should see inter-file calls). $ cat pure0.c extern int main(); __attribute__((used)) int WinMainCRTStartup(void) { char * argv[] = { "argv0", (char *)0 }; return main(1, argv); } __attribute__((used)) void __main(void) {} $ cat pure1.c #define STARTF_USESHOWWINDOW 0x0001 #define SW_SHOWDEFAULT 10 typedef struct _STARTUPINFOA { unsigned int cb; char* lpReserved; char* lpDesktop; char* lpTitle; unsigned int dwX; unsigned int dwY; unsigned int dwXSize; unsigned int dwYSize; unsigned int dwXCountChars; unsigned int dwYCountChars; unsigned int dwFillAttribute; unsigned int dwFlags; unsigned short wShowWindow; unsigned short cbReserved2; unsigned char* lpReserved2; void* hStdInput; void* hStdOutput; void* hStdError; } STARTUPINFOA, *LPSTARTUPINFOA; __declspec(dllimport) extern void __stdcall GetStartupInfoA(LPSTARTUPINFOA); __attribute__((pure)) static int _winshowcmd (void) { STARTUPINFOA StartupInfo = {}; GetStartupInfoA(&StartupInfo); if (StartupInfo.dwFlags & STARTF_USESHOWWINDOW) return StartupInfo.wShowWindow; else return SW_SHOWDEFAULT; } extern int __stdcall WinMain(void *, void *, char *, int); extern unsigned char __ImageBase[]; int main() { return WinMain(&__ImageBase, (void *)0, "argv0", _winshowcmd()); } $ cat pure2.c __declspec(dllimport) extern int __stdcall MessageBoxA(void *, const char *, const char *, unsigned int); int __stdcall 
WinMain(void *instance __attribute__((unused)), void *prev_instance __attribute__((unused)), char *cmdln __attribute__((unused)), int showcmd __attribute__((unused))) { MessageBoxA((void *)0, "Message", "Title", 0); return 0; } Two additional helper files are needed to have example self-contained without any external dependency and also executable on windows: $ cat kernel32.def LIBRARY "kernel32.dll" EXPORTS GetStartupInfoA@4 $ cat user32.def LIBRARY "user32.dll" EXPORTS MessageBoxA@16 Compile these sources as: $ i686-w64-mingw32-dlltool -d kernel32.def -k -l libkernel32.a $ i686-w64-mingw32-dlltool -d user32.def -k -l libuser32.a $ i686-w64-mingw32-gcc pure0.c pure1.c pure2.c -o pure.exe -mwindows -nostartfiles -nostdlib libkernel32.a libuser32.a -Wl,--disable-runtime-pseudo-reloc -W -Wall -Os -flto objdump on compiled pure.exe shows: $ objdump -d pure.exe pure.exe: file format pei-i386 Disassembly of section .text: 00401000 <_WinMainCRTStartup>: 401000: 55 push %ebp 401001: 31 c0 xor%eax,%eax 401003: b9 11 00 00 00 mov$0x11,%ecx 401008: 89 e5 mov%esp,%ebp 40100a: 57 push %edi 40100b: 8d 7d b4lea-0x4c(%ebp),%edi 40100e: 83 ec 64sub$0x64,%esp 401011: f3 ab rep stos %eax,%es:(%edi) 401013: 8d 45 b4lea-0x4c(%ebp),%eax 401016: 89 04 24mov%eax,(%esp) 401019: ff 15 4c 40 40 00 call *0x40404c 40101f: 31 d2 xor%edx,%edx 401021: 31 c9 xor%ecx,%ecx 401023: 50 push %eax 401024: 89 54 24 0c mov%edx,0xc(%esp) 401028: c7 44 24 08 00 20 40movl $0x402000,0x8(%esp) 40102f: 00 401030: c7 44 24 04 06 20 40movl $0x402006,0x4(%esp) 401037: 00 401038: 89 0c 24mov%ecx,(%esp) 40103b: ff 15 54 40 40 00 call *0x404054 401041: 8b 7d fcmov-0x4(%ebp),%edi 401044: 31 c0 xor%eax,%eax 401046: 83 ec 10sub$0x10,%esp 401049: c9 leave 40104a: c3 ret 0040104b <___main>: 40104b: c3 ret 0040104c <_GetStartupInfoA@4>: 40104c: ff 25 4c 40 40 00 jmp*0x40404c 401052: 90 nop 401053: 90 nop 00401054 <_MessageBoxA@16>: 401054: ff 25 54 40 40 00 jmp*0x404054 40105a: 90 nop 40105b: 90
[Bug target/108866] Allow to pass Windows resource file (.rc) as input to gcc
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108866 --- Comment #9 from Pali Rohár --- (In reply to peter0x44 from comment #7) > 5) windres --help has this list of "supported targets": > x86_64-w64-mingw32-windres: supported targets: pe-x86-64 pei-x86-64 > pe-bigobj-x86-64 elf64-x86-64 pe-i386 pei-i386 elf32-i386 elf32-iamcu > elf64-little elf64-big elf32-little elf32-big srec symbolsrec verilog tekhex > binary ihex plugin I reported this particular issue into the binutils bugzilla: https://sourceware.org/bugzilla/show_bug.cgi?id=31543
[Bug middle-end/114449] bswap64 not optimized
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114449 --- Comment #3 from Pali Rohár --- Note that clang optimizes it just with -O2 and does not require any special pragma.
[Bug middle-end/114449] bswap64 not optimized
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114449 --- Comment #2 from Pali Rohár --- Interesting... I expected that -O3, or better -Ofast, would tell gcc to optimize the code as much as possible. I added that pragma before the for-loop in the first example and then gcc really did optimize the code to just a bswap instruction.
[Bug middle-end/114449] New: bswap64 not optimized
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114449

Bug ID: 114449
Summary: bswap64 not optimized
Product: gcc
Version: 13.2.0
Status: UNCONFIRMED
Severity: normal
Priority: P3
Component: middle-end
Assignee: unassigned at gcc dot gnu.org
Reporter: pali at kernel dot org
Target Milestone: ---

https://godbolt.org/z/dc3br9dYT

gcc 13.2 with -O3 does not recognize the following straightforward implementation of bswap64 and generates unoptimized code for it.

uint64_t bswap64_1(uint64_t num) {
    uint64_t ret = 0;
    for (size_t i = 0; i < sizeof(num); i++) {
        ret |= ((num >> (8*(sizeof(num)-1-i))) & 0xff) << (8*i);
    }
    return ret;
}

Rewriting the code to manually unroll the loop causes gcc to produce optimized code with a single "bswap" instruction on x86-64.

uint64_t bswap64_2(uint64_t num) {
    uint64_t ret = 0;
    ret |= (((num >> 56) & 0xff) << 0);
    ret |= (((num >> 48) & 0xff) << 8);
    ret |= (((num >> 40) & 0xff) << 16);
    ret |= (((num >> 32) & 0xff) << 24);
    ret |= (((num >> 24) & 0xff) << 32);
    ret |= (((num >> 16) & 0xff) << 40);
    ret |= (((num >> 8) & 0xff) << 48);
    ret |= (((num >> 0) & 0xff) << 56);
    return ret;
}

Adding the -funroll-all-loops argument for the first example does not help; gcc still produces unoptimized code.
[Bug middle-end/114448] New: Roundup not optimized
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114448

Bug ID: 114448
Summary: Roundup not optimized
Product: gcc
Version: 13.2.0
Status: UNCONFIRMED
Severity: normal
Priority: P3
Component: middle-end
Assignee: unassigned at gcc dot gnu.org
Reporter: pali at kernel dot org
Target Milestone: ---

https://godbolt.org/z/4fPKGzs1M

Straightforward code which rounds an unsigned number up to the next multiple of 4 is:

(num % 4 == 0) ? num : num + (4 - num % 4);

gcc -O2 generates:

mov edx, edi
mov eax, edi
and edx, -4
add edx, 4
test dil, 3
cmovne eax, edx
ret

This is not optimal. The test and conditional move can be avoided by using a double modulo:

num + (4 - num % 4) % 4;

for which gcc -O2 generates:

mov eax, edi
neg eax
and eax, 3
add eax, edi
ret

The optimal implementation of rounding up to a multiple of 4 uses a bit hack:

(num + 3) & ~3;

for which gcc -O2 generates:

lea eax, [rdi+3]
and eax, -4
ret
[Bug target/108866] Allow to pass Windows resource file (.rc) as input to gcc
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108866 --- Comment #8 from Pali Rohár ---

Thank you for the input. As you already figured out, there is a lot of work in this. And I think I'm not skilled enough to implement everything properly, so I would have to leave this to the gcc developers. I will answer the questions:

> 1) should gcc pass through any arguments to windres?
> -I --include-dir= Include directory when preprocessing rc file
> -D --define [=]Define SYM when preprocessing rc file
> -U --undefine Undefine SYM when preprocessing rc file

windres's -I, -D and -U are used when processing an "rc" file. So yes, gcc should propagate -I, -D and -U to windres for "rc" files (but not for "res" files).

> 2) does -m32 or -m64 need handling in any specific ways?

This is really a good question and I totally forgot about it. Because gcc's -m32 generates COFF for a different arch than gcc's -m64, the -m32/-m64 switches have to be propagated to windres. I think that gcc's -m32 and -m64 should be "converted" to windres's --target= option (with the correct argument).

> 3) the linker has -Wl, for passing arguments to it, does windres need an
> equivalent?

I think that it is not needed at all, because all windres flags should already have corresponding options in gcc.

> 4) windres --help says:
> FORMAT is one of rc, res, or coff, and is deduced from the file name
> should ".res" be handled too?

"rc" is the text resource format, "res" is the binary resource format, and "coff" is the PE/COFF object file format with a binary resource. windres has the option -J, which explicitly sets the input format (and then the extension is not used for deduction). So I think that the gcc driver should have rules for both the text (rc) and binary (res) formats. My "test.spec" experiment has rules for both formats.

> 5) windres --help has this list of "supported targets":
> x86_64-w64-mingw32-windres: supported targets: pe-x86-64 pei-x86-64
> pe-bigobj-x86-64 elf64-x86-64 pe-i386 pei-i386 elf32-i386 elf32-iamcu
> elf64-little elf64-big elf32-little elf32-big srec symbolsrec verilog tekhex
> binary ihex plugin
>
> Do they matter? I did not expect to see any "elf" on this list, because
> windows obviously doesn't use it.

This is surely a bug. ELF does not support embedding Windows resource files. Windows resources can be embedded only into a PE/COFF image file or into a PE/COFF object file. And AFAIK, windres supports parsing both PE/COFF image and object files, but can generate only a PE/COFF object file. So the windres target list certainly contains nonsense entries, and that is also the reason why you got those errors when you specified ELF.

> 6) does llvm-windres need to be considered at all? should there be a way to
> select it? an -fuse-rc= command option or so?

GNU windres is part of binutils, which also contains GNU as. So if gcc uses GNU as from binutils for assembling, then it should also use GNU windres from binutils for processing resources. In my opinion, usage of "windres" from gcc should be handled in the same way as usage of "as" from gcc. If gcc has a way to specify its own as binary, then it makes sense to allow specifying its own windres binary. But the gcc developers may have a different opinion.
[Bug target/108866] Allow to pass Windows resource file (.rc) as input to gcc
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108866 --- Comment #5 from Pali Rohár ---

Thank you for the info. I read that blog post and, based on those details, I adjusted the spec file

$ x86_64-w64-mingw32-gcc -dumpspecs > test.spec

by adding the following lines to test.spec:

.rc:
x86_64-w64-mingw32-windres -J rc -O coff -i %i %{c:%W{o*}%{!o*:-o %w%b%O}}%{!c:-o %d%w%u%O}

.res:
x86_64-w64-mingw32-windres -J res -O coff -i %i %{c:%W{o*}%{!o*:-o %w%b%O}}%{!c:-o %d%w%u%O}

rc files contain resources in text format and res files contain them in binary format.

With these changes x86_64-w64-mingw32-gcc was able to take both a .c and a .rc file as input and produce an .exe file with the resource.

$ cat test.c
int main() { return 0; }

$ cat test.rc
1 VERSIONINFO
BEGIN
END

$ x86_64-w64-mingw32-gcc -specs=test.spec test.c test.rc -o test.exe

Now show the resource stored in test.exe:

$ x86_64-w64-mingw32-windres -O rc test.exe /dev/stdout
/* Type: version Name: 1. */
LANGUAGE 9, 1
1 VERSIONINFO
BEGIN
END

Replacing the text test.rc file with a binary test.res file also works.

There is one problem with it. I had to hardcode the x86_64-w64-mingw32-windres name instead of just "windres". How do I declare the cross-compile prefix? gcc somehow adds it automatically for "as", because the spec file contains just "as", not "x86_64-w64-mingw32-as".
[Bug target/109317] -Os generates bigger code than -O2 on 32-bit ARM
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109317 --- Comment #3 from Pali Rohár --- Do you need some more input or test data about this issue?
[Bug target/108866] Allow to pass Windows resource file (.rc) as input to gcc
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108866 --- Comment #2 from Pali Rohár --- Andrew, I do not know what the gcc driver is or what needs to be done in it. But if you can give me some pointers, I can try it. Or if you need more details about files, usage, etc., please let me know.
[Bug target/108849] __declspec(code_seg("segname")) does not work
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108849 --- Comment #3 from Pali Rohár --- Arsen, so based on my understanding (please correct me if I'm wrong), gcc's "section" can be used on both code (functions) and data (global variables), while MS's "code_seg" can be used only on code (functions). So if gcc adds __declspec(code_seg("segname")) as an alias for __declspec(section("segname")) under TARGET_DECLSPEC, then it should be OK for valid source code. However, it would not throw a compile error if __declspec(code_seg("segname")) is specified on data. But I think that is acceptable. The primary motivation is support for compiling valid source code. Are you able to add this alias?
[Bug middle-end/114319] htobe64-like function is not optimized on 32-bit x86
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114319 --- Comment #8 from Pali Rohár --- Thanks for the quick response and the fix for this issue.
[Bug middle-end/114319] htobe64-like function is not optimized on 32-bit x86
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114319 --- Comment #3 from Pali Rohár --- For details, here is the compiler which produces the mentioned code: $ gcc -v Using built-in specs. COLLECT_GCC=gcc COLLECT_LTO_WRAPPER=/usr/lib/gcc/x86_64-linux-gnu/12/lto-wrapper OFFLOAD_TARGET_NAMES=nvptx-none:amdgcn-amdhsa OFFLOAD_TARGET_DEFAULT=1 Target: x86_64-linux-gnu Configured with: ../src/configure -v --with-pkgversion='Debian 12.2.0-14' --with-bugurl=file:///usr/share/doc/gcc-12/README.Bugs --enable-languages=c,ada,c++,go,d,fortran,objc,obj-c++,m2 --prefix=/usr --with-gcc-major-version-only --program-suffix=-12 --program-prefix=x86_64-linux-gnu- --enable-shared --enable-linker-build-id --libexecdir=/usr/lib --without-included-gettext --enable-threads=posix --libdir=/usr/lib --enable-nls --enable-clocale=gnu --enable-libstdcxx-debug --enable-libstdcxx-time=yes --with-default-libstdcxx-abi=new --enable-gnu-unique-object --disable-vtable-verify --enable-plugin --enable-default-pie --with-system-zlib --enable-libphobos-checking=release --with-target-system-zlib=auto --enable-objc-gc=auto --enable-multiarch --disable-werror --enable-cet --with-arch-32=i686 --with-abi=m64 --with-multilib-list=m32,m64,mx32 --enable-multilib --with-tune=generic --enable-offload-targets=nvptx-none=/build/gcc-12-bTRWOB/gcc-12-12.2.0/debian/tmp-nvptx/usr,amdgcn-amdhsa=/build/gcc-12-bTRWOB/gcc-12-12.2.0/debian/tmp-gcn/usr --enable-offload-defaulted --without-cuda-driver --enable-checking=release --build=x86_64-linux-gnu --host=x86_64-linux-gnu --target=x86_64-linux-gnu Thread model: posix Supported LTO compression algorithms: zlib zstd gcc version 12.2.0 (Debian 12.2.0-14) I guess that with these configure options you should be able to compile gcc which produces the mentioned code.
[Bug target/114319] New: htobe64-like function is not optimized on 32-bit x86
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114319

Bug ID: 114319
Summary: htobe64-like function is not optimized on 32-bit x86
Product: gcc
Version: 12.2.0
Status: UNCONFIRMED
Severity: normal
Priority: P3
Component: target
Assignee: unassigned at gcc dot gnu.org
Reporter: pali at kernel dot org
Target Milestone: ---
Target: x86

Here is a very simple and straightforward implementation of an htobe64 function which takes a 64-bit number stored in an unsigned long long variable and encodes it into a byte buffer unsigned char[].

void test1(unsigned long long val, unsigned char *buf) {
    buf[0] = val >> 56;
    buf[1] = val >> 48;
    buf[2] = val >> 40;
    buf[3] = val >> 32;
    buf[4] = val >> 24;
    buf[5] = val >> 16;
    buf[6] = val >> 8;
    buf[7] = val;
}

Compiling it for 64-bit x86 via "gcc -m64 -O2" produces optimized code:

<test1>:
   0: 48 0f cf        bswap %rdi
   3: 48 89 3e        mov %rdi,(%rsi)
   6: c3              retq

But compiling it for 32-bit x86 via "gcc -m32 -O2" produces not so optimized code:

<test1>:
   0: 8b 54 24 08     mov 0x8(%esp),%edx
   4: 8b 44 24 0c     mov 0xc(%esp),%eax
   8: 89 d1           mov %edx,%ecx
   a: 88 70 02        mov %dh,0x2(%eax)
   d: c1 e9 18        shr $0x18,%ecx
  10: 88 50 03        mov %dl,0x3(%eax)
  13: 88 08           mov %cl,(%eax)
  15: 89 d1           mov %edx,%ecx
  17: 8b 54 24 04     mov 0x4(%esp),%edx
  1b: c1 e9 10        shr $0x10,%ecx
  1e: 0f ca           bswap %edx
  20: 88 48 01        mov %cl,0x1(%eax)
  23: 89 50 04        mov %edx,0x4(%eax)
  26: c3              ret

I tried to compile it for 32-bit powerpc via "powerpc-linux-gnu-gcc -m32 -O2" and it produces optimized code:

<test1>:
   0: 90 65 00 00     stw r3,0(r5)
   4: 90 85 00 04     stw r4,4(r5)
   8: 4e 80 00 20     blr

Same for 64-bit powerpc via "powerpc-linux-gnu-gcc -m64 -O2":

<.test1>:
   0: f8 64 00 00     std r3,0(r4)
   4: 4e 80 00 20     blr

As a next experiment I tried to rewrite the simple implementation to use gcc builtins.

void test2(unsigned long long val, unsigned char *buf) {
#if __BYTE_ORDER__ == __ORDER_LITTLE_ENDIAN__
    val = __builtin_bswap64(val);
#endif
    __builtin_memcpy(buf, &val, sizeof(val));
}

If I compile it for 32-bit x86 then I get optimized code:

0030 <test2>:
  30: 8b 4c 24 0c     mov 0xc(%esp),%ecx
  34: 8b 44 24 04     mov 0x4(%esp),%eax
  38: 8b 54 24 08     mov 0x8(%esp),%edx
  3c: 0f c8           bswap %eax
  3e: 89 41 04        mov %eax,0x4(%ecx)
  41: 0f ca           bswap %edx
  43: 89 11           mov %edx,(%ecx)
  45: c3              ret

If I compile it for 64-bit x86 then I get exactly the same code as for test1:

0010 <test2>:
  10: 48 0f cf        bswap %rdi
  13: 48 89 3e        mov %rdi,(%rsi)
  16: c3              retq

I tried to compile it for powerpc too and the results for test1 and test2 were the same.

So it looks like the issue is specific to 32-bit x86: gcc does not detect that the test1 function on x86 is doing bswap64.

I have done all tests with the (amd64) Debian gcc, and for the powerpc target I used Debian's powerpc-linux-gnu-gcc cross compiler.
[Bug target/108849] __declspec(code_seg("segname")) does not work
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108849 --- Comment #2 from Pali Rohár --- `section` is the best option. MS says about it: https://learn.microsoft.com/en-us/cpp/cpp/code-seg-declspec > The code_seg declaration attribute names an executable text segment in the > .obj file in which the object code for the function or class member functions > is stored. > A segment is a named block of data in an .obj file that is loaded into memory > as a unit. A text segment is a segment that contains executable code. The > term section is often used interchangeably with segment. > By default, when no code_seg is specified, object code is put in a segment > named .text.
[Bug target/108851] gcc -pie generates unwanted PE export table
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108851

Pali Rohár changed:

What     | Removed                                            | Added
See Also | https://sourceware.org/bugzilla/show_bug.cgi?id=30004 | https://sourceware.org/bugzilla/show_bug.cgi?id=30922

--- Comment #4 from Pali Rohár --- No response here, so I reported it to the binutils bugtracker: https://sourceware.org/bugzilla/show_bug.cgi?id=30922
[Bug target/108853] Add new -mcpu=e500 alias for -mcpu=8540
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108853 --- Comment #5 from Pali Rohár --- Back to the original question: can gcc add a new option -mcpu=e500 as an alias for -mcpu=8540?
[Bug target/108851] gcc -pie generates unwanted PE export table
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108851 --- Comment #3 from Pali Rohár --- Or do you have any other suggestions?
[Bug target/109317] -Os generates bigger code than -O2 on 32-bit ARM
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109317 --- Comment #2 from Pali Rohár --- Any idea what can be done with this?
[Bug lto/109369] LTO drops explicitly referenced symbol _pei386_runtime_relocator
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109369 --- Comment #10 from Pali Rohár --- > I would suggest to move the bug to the Binutils Bugzilla. Done: https://sourceware.org/bugzilla/show_bug.cgi?id=30343
[Bug lto/109369] LTO drops explicitly referenced symbol _pei386_runtime_relocator
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109369 --- Comment #8 from Pali Rohár --- So from the discussion, do I understand correctly that this is rather an LD linker issue?
[Bug target/108851] gcc -pie generates unwanted PE export table
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108851 --- Comment #2 from Pali Rohár --- So should I report this issue to binutils bugtracker then?
[Bug lto/109369] LTO drops explicitly referenced symbol _pei386_runtime_relocator
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109369 --- Comment #4 from Pali Rohár --- I wanted to point out that marking the _pei386_runtime_relocator() function with __attribute__((used)) works fine. As for whether _pei386_runtime_relocator() should participate in LTO at all, I would rather ask: why not? Is there any specific reason why _pei386_runtime_relocator() should not be compiled with LTO? I would expect from gcc/ld that a whole application can be compiled with LTO.
[Bug lto/109368] LTO drops entry point symbol
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109368 --- Comment #4 from Pali Rohár --- Reported to binutils: https://sourceware.org/bugzilla/show_bug.cgi?id=30300
[Bug lto/109368] LTO drops entry point symbol
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109368 --- Comment #2 from Pali Rohár --- I do not know. The issue happens when LTO is enabled for GCC.
[Bug lto/109369] New: LTO drops explicitly referenced symbol _pei386_runtime_relocator
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109369

Bug ID: 109369
Summary: LTO drops explicitly referenced symbol _pei386_runtime_relocator
Product: gcc
Version: 12.2.0
Status: UNCONFIRMED
Severity: normal
Priority: P3
Component: lto
Assignee: unassigned at gcc dot gnu.org
Reporter: pali at kernel dot org
CC: marxin at gcc dot gnu.org
Target Milestone: ---
Target: Mingw32

When PE runtime-pseudo-reloc is used (e.g. referencing a member of a global array from a DLL library without it being marked as dllimport), LTO drops the _pei386_runtime_relocator symbol even when it is explicitly referenced from a used symbol, and then it complains that the _pei386_runtime_relocator symbol was dropped.

This is a bug because the LTO compiler 1) should not drop any symbol which is explicitly referenced from some used symbol and 2) should not drop the special _pei386_runtime_relocator symbol when it detected that PE runtime-pseudo-reloc is used.

Test case:

Create a simple DLL library with a global array arr[]:

$ cat arr.c
int arr[2] = { 1, 2 };

$ i686-w64-mingw32-gcc -shared arr.c -o arr.dll

Define a simple startup file for mingw (so that the full test case compiles without mingw). The function _pei386_runtime_relocator() is explicitly referenced from the startup function mainCRTStartup():

$ cat startup.c
extern void _pei386_runtime_relocator(void);
extern int main();
int __main() { }
__attribute__((force_align_arg_pointer))
__attribute__((noinline))
static int _mainCRTStartup(void) {
    _pei386_runtime_relocator();
    return main();
}
__attribute__((used)) /* required due to bug 109368 */
int mainCRTStartup(void) {
    return _mainCRTStartup();
}

Implement PE runtime-pseudo-reloc. For compile-only purposes (without runtime tests) it can be empty:

$ cat pseudo-reloc.c
void _pei386_runtime_relocator(void) { }

And finally a simple test program which uses the global array from the DLL library, not explicitly marked with dllimport.

$ cat main.c
extern int arr[];
int main() { return arr[1]; }

Without LTO this example compiles fine:

$ i686-w64-mingw32-gcc -Os -nostartfiles -nodefaultlibs -nostdlib startup.c pseudo-reloc.c main.c arr.dll -o test.exe

With LTO enabled this example does not compile because the explicitly referenced symbol is dropped:

$ i686-w64-mingw32-gcc -Os -nostartfiles -nodefaultlibs -nostdlib startup.c pseudo-reloc.c main.c arr.dll -o test.exe -flto
`__pei386_runtime_relocator' referenced in section `.rdata' of test_exe_ertr04.o: defined in discarded section `.text' of /tmp/ccDpfRvt.o (symbol from plugin)
collect2: error: ld returned 1 exit status
[Bug lto/109368] New: LTO drops entry point symbol
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109368 Bug ID: 109368 Summary: LTO drops entry point symbol Product: gcc Version: 12.2.0 Status: UNCONFIRMED Severity: normal Priority: P3 Component: lto Assignee: unassigned at gcc dot gnu.org Reporter: pali at kernel dot org CC: marxin at gcc dot gnu.org Target Milestone: --- Target: Mingw32 LTO for PE executables drops entry point symbol when the default entry point is used. There is no warning and just PE AddressOfEntryPoint is zeroed. Which results in broken PE binary. When non-default entry point is used and specified via -e option then LTO does not drop entry point symbol and generates working PE executable. Simple test case which does not use any system library or startup file: $ cat test-nostartfiles.c int mainCRTStartup(void) { return 0; } Default console binary has entry point mainCRTStartup() function (as hardcoded in LD sources). $ i686-w64-mingw32-gcc -Wall -Wextra -nostartfiles -nodefaultlibs -nostdlib test-nostartfiles.c -o test-nostartfiles.exe Without LTO it generates working PE binary which correctly returns 0 to system. It also has correct AddressOfEntryPoint field in PE: $ i686-w64-mingw32-objdump -p test-nostartfiles.exe | grep AddressOfEntryPoint AddressOfEntryPoint 1000 When compiling with LTO it does not throw any warning but generates broken PE binary: $ i686-w64-mingw32-gcc -Wall -Wextra -nostartfiles -nodefaultlibs -nostdlib test-nostartfiles.c -o test-nostartfiles.exe -flto Trying to run it, it crashes and has zeroed AddressOfEntryPoint: $ i686-w64-mingw32-objdump -p test-nostartfiles.exe | grep AddressOfEntryPoint AddressOfEntryPoint When non-default entry point is used (specified via -e option) then LTO works correctly and does not drop its entry point. 
$ cat test-nostartfiles2.c
int my_entry(void) { return 0; }

$ i686-w64-mingw32-gcc -Wall -Wextra -nostartfiles -nodefaultlibs -nostdlib -e _my_entry test-nostartfiles2.c -o test-nostartfiles2.exe -flto

$ i686-w64-mingw32-objdump -p test-nostartfiles2.exe | grep AddressOfEntryPoint
AddressOfEntryPoint 1000

The compiled binary works fine. So there is a bug in the LTO compiler: it drops the entry point symbol when the default one is used (i.e. when the entry point is not specified via the -e option).
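
For readers who want to check the AddressOfEntryPoint field without objdump, the following is a minimal sketch in C. It assumes the standard PE layout (e_lfanew at file offset 0x3c, a 4-byte "PE\0\0" signature, a 20-byte COFF header, and AddressOfEntryPoint 16 bytes into the optional header). The names read_le32 and pe_entry_point are hypothetical helpers written for this illustration; they are not part of GCC or binutils.

```c
#include <stddef.h>
#include <stdint.h>

/* Read a little-endian 32-bit value from a byte buffer. */
static uint32_t read_le32(const unsigned char *p)
{
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/*
 * Hypothetical helper: extract AddressOfEntryPoint from a PE image
 * that has been loaded into memory as raw file bytes.
 * Returns 0 on success, -1 if the buffer does not look like a PE file.
 */
int pe_entry_point(const unsigned char *buf, size_t len, uint32_t *entry)
{
    size_t pe_off;

    /* DOS header: "MZ" magic, e_lfanew stored at offset 0x3c. */
    if (len < 0x40 || buf[0] != 'M' || buf[1] != 'Z')
        return -1;
    pe_off = read_le32(buf + 0x3c);

    /* Need signature (4) + COFF header (20) + 20 bytes of optional header. */
    if (pe_off > len || len - pe_off < 4 + 20 + 20)
        return -1;
    if (buf[pe_off] != 'P' || buf[pe_off + 1] != 'E' ||
        buf[pe_off + 2] != 0 || buf[pe_off + 3] != 0)
        return -1;

    /* AddressOfEntryPoint sits 16 bytes into the optional header. */
    *entry = read_le32(buf + pe_off + 4 + 20 + 16);
    return 0;
}
```

Under these assumptions, a result of 0 in *entry would correspond to the zeroed AddressOfEntryPoint described in the report, and 0x1000 to the working binaries.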
[Bug target/109317] New: -Os generates bigger code than -O2 on 32-bit ARM
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109317

Bug ID: 109317
Summary: -Os generates bigger code than -O2 on 32-bit ARM
Product: gcc
Version: 12.2.0
Status: UNCONFIRMED
Severity: normal
Priority: P3
Component: target
Assignee: unassigned at gcc dot gnu.org
Reporter: pali at kernel dot org
Target Milestone: ---
Target: arm-linux-gnueabi

Simple loops like the one in the example below are better optimized for size on 32-bit ARM with -O2 option than -Os option.

$ cat test-arm.c
char *test1(char *ptr)
{
    while (*ptr != '\0' || *(ptr+1) != '\0')
        ptr++;
    ptr++;
    return ptr;
}

$ arm-linux-gnueabi-gcc -O2 -c test-arm.c && arm-linux-gnueabi-objdump -d test-arm.o

test-arm.o: file format elf32-littlearm

Disassembly of section .text:

:
   0: e5d02000 ldrb r2, [r0]
   4: e281 add r0, r0, #1
   8: e352 cmp r2, #0
   c: 1afb bne 0
  10: e5d02000 ldrb r2, [r0]
  14: e352 cmp r2, #0
  18: 1af8 bne 0
  1c: e12fff1e bx lr

$ arm-linux-gnueabi-gcc -Os -c test-arm.c && arm-linux-gnueabi-objdump -d test-arm.o

test-arm.o: file format elf32-littlearm

Disassembly of section .text:

:
   0: e1a03000 mov r3, r0
   4: e5d32000 ldrb r2, [r3]
   8: e281 add r0, r0, #1
   c: e352 cmp r2, #0
  10: 1afa bne 0
  14: e5d02000 ldrb r2, [r0]
  18: e352 cmp r2, #0
  1c: 1af7 bne 0
  20: e12fff1e bx lr

$ arm-linux-gnueabi-gcc -v
Using built-in specs.
COLLECT_GCC=arm-linux-gnueabi-gcc COLLECT_LTO_WRAPPER=/usr/lib/gcc-cross/arm-linux-gnueabi/12/lto-wrapper Target: arm-linux-gnueabi Configured with: ../src/configure -v --with-pkgversion='Debian 12.2.0-14' --with-bugurl=file:///usr/share/doc/gcc-12/README.Bugs --enable-languages=c,ada,c++,go,d,fortran,objc,obj-c++,m2 --prefix=/usr --with-gcc-major-version-only --program-suffix=-12 --enable-shared --enable-linker-build-id --libexecdir=/usr/lib --without-included-gettext --enable-threads=posix --libdir=/usr/lib --enable-nls --with-sysroot=/ --enable-clocale=gnu --enable-libstdcxx-debug --enable-libstdcxx-time=yes --with-default-libstdcxx-abi=new --enable-gnu-unique-object --disable-libitm --disable-libquadmath --disable-libquadmath-support --enable-plugin --enable-default-pie --with-system-zlib --enable-libphobos-checking=release --without-target-system-zlib --enable-multiarch --disable-sjlj-exceptions --with-arch=armv5te --with-float=soft --disable-werror --enable-checking=release --build=x86_64-linux-gnu --host=x86_64-linux-gnu --target=arm-linux-gnueabi --program-prefix=arm-linux-gnueabi- --includedir=/usr/arm-linux-gnueabi/include Thread model: posix Supported LTO compression algorithms: zlib zstd gcc version 12.2.0 (Debian 12.2.0-14)
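
To make concrete what the loop in test-arm.c computes: it walks forward until it finds two consecutive NUL bytes and returns a pointer to the second one. The sketch below repeats test1() from the report and adds block_size(), a hypothetical helper introduced only for this illustration, which uses it to measure a double-NUL-terminated block of strings.

```c
#include <stddef.h>

/* Function from the report, unchanged. */
char *test1(char *ptr)
{
    while (*ptr != '\0' || *(ptr+1) != '\0')
        ptr++;
    ptr++;
    return ptr;
}

/*
 * Hypothetical helper (not from the report): size in bytes of a
 * double-NUL-terminated block, counting both terminating NUL bytes.
 * The caller must guarantee that the block really ends with two NULs.
 */
size_t block_size(char *block)
{
    return (size_t)(test1(block) - block) + 1;
}
```

For the 7-byte block "ab\0cd\0\0", test1() returns a pointer to offset 6 (the second terminating NUL), so block_size() yields 7.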
[Bug target/108853] Add new -mcpu=e500 alias for -mcpu=8540
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108853 --- Comment #3 from Pali Rohár --- I'm still using processors with e500 cores with recent Linux kernel versions, and I also know other people who are still using them. Note that NXP still supports some QorIQ processors which have integrated e500 cores, so it is not true that they are no longer supported by Freescale/NXP. I know that e500 support was mostly removed from GCC, but something is still there. Due to this removal, LLVM and clang recently gained a usable e500v2 implementation. I was told that it was heavily tested on FreeBSD with desktop applications. musl libc also gained e500 support last year. So, no, the e500 cpu core is not dead and people still care about it.
[Bug c/108866] New: Allow to pass Windows resource file (.rc) as input to gcc
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108866

Bug ID: 108866
Summary: Allow to pass Windows resource file (.rc) as input to gcc
Product: gcc
Version: 12.2.0
Status: UNCONFIRMED
Severity: normal
Priority: P3
Component: c
Assignee: unassigned at gcc dot gnu.org
Reporter: pali at kernel dot org
Target Milestone: ---

Currently it is possible to pass a C source file, an assembler source file, an object file or a library to gcc as an input argument, and gcc calls the tools needed to compile and link all input files into one output binary. But gcc is currently not able to recognize a Windows resource text file (.rc) when it is passed as an input argument. See:

$ x86_64-w64-mingw32-gcc test-rsrc.rc
/usr/bin/x86_64-w64-mingw32-ld:test-rsrc.rc: file format not recognized; treating as linker script
/usr/bin/x86_64-w64-mingw32-ld:test-rsrc.rc:1: syntax error
collect2: error: ld returned 1 exit status

Currently the resource file first needs to be passed to the windres compiler, and then the output object file from windres can be specified as an input argument to gcc:

$ x86_64-w64-mingw32-windres --input-format=rc --output-format=coff --input=test-rsrc.rc --output=test-rsrc.o

It would be nice if gcc were able to call windres automatically for a resource text file to generate the object file, as it does for assembler sources.
[Bug c/108849] New: __declspec(code_seg("segname")) does not work
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108849

Bug ID: 108849
Summary: __declspec(code_seg("segname")) does not work
Product: gcc
Version: 12.2.0
Status: UNCONFIRMED
Severity: normal
Priority: P3
Component: c
Assignee: unassigned at gcc dot gnu.org
Reporter: pali at kernel dot org
Target Milestone: ---

Originally reported on: https://sourceware.org/bugzilla/show_bug.cgi?id=30005

GCC/LD does not support the __declspec(code_seg("segname")) declarator for specifying the name of a PE/COFF segment. Instead, GCC/LD supports a different, custom syntax, __declspec(section("segname")), which is incompatible with other compilers such as MSVC.

Please add support for the de-facto standard "code_seg" declarator to the PE/COFF __declspec keyword, rather than only the custom declarator. The custom declarator does not bring any value; it just makes code incompatible with gcc.

Test case on Debian sid:

$ x86_64-w64-mingw32-ld -v
GNU ld (GNU Binutils) 2.39.90.20230110
$
$ x86_64-w64-mingw32-gcc -v
Using built-in specs.
COLLECT_GCC=x86_64-w64-mingw32-gcc
COLLECT_LTO_WRAPPER=/usr/lib/gcc/x86_64-w64-mingw32/12-win32/lto-wrapper
Target: x86_64-w64-mingw32
Configured with: ../../src/configure --build=x86_64-linux-gnu --prefix=/usr --includedir='/usr/include' --mandir='/usr/share/man' --infodir='/usr/share/info' --sysconfdir=/etc --localstatedir=/var --disable-option-checking --disable-silent-rules --libdir='/usr/lib/x86_64-linux-gnu' --libexecdir='/usr/lib/x86_64-linux-gnu' --disable-maintainer-mode --disable-dependency-tracking --prefix=/usr --enable-shared --enable-static --disable-multilib --with-system-zlib --libexecdir=/usr/lib --without-included-gettext --libdir=/usr/lib --enable-libstdcxx-time=yes --with-tune=generic --with-headers --enable-version-specific-runtime-libs --enable-fully-dynamic-string --enable-libgomp --enable-languages=c,c++,fortran,objc,obj-c++,ada --enable-lto --enable-threads=win32 --program-suffix=-win32 --program-prefix=x86_64-w64-mingw32- --target=x86_64-w64-mingw32 --with-as=/usr/bin/x86_64-w64-mingw32-as --with-ld=/usr/bin/x86_64-w64-mingw32-ld --enable-libatomic --enable-libstdcxx-filesystem-ts=yes --enable-dependency-tracking SED=/bin/sed
Thread model: win32
Supported LTO compression algorithms: zlib
gcc version 12-win32 (GCC)
$
$ cat test-code-seg.c
__declspec(code_seg("segname"))
int test(void) { return 0; }
$
$ x86_64-w64-mingw32-gcc -c test-code-seg.c -o test-code-seg.o
test-code-seg.c:2:1: warning: 'code_seg' attribute directive ignored [-Wattributes]
    2 | int test(void) { return 0; }
      | ^~~
$
$ x86_64-w64-mingw32-objdump -h test-code-seg.o | grep segname
$
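
Until such support exists, code that must build with both compilers has to hide the difference behind a macro. The following is only a sketch of that approach: CODE_SEG is a hypothetical macro name chosen for this illustration (no compiler provides it), and the non-MSVC branch uses GCC's generic section attribute, which is assumed here to be an acceptable stand-in for code_seg.

```c
/*
 * Hypothetical portability macro, not provided by GCC or MSVC.
 * MSVC branch: the code_seg declarator from the report.
 * Other compilers: GCC's section attribute (an assumption of this sketch).
 */
#if defined(_MSC_VER)
#define CODE_SEG(name) __declspec(code_seg(name))
#else
#define CODE_SEG(name) __attribute__((section(name)))
#endif

/* Same function as in test-code-seg.c, placed via the macro. */
CODE_SEG("segname")
int test(void) { return 0; }
```

With a wrapper like this, the source compiles without the -Wattributes warning on GCC, but every project has to carry its own macro, which is the incompatibility the report asks to remove.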
[Bug target/108853] New: Add new -mcpu=e500 alias for -mcpu=8540
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108853

Bug ID: 108853
Summary: Add new -mcpu=e500 alias for -mcpu=8540
Product: gcc
Version: 12.2.0
Status: UNCONFIRMED
Severity: normal
Priority: P3
Component: target
Assignee: unassigned at gcc dot gnu.org
Reporter: pali at kernel dot org
Target Milestone: ---
Target: powerpc*

To compile code for the powerpc e500 core, the -mcpu=8540 option has to be specified. The name 8540 refers to the MPC8540 SoC, which was the first released HW product with a powerpc e500 core. All other powerpc -mcpu options in gcc specify core names and not SoC/product names. So for consistent naming I propose adding a new option -mcpu=e500 as an alias for -mcpu=8540.

Note that other projects like binutils/as and LLVM use the name "e500" for specifying the e500 core, not "8540" as gcc does.
[Bug c/108852] New: Add gcc option for building NT kernel driver
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108852

Bug ID: 108852
Summary: Add gcc option for building NT kernel driver
Product: gcc
Version: 12.2.0
Status: UNCONFIRMED
Severity: normal
Priority: P3
Component: c
Assignee: unassigned at gcc dot gnu.org
Reporter: pali at kernel dot org
Target Milestone: ---

gcc already contains options for building different types of PE binaries: -mconsole for a console executable; -mwindows for a GUI executable; -mdll for a DLL library; ...

What is missing is an option for building an NT kernel driver. MSVC link.exe has the /DRIVER option for this. It would be nice to have such an option in gcc as well, one which sets everything required for building an NT kernel driver: not linking startup files, setting the image base address and alignment, setting the correct entry point, and setting the PE Native subsystem.
[Bug c/108851] New: gcc -pie generates unwanted PE export table
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108851

Bug ID: 108851
Summary: gcc -pie generates unwanted PE export table
Product: gcc
Version: 12.2.0
Status: UNCONFIRMED
Severity: normal
Priority: P3
Component: c
Assignee: unassigned at gcc dot gnu.org
Reporter: pali at kernel dot org
Target Milestone: ---

When gcc is invoked with the -pie option, it automatically generates an export table for PE executables, even when the executable does not export anything.

Test case:

$ cat test-pie.c
int func(void) { return 42; }
int main() { return func(); }

$ x86_64-w64-mingw32-gcc -pie test-pie.c -o test-pie.exe

$ x86_64-w64-mingw32-objdump -p test-pie.exe | grep -A 20 'There is an export table'
There is an export table in .edata at 0x140008000

The Export Tables (interpreted .edata section contents)

Export Flags 0
Time/Date stamp 63f2a29f
Major/Minor 0/0
Name 8028 test-pie.exe
Ordinal Base 1
Number in:
    Export Address Table
    [Name Pointer/Ordinal] Table
Table Addresses
    Export Address Table 8028
    Name Pointer Table 8028
    Ordinal Table 8028

Export Address Table -- Ordinal Base 1

[Ordinal/Name Pointer] Table

Without gcc's -pie option, the executable does not have an export table.

Note that a similar issue was also reported against LD (https://sourceware.org/bugzilla/show_bug.cgi?id=30004), and the proposed LD patch does not change the behavior described in this issue.
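
The presence of an export table can also be checked programmatically by looking at the first entry of the PE data directory. The sketch below assumes the standard optional header layout: magic 0x10b (PE32) puts the data directory at offset 96 of the optional header, magic 0x20b (PE32+) at offset 112, with NumberOfRvaAndSizes in the 4 bytes just before it. The function pe_has_export_table and its helpers are hypothetical names for this illustration, and for brevity the sketch does not validate SizeOfOptionalHeader.

```c
#include <stddef.h>
#include <stdint.h>

/* Little-endian readers for raw PE file bytes. */
static uint16_t read_le16(const unsigned char *p)
{
    return (uint16_t)(p[0] | (p[1] << 8));
}

static uint32_t read_le32(const unsigned char *p)
{
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/*
 * Hypothetical helper: returns 1 if the export data directory entry
 * (index 0) has a non-zero RVA and size, 0 if it is empty or absent,
 * and -1 if the buffer does not look like a PE file.
 */
int pe_has_export_table(const unsigned char *buf, size_t len)
{
    size_t pe_off, remaining, dir_off;
    const unsigned char *opt;
    uint16_t magic;

    if (len < 0x40 || buf[0] != 'M' || buf[1] != 'Z')
        return -1;
    pe_off = read_le32(buf + 0x3c);               /* e_lfanew */
    if (pe_off > len || len - pe_off < 4 + 20 + 2)
        return -1;
    if (buf[pe_off] != 'P' || buf[pe_off + 1] != 'E' ||
        buf[pe_off + 2] != 0 || buf[pe_off + 3] != 0)
        return -1;

    opt = buf + pe_off + 4 + 20;                  /* optional header */
    remaining = len - pe_off - 4 - 20;
    magic = read_le16(opt);
    if (magic == 0x10b)
        dir_off = 96;                             /* PE32 */
    else if (magic == 0x20b)
        dir_off = 112;                            /* PE32+ */
    else
        return -1;
    if (remaining < dir_off + 8)
        return -1;

    if (read_le32(opt + dir_off - 4) == 0)        /* NumberOfRvaAndSizes */
        return 0;
    return read_le32(opt + dir_off) != 0 &&       /* export RVA */
           read_le32(opt + dir_off + 4) != 0;     /* export size */
}
```

Under these assumptions, the -pie binary from the test case would be expected to yield 1 and the binary built without -pie to yield 0.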