[Bug libstdc++/100128] New: Behavior and performance depends on order of ctype.h and stdlib.h include
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100128

Bug ID: 100128
Summary: Behavior and performance depends on order of ctype.h and stdlib.h include
Product: gcc
Version: 10.3.0
Status: UNCONFIRMED
Severity: normal
Priority: P3
Component: libstdc++
Assignee: unassigned at gcc dot gnu.org
Reporter: travis.downs at gmail dot com
Target Milestone: ---

When ctype.h is included as the first header in a file, it will be processed without __NO_CTYPE being defined. This results in several differences versus the case where __NO_CTYPE is defined.

For example, toupper() is defined as extern inline or as a macro if __NO_CTYPE is undefined, but is only declared (not defined) otherwise. As another example, is_alnum_l and many similar methods will be defined as macros if __NO_CTYPE is undefined, but otherwise will not.

On the other hand, if you include stdlib.h (or many other files such as ) in a C++ compile, the C++ "override" file include/c++/10.3.0/stdlib.h gets included, which ultimately ends up including x86_64-linux-gnu/bits/os_defines.h, which defines __NO_CTYPE. If ctype.h is subsequently included, its effect is different, as described above.

I suppose this is an ODR violation in one way or another (e.g., if two files are included in the same program with and without __NO_CTYPE), and it can also have a significant impact on performance as described here:

https://travisdowns.github.io/blog/2019/11/19/toupper.html

Evidently, the behavior and definitions exposed by these headers should not depend on the order of include. I suspect there are other cases besides __NO_CTYPE, as long as headers like ctype.h exist that do not trigger the C++ header include chain.

You can play with this example on godbolt:

https://godbolt.org/z/vY4EnE51z

Try swapping the order of the ctype and stdlib includes to see the effect. The int variables are canaries so you can see which macros were defined in the preprocessed output.

This is the same as glibc bug #25214, but I was advised over there that this should be filed against libstdc++ instead. https://sourceware.org/bugzilla/show_bug.cgi?id=25214
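
The canary idea from the report can be sketched roughly as follows. This is not the code behind the godbolt link, only an illustration under assumptions: the names isalnum_l_is_macro, no_ctype_defined and upper_ascii are hypothetical, and the macro probed is glibc's isalnum_l, on the assumption that this is the kind of macro the report refers to. On a glibc and libstdc++ system the first canary would be expected to change when the two includes are swapped; on other platforms it may simply stay 0.

```cpp
// Include ctype.h first; swap the next two lines to compare the canaries.
#include <ctype.h>   // seen before any libstdc++ header had a chance to define __NO_CTYPE
#include <stdlib.h>  // in a C++ compile this may go through libstdc++'s wrapper header

// Canary: 1 if ctype.h left isalnum_l defined as a macro (hypothetical name).
#ifdef isalnum_l
constexpr int isalnum_l_is_macro = 1;
#else
constexpr int isalnum_l_is_macro = 0;
#endif

// Canary: 1 if __NO_CTYPE is defined once both headers have been processed.
#ifdef __NO_CTYPE
constexpr int no_ctype_defined = 1;
#else
constexpr int no_ctype_defined = 0;
#endif

// Hypothetical helper: upper-cases one character through toupper().
// The cast to unsigned char avoids undefined behavior for negative char values.
inline char upper_ascii(char c) {
    return static_cast<char>(toupper(static_cast<unsigned char>(c)));
}
```

The result of upper_ascii() should be the same in either include order; what may differ is whether the call is inlined, which is the performance effect the report describes. Running the preprocessor alone (g++ -E) is one way to inspect which branch each canary took.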
[Bug c++/95400] New: -march=native and -march=icelake-client produce different results on icelake client
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=95400

Bug ID: 95400
Summary: -march=native and -march=icelake-client produce different results on icelake client
Product: gcc
Version: 9.2.1
Status: UNCONFIRMED
Severity: normal
Priority: P3
Component: c++
Assignee: unassigned at gcc dot gnu.org
Reporter: travis.downs at gmail dot com
Target Milestone: ---

On an Ice Lake client machine, using -O3 -march=native produces 512-bit AVX-512 instructions, whereas -O3 -march=icelake-client produces 256-bit instructions.

Since this machine *is* Ice Lake client, I would expect both options to do the same thing.
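
The report does not include a test case, so the following is only a sketch of the kind of loop that could be used to compare the two settings. The function add_arrays is hypothetical. Compiling it once with -O3 -march=native -S and once with -O3 -march=icelake-client -S and checking whether the loop uses zmm (512-bit) or ymm (256-bit) registers should show the difference, assuming the loop is auto-vectorized at all.

```cpp
#include <cstddef>

// Hypothetical kernel: element-wise addition, simple enough that GCC
// would normally auto-vectorize it at -O3.
// __restrict is a compiler extension telling GCC the arrays do not overlap.
void add_arrays(float* __restrict out,
                const float* __restrict a,
                const float* __restrict b,
                std::size_t n) {
    for (std::size_t i = 0; i < n; ++i) {
        out[i] = a[i] + b[i];
    }
}
```

Comparing the output of g++ -Q --help=target under each -march value may help narrow down which resolved option differs. GCC also has a -mprefer-vector-width= option that could be worth checking in that output, though whether it explains this report is an assumption.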
[Bug c++/92440] New: Error output for first error truncated with -fmax-errors=1
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=92440

Bug ID: 92440
Summary: Error output for first error truncated with -fmax-errors=1
Product: gcc
Version: 9.2.0
Status: UNCONFIRMED
Severity: normal
Priority: P3
Component: c++
Assignee: unassigned at gcc dot gnu.org
Reporter: travis.downs at gmail dot com
Target Milestone: ---

Consider the following code snippet:

template <int T>
struct S {
    template <class U>
    friend struct S;
};

S<0> s;

Compiled with gcc trunk 10.0.0 and any earlier version I tried, it produces the following error without any explicit command line flags:

<source>: In instantiation of 'struct S<0>':
<source>:7:6:   required from here
<source>:1:15: error: template parameter 'int T'
    1 | template <int T>
      |               ^
<source>:4:19: error: redeclared here as 'class U'
    4 |     friend struct S;
      |                   ^
Compiler returned: 1

That's logically a single error. With -fmax-errors=1 the error is truncated in the middle of a "sentence" with only the first part visible, which prevents understanding the error:

<source>: In instantiation of 'struct S<0>':
<source>:7:6:   required from here
<source>:1:15: error: template parameter 'int T'
    1 | template <int T>
      |               ^
compilation terminated due to -fmax-errors=1.
Compiler returned: 1

Godbolt link: https://godbolt.org/z/bX2z4f
[Bug target/62011] False Data Dependency in popcnt instruction
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=62011

--- Comment #16 from Travis Downs ---
Also, this is fixed for Skylake for tzcnt and lzcnt but not popcnt.
[Bug target/62011] False Data Dependency in popcnt instruction
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=62011

Travis Downs changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |travis.downs at gmail dot com

--- Comment #15 from Travis Downs ---
For what it's worth, and because Richard asked for it above, there is an Intel erratum for this, at least as of Haswell: for example HSD146, "POPCNT Instruction May Take Longer to Execute Than Expected". It mentions only popcnt, and I found it for Haswell, Skylake (SKL029) and Broadwell. The text is:

POPCNT Instruction May Take Longer to Execute Than Expected

Problem: POPCNT instruction execution with a 32 or 64 bit operand may be delayed until previous non-dependent instructions have executed.

Implication: Software using the POPCNT instruction may experience lower performance than expected.

Workaround: None identified
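
For context, a loop of the following shape is the sort of code where the delay described in the erratum could become visible. This is a minimal sketch, not the test case from the bug: count_bits is a hypothetical name, and it assumes GCC's __builtin_popcountll is compiled to a popcnt instruction (for example with -mpopcnt or a suitable -march).

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical kernel: counts the set bits across a buffer of 64-bit words.
// Each iteration issues one population count; if the compiler keeps reusing
// the same destination register, the false dependency named in the bug title
// could serialize otherwise independent popcnt instructions.
std::uint64_t count_bits(const std::uint64_t* data, std::size_t n) {
    std::uint64_t total = 0;
    for (std::size_t i = 0; i < n; ++i) {
        total += static_cast<std::uint64_t>(__builtin_popcountll(data[i]));
    }
    return total;
}
```

Inspecting the generated assembly (g++ -O3 -mpopcnt -S) shows whether the compiler zeroes the destination register before each popcnt, which is a commonly used way for a compiler to break such a dependency.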