[Bug target/38185] -fstrict-aliasing causes wrong register usage
--- Comment #2 from ddenisen at altera dot com 2008-11-20 14:57 ---
I searched through all the options in -O2 that are not in -O1 and found that only one triggers the problem: -fstrict-aliasing. To summarize, g++ -m32 -O1 a.ii does not cause the problem but g++ -m32 -O1 -fstrict-aliasing a.ii does. Changing the title of the bug to reflect this information.

--
ddenisen at altera dot com changed:

What    |Removed                                        |Added
Summary |Wrong register used to get struct information  |-fstrict-aliasing causes wrong register usage

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=38185
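For readers unfamiliar with the flag isolated above: -fstrict-aliasing lets the compiler assume that an object is not accessed through a pointer to an unrelated type. The sketch below is only an illustration of that rule; it is not taken from a.ii, and nothing in this thread establishes whether a.ii contains an aliasing violation or whether the compiler itself is at fault. The helper name float_bits is hypothetical.

```cpp
#include <cassert>   // used by the check in the usage note
#include <cstdint>
#include <cstring>

// Hypothetical helper, not from a.ii: returns the object representation of a
// float as a 32-bit integer without violating the aliasing rules.
inline std::uint32_t float_bits(float f) {
    static_assert(sizeof(std::uint32_t) == sizeof(float), "assumes a 32-bit float");
    std::uint32_t bits;
    // memcpy copies bytes. Reading *(std::uint32_t*)&f instead would access a
    // float through an unrelated type, which the compiler may assume never
    // happens when -fstrict-aliasing is in effect.
    std::memcpy(&bits, &f, sizeof bits);
    return bits;
}
```

On a platform with IEEE 754 single-precision floats, assert(float_bits(1.0f) == 0x3f800000u) holds at any optimization level. Compiling with -fno-strict-aliasing disables the assumption altogether, which may be a narrower workaround than dropping from -O2 to -O1.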
[Bug target/38185] -fstrict-aliasing causes wrong register usage
--- Comment #3 from ddenisen at altera dot com 2008-11-20 15:10 --- This could be a duplicate of 35643. -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=38185
[Bug c++/38185] New: Wrong register used to get struct information
Setup:
gcc version: 4.2.3
system: Linux RedHat 4, x86-64 CPU, kernel 2.6.9-67.ELsmp

To reproduce: Compile the attached program with g++ -m32 -O2 -g a.ii and run a.out. You'll see an assert failure on line 11669 of a.ii. If you use -O1 instead of -O2, the program passes. I verified that the correct behaviour for the program is to pass.

When I traced the execution in the debugger, I found that the wrong register (edx instead of eax) is used to access a variable. It seems GCC has lost track of what the edx register refers to when it calls get_r_or_f() for the second time.

On a high level, the program iterates over a two-element linked list. It looks at the first element, decides to increment the iterator, then looks at the second element. In the correct behaviour, the iterator (B::list_iter) is left pointing to the second element. In the buggy behaviour, the iterator is incremented twice and left pointing to the end iterator. In the buggy version, while looking at the content of the second list element, the program uses the wrong register to access one of the values (bsi.m_type on line 11627). Instead of getting the second list element's m_type, it gets it from the first!

This bug reproduces exactly on 4.2.1. It does not reproduce on 4.3.0. However, the assembly code on 4.3.0 looks very different from 4.2.3. This bug shows up and disappears with the tiniest unrelated changes to the code. E.g. printing whether list_iter == end_iter at the beginning of is_legal_position() makes the bug go away.

Please let me know if there is a work-around for the issue with gcc 4.2.3. I'm currently using -O1 and don't mind using it if I'm sure that the bug is caused by a -O2-specific optimization (and that -O1 does not simply re-jig the code so that the bug is hidden).

--
Summary: Wrong register used to get struct information
Product: gcc
Version: 4.2.3
Status: UNCONFIRMED
Severity: normal
Priority: P3
Component: c++
AssignedTo: unassigned at gcc dot gnu dot org
ReportedBy: ddenisen at altera dot com

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=38185
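The expected behaviour described in this report can be sketched roughly as follows. This is not the code from a.ii: the Item struct and the walk_two function are hypothetical stand-ins, and only the member name m_type comes from the report.

```cpp
#include <cassert>
#include <list>

// Hypothetical stand-in for the list element; only m_type is named in the report.
struct Item {
    int m_type;
};

// Looks at the first element, advances the iterator exactly once, then reads
// the second element's m_type through the iterator. Returns the iterator's
// final position, which should refer to the second element, not to end().
inline std::list<Item>::const_iterator walk_two(const std::list<Item>& items,
                                                int& second_type) {
    assert(items.size() == 2);
    std::list<Item>::const_iterator it = items.begin();
    int first_type = it->m_type;  // inspect the first element
    (void)first_type;
    ++it;                         // single increment
    second_type = it->m_type;     // must be read from the second element
    return it;
}
```

With a list holding elements of m_type 1 and 2, a correct build leaves second_type equal to 2 and the returned iterator one step before end(). The failure described in the report corresponds to reading the first element's m_type here and then advancing a second time.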
[Bug c++/38185] Wrong register used to get struct information
--- Comment #1 from ddenisen at altera dot com 2008-11-19 23:57 --- Created an attachment (id=16726) --> (http://gcc.gnu.org/bugzilla/attachment.cgi?id=16726&action=view) Compile with g++ -m32 -O2 a.ii to reproduce the crash. The source code that shows the problem. -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=38185
[Bug c++/35499] Symbol resolution optimization should be separately controlled from -fPIC
--- Comment #4 from ddenisen at altera dot com 2008-06-27 20:34 ---
Here's the answer to my question (in case somebody else has the same problem): You have to use -fPIC for compiling the executable itself (but not the shared objects) to fix the symbol resolution problem described here. I was originally omitting -fPIC for both SOs and EXE.

And here's some background info about omitting -fPIC:
- Not using -fPIC only works on the x86 architecture (and not on x86_64) because x86 allows code pages to be both writable and executable. Not using -fPIC works because the linker creates text relocations (http://people.redhat.com/drepper/textrelocs.html) all over the code, which the dynamic loader fixes up when the SO/EXE is loaded.
- Having code pages writable and executable is a huge security threat, so don't create text relocations if you care about the security of your app.
- The x86_64 arch allows RIP-relative addressing, making position-independent code virtually free. So don't worry about using PIC for 64-bit apps.

-- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=35499
[Bug c++/35500] Documentation for -fPIC/-fpic/-fpie is not clear
--- Comment #6 from ddenisen at altera dot com 2008-03-10 14:14 ---
Thank you everybody for the feedback. I'm setting the bug to fixed.

--
ddenisen at altera dot com changed:

What       |Removed      |Added
Status     |UNCONFIRMED  |RESOLVED
Resolution |             |FIXED

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=35500
[Bug c++/35499] New: Symbol resolution optimization should be separately controlled from -fPIC
I'm working on a large non-UI C++ program on Linux. It consists of hundreds of SOs and takes a long time to run (hours). I'm seeing about a 10% speedup if I do *not* use -fPIC for some SOs (the executable never uses -fPIC or -fPIE) and, since run-time is very important to me, I want to not use -fPIC for these SOs.

I noticed that besides not generating position-independent code, the way global symbols are resolved also changes (a symbol that used to be resolved to a SO where it is defined with -fPIC is resolved to the executable when I omit -fPIC for that SO). That caused a breakage which is not hard to fix once I know which symbol is affected. The problem is that the program has thousands of other symbols and I don't know, without exhaustive testing, if any other symbol will get broken.

It would be really nice if symbol resolution and position-independence were controlled by separate flags.

--
Summary: Symbol resolution optimization should be separately controlled from -fPIC
Product: gcc
Version: 4.2.1
Status: UNCONFIRMED
Severity: normal
Priority: P3
Component: c++
AssignedTo: unassigned at gcc dot gnu dot org
ReportedBy: ddenisen at altera dot com

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=35499
[Bug c++/35500] New: Documentation for -fPIC/-fpic/-fpie is not clear
Documentation for -fpic and family is not clear. Here are the points I would like clarified:
- Is PIC required for SOs (No)?
- What are the upsides/downsides of using PIC (faster SO loading vs run-time hit, others?)?
- Why would anybody want to create a PIE?

--
Summary: Documentation for -fPIC/-fpic/-fpie is not clear
Product: gcc
Version: 4.2.1
Status: UNCONFIRMED
Severity: normal
Priority: P3
Component: c++
AssignedTo: unassigned at gcc dot gnu dot org
ReportedBy: ddenisen at altera dot com

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=35500
[Bug c++/35499] Symbol resolution optimization should be separately controlled from -fPIC
--- Comment #3 from ddenisen at altera dot com 2008-03-07 19:43 --- I did read "How to Write Shared Libraries" and re-read the PIC section a couple of times. Could you please clarify what I am missing here? If symbol resolution is already controlled separately, what's the flag? I couldn't find anything in the documentation. -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=35499
[Bug c++/35500] Documentation for -fPIC/-fpic/-fpie is not clear
--- Comment #2 from ddenisen at altera dot com 2008-03-07 19:45 --- But DSOs still work if I don't use PIC. Why is that? Why would anybody want to create a position-independent executable? What are the advantages and disadvantages? -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=35500
[Bug c++/35500] Documentation for -fPIC/-fpic/-fpie is not clear
--- Comment #4 from ddenisen at altera dot com 2008-03-07 20:11 ---
I am still learning about linking and loading and I can't guess why non-PIC DSOs would work on x86 but not on x86_64. Could you please explain briefly? This is all very useful information that I couldn't find anywhere else (I guess I could always look at gcc code :) ).

Can the following be added to the documentation?

(for -fpic): PIC is required for DSOs on the x86_64 platform but not on x86.

(for -fpie): One example of using -fpie is security: the text section of a position-independent executable can be located at a different address for each invocation.

-- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=35500
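The claim about the text section moving between invocations can be observed with a small probe like the one below. It is a sketch under assumptions: the names probe and text_address are hypothetical, the function-pointer-to-integer cast is accepted by GCC but only conditionally supported by the C++ standard, and the address only varies when the program is built as a PIE (for example with -fpie -pie) on a system that randomizes load addresses.

```cpp
#include <cassert>   // used by the checks suggested in the usage note
#include <cstdint>

// Hypothetical probe function; its machine code lives in the text section.
void probe() {}

// Returns the run-time address of probe() as an integer.
std::uintptr_t text_address() {
    return reinterpret_cast<std::uintptr_t>(&probe);
}
```

Printing text_address() from main and running the program several times shows whether the text section is being relocated. Within a single run the value is stable, so assert(text_address() == text_address()) always holds.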
[Bug other/35042] New: Documentation for -finline-limit is incorrect
Documentation states that the default value of -finline-limit is 600. However, if -finline-limit=600 is actually used with -O3, the code size is much bigger than with -O3 alone. The real default value seems to be closer to 180. The default values specified for max-inline-insns-single (450) and max-inline-insns-auto (90) are not consistent with each being -finline-limit / 2.

This is important to fix for people who want to play around with the -finline-limit value -- it's good to know what your base is before you change it.

--
Summary: Documentation for -finline-limit is incorrect
Product: gcc
Version: 4.2.2
Status: UNCONFIRMED
Severity: trivial
Priority: P3
Component: other
AssignedTo: unassigned at gcc dot gnu dot org
ReportedBy: ddenisen at altera dot com

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=35042
[Bug other/35042] Documentation for -finline-limit is incorrect
--- Comment #5 from ddenisen at altera dot com 2008-01-31 16:13 ---
> @emph{Note:} there may be no value to @option{-finline-limit} that results in default behavior.

That's also not user-friendly. When it is changed, it is not clear what is more aggressive inlining and what is not. Why don't you make max-inline-insns-single = 5*n/2 and max-inline-insns-auto = n/2? This way, we have a default that is consistent with current settings and setting -finline-limit=180 gives you exactly the default.

-- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=35042
[Bug other/35042] Documentation for -finline-limit is incorrect
--- Comment #2 from ddenisen at altera dot com 2008-01-31 15:59 ---
(In reply to comment #1)
> -finline-limit=N should be deprecated. It is an alias for --param max-inline-insns-single=N/2 --param max-inline-insns-auto=N/2.
> There is no real default, instead the defaults for max-inline-insns-single is 450, the one for max-inline-insns-auto is 90.

Having a single knob to control inlining is more user-friendly. If possible, consider keeping it (GCC already has way too many options and parameters to control).

-- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=35042
[Bug other/35042] Documentation for -finline-limit is incorrect
--- Comment #4 from ddenisen at altera dot com 2008-01-31 16:12 ---
> @emph{Note:} there may be no value to @option{-finline-limit} that results in default behavior.

That's also not user-friendly. When it is changed, it is not clear what is more aggressive inlining and what is not. Why don't you make max-inline-insns-single = 5*n/2 and max-inline-insns-auto = n/2? This way, we have a default that is consistent with current settings and setting -finline-limit=180 gives you exactly the default.

-- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=35042
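The arithmetic behind this proposal can be checked with a short sketch. It only restates the numbers given in this thread (both parameters set to N/2 today, defaults of 450 and 90, proposed 5*n/2 and n/2). The struct and function names are hypothetical and are not GCC internals.

```cpp
#include <cassert>

// The two --param values discussed in the thread.
struct InlineParams {
    int max_inline_insns_single;
    int max_inline_insns_auto;
};

// Mapping described in comment #1: -finline-limit=N sets both params to N/2.
inline InlineParams current_mapping(int n) {
    assert(n >= 0);
    return InlineParams{n / 2, n / 2};
}

// Mapping proposed in the thread: single = 5*n/2, auto = n/2.
inline InlineParams proposed_mapping(int n) {
    assert(n >= 0);
    return InlineParams{5 * n / 2, n / 2};
}
```

Under the proposed mapping, n = 180 yields 450 and 90, which matches the stated defaults. Under the current mapping no single N can produce both 450 and 90, because the two parameters always come out equal (N = 600 gives 300 and 300).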
[Bug other/35042] Documentation for -finline-limit is incorrect
--- Comment #7 from ddenisen at altera dot com 2008-01-31 16:40 --- If the default behaviour has to stay, then I think the option should be removed. Having no option is better than having an option with an unreproducible default. -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=35042