Re: [AVR] remove two maintainers
On 03/03/14 11:35, David Brown wrote:
> On 02/03/14 19:24, Denis Chertykov wrote:
>> I would like to remove two maintainers for the AVR port:
>>
>> 1. Anatoly Sokolov ae...@post.ru
>> 2. Eric Weddington eric.wedding...@atmel.com
>>
>> I have discussed the removal with Anatoly Sokolov and he agrees with it. I can't discuss the removal with Eric Weddington because his mail address is invalid.
>>
>> Must somebody approve the removal? (Or can I just apply it?)
>>
>> Denis.
>
> Eric Weddington has left Atmel, so his address will no longer be valid. I don't know if he still has time to work with AVRs, or if he would still be able to be a maintainer for the AVR port. But I am pretty sure that his new job will not involve AVRs significantly, so it would only be as a hobby (or at best, as a normal avr gcc user).
>
> Atmel includes gcc in their development tool (AVR Studio), as well as providing pre-built packages (for Windows and Linux) with the avr-libc library and related tools, using snapshots of mainline gcc with a few patches (for things like support of newer devices). So it seems reasonable to expect that they will be interested in the development and maintenance of the avr port of gcc even though Eric has now left them.
>
> If you would like, I can try to contact Atmel and ask if they have someone who would like to take Eric's seat as a port maintainer (or you could do so yourself from Atmel's website).

Hi David,

Joern Rennecke is working with Atmel on the AVR tool chain. You'll see he has submitted quite a large number of AVR related patches in the last year.

I've discussed with Atmel's teams in Trondheim and Chennai and they are very supportive of Joern Rennecke being added as a maintainer.

Best wishes,

Jeremy

-- Tel: +44 (1590) 610184 Cell: +44 (7970) 676050 SkypeID: jeremybennett Twitter: @jeremypbennett Email: jeremy.benn...@embecosm.com Web: www.embecosm.com
Re: linux says it is a bug
On Wed, Mar 05, 2014 at 10:39:51AM +0400, Yury Gribov wrote:
>> What are volatile instructions? Can you give us an example?
>
> Check volatile_insn_p. AFAIK there are two classes of volatile instructions:
> * volatile asm
> * unspec volatiles (target-specific instructions for e.g. protecting function prologues)
>
> -Y

Thanks.
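For the first class Yury mentions, a source-level example may help. The sketch below is GNU C, not ISO C; the names `shared` and `store_then_load` are made up for illustration. It shows an empty volatile asm with a "memory" clobber, which is commonly used as a compiler barrier. The second class (unspec volatiles) is generated by a target's machine description and has no direct C spelling, so it is not shown.

```c
/* Illustrative only: `shared` and `store_then_load` are hypothetical names. */
static int shared;

int store_then_load(int v)
{
    shared = v;
    /* A volatile asm: the compiler may not delete it, and the "memory"
       clobber tells it that memory may have changed, so `shared` has to
       be stored before this point and reloaded after it.  (An asm with
       no output operands is treated as volatile even without the
       keyword; it is spelled out here for clarity.) */
    __asm__ volatile ("" ::: "memory");
    return shared;
}
```

Compiling this with `-O2 -S` and then again with the asm line removed is one way to see the effect: without the barrier the compiler is free to return `v` directly instead of reloading `shared`.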
dom requires PROP_loops
Hello, In an attempt to test some optimization I destroyed the loop property in pass_tree_loop_done and reinstated it in pass_rtl_loop_init, however then I noticed that pass_dominator started generating wrong code. My guess is that we should mark pass_dominator with PROP_loops as a required property? Do you agree? Cheers, Paulo Matos
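To make the question concrete, here is a toy model of how "required", "provided" and "destroyed" properties interact. It is a sketch under assumptions, not GCC's pass manager: the `toy_` names and the two property bits are invented for illustration and only loosely mirror GCC's PROP_* flags.

```c
#include <stdio.h>

/* Hypothetical property bits, loosely modelled on GCC's PROP_* flags. */
enum { TOY_PROP_cfg = 1 << 0, TOY_PROP_loops = 1 << 1 };

struct toy_pass {
    const char *name;
    unsigned required;   /* must hold before the pass runs  */
    unsigned provided;   /* becomes true after the pass ran */
    unsigned destroyed;  /* becomes false after the pass ran */
};

/* Run PASS against the property set *PROPS.
   Returns 0 on success, -1 if a required property is missing. */
int toy_run_pass(const struct toy_pass *pass, unsigned *props)
{
    if ((*props & pass->required) != pass->required) {
        fprintf(stderr, "%s: missing required property\n", pass->name);
        return -1;
    }
    *props = (*props | pass->provided) & ~pass->destroyed;
    return 0;
}
```

In this model, a pass that destroys `TOY_PROP_loops` followed by a pass that lists it as required is rejected up front, rather than silently producing wrong code, which is the kind of safety net the message is asking about.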
Re: [AVR] remove two maintainers
On 10/03/14 11:29, Jeremy Bennett wrote:
> Hi David,
>
> Joern Rennecke is working with Atmel on the AVR tool chain. You'll see he has submitted quite a large number of AVR related patches in the last year.
>
> I've discussed with Atmel's teams in Trondheim and Chennai and they are very supportive of Joern Rennecke being added as a maintainer.
>
> Best wishes,
>
> Jeremy

Hi,

It is up to Denis and Joern (or Jørn, as I expect he spells it) to agree on maintainer status - I was just trying to provide a little helpful information.

I haven't seen Jørn posting to the avr-gcc-l...@nongnu.org list, where many avr-gcc users and developers hang out, but I am very happy to see that Atmel is supporting him in his work on gcc.

mvh.,

David
status of current_pass (notably in gates) .... [possible bug in 4.9]
Hello All,

I am a bit confused (or unhappy) about the current_pass variable (in GCC 4.9 svn rev.208447); I believe we have some incoherency about it.

It is generally (as it used to be in previous versions of GCC) a global pointer to some opt_pass, declared in gcc/tree-pass.h line 590. It is also (and independently) a local integer in function connect_traces, file gcc/bb-reorder.c line 1042. I feel that for readability reasons the local current_pass should be renamed current_pass_num in the function connect_traces.

But most importantly, I find the way the current_pass pointer is globally set (and reset) confusing. The obvious policy seems to be to set current_pass to this before calling any virtual methods on it (notably the gate and the exec functions).

However, if one uses the -fdump-passes program argument to gcc (i.e. to cc1), then dump_passes (from gcc/passes.c line 892) gets called. It then calls function dump_one_pass (from gcc/passes.c line 851), which does, on line 857,

    is_on = pass->has_gate ? pass->gate () : true;

without current_pass having been set. But on other occasions, notably in function execute_one_pass (starting at gcc/passes.c line 2153), the global current_pass is set (line 2166) before calling its gate function on line 2170:

    gate_status = pass->has_gate ? pass->gate () : true;

I believe something should be done about this, since it seems to confuse plugins (like MELT). Either we decide that current_pass is always set before calling any virtual function on it (notably the gate), or we decide that the current_pass global should disappear (but then, what about the curr_statistics_hash function from gcc/statistics.c line 93, which uses it on line 98)?

Comments are welcome. I think we should do something about this before releasing GCC 4.9... The simplest thing would be to set current_pass in dump_one_pass.

Regards.
-- Basile STARYNKEVITCH http://starynkevitch.net/Basile/ email: basileatstarynkevitchdotnet mobile: +33 6 8501 2359 8, rue de la Faiencerie, 92340 Bourg La Reine, France *** opinions {are only mines, sont seulement les miennes} ***
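The inconsistency described above can be reproduced in miniature. The following is a toy model in plain C, not GCC code: `toy_current_pass`, `plugin_gate` and the two caller functions are hypothetical stand-ins for the global current_pass, a plugin-supplied gate, and the two call orderings the message contrasts.

```c
#include <stdbool.h>
#include <stddef.h>

struct toy_pass {
    const char *name;
    bool (*gate)(const struct toy_pass *self);
};

/* Stands in for the global current_pass pointer. */
static const struct toy_pass *toy_current_pass = NULL;

/* A gate, as a plugin might write it, that assumes the global already
   points at the pass being queried. */
static bool plugin_gate(const struct toy_pass *self)
{
    return toy_current_pass == self;
}

/* Ordering described for execute_one_pass: set the global, then ask the
   gate.  Restoring the old value afterwards is a choice of this sketch. */
bool gate_with_global_set(const struct toy_pass *pass)
{
    const struct toy_pass *saved = toy_current_pass;
    toy_current_pass = pass;
    bool on = pass->gate ? pass->gate(pass) : true;
    toy_current_pass = saved;
    return on;
}

/* Ordering described for dump_one_pass: ask the gate directly. */
bool gate_without_global(const struct toy_pass *pass)
{
    return pass->gate ? pass->gate(pass) : true;
}
```

With the same pass object, the first caller reports the pass as enabled and the second reports it as disabled, which is the kind of divergence a gate that consults the global would see between normal execution and -fdump-passes.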
GNU C extension: Function Error vs. Success
Hi,

First, let me say that I'm not subscribed to the mailing list, so please CC me when responding.

This post is to discuss a possible extension to the GNU C language. Note that this is still an idea and not refined.

Background
==========

In C, the following code structure is ubiquitous:

    return_value = function_call(arguments);
    if (return_value == ERROR_VALUE)
        goto exit_fail;

You can take a look at goto usages in the Linux kernel just for examples (https://github.com/torvalds/linux/search?q=goto).

However, this method has one particular drawback, besides verbosity among others. This drawback is that each function has to designate (at least) one special value as ERROR_VALUE. Trivial as it may seem, this has by itself resulted in many inconsistencies and problems. For example, `malloc` signals failure by returning `NULL`, `strtod` may return 0, `HUGE_VAL*` etc, `fread` returns 0 which is not necessarily an error case either, `fgetc` returns `EOF`, `remove` returns nonzero if failed, `clock` returns -1 and so on.

Sometimes such a special value may not even be possible, in which case a workaround is required (pass the return value back through a pointer argument and return the state of success).

The following suggestion allows clearer and shorter error handling.

The Extension (Basic)
=====================

First, let's introduce a new syntax (note again, this is just an example. I don't suggest these particular symbols):

    float inverse(int x)
    {
        if (x == 0)
            fail;
        return 1.0f / x;
    }

    ...

    y = inverse(x) !! goto exit_inverse_failed;

The semantics of this syntax would be as follows. The function `inverse` can choose to `fail` instead of `return`, in which case it doesn't actually return anything. At the call site, this failure is signaled (speculations on details below), `y` is not assigned and a `goto exit_inverse_failed` is executed.

The observed behavior would be equivalent to:

    int inverse(int x, float *y)
    {
        if (x == 0)
            return -1;
        *y = 1.0f / x;
        return 0;
    }

    ...

    if (inverse(x, &y))
        goto exit_inverse_failed;

The Extension (Advanced)
========================

Sometimes, error handling is done not just by a single `goto` (although they can all be reduced to this). For example:

    return_value = function_call(arguments);
    if (return_value == ERROR_VALUE) {
        /* a small piece of code, such as printing an error */
        goto exit_fail;
    }

This could be shortened as:

    return_value = function_call(arguments) !! {
        /* a small piece of code, such as printing an error */
        goto exit_fail;
    }

A generic syntax could therefore be used:

    return_value = function_call(arguments) !! goto exit_fail;
    return_value = function_call(arguments) !! fail;
    return_value = function_call(arguments) !! return 0;
    return_value = function_call(arguments) !! {
        /* more than one statement */
    }

Another necessity is the error code. While `errno` is usable, it's not the best solution in the world. Extending the syntax further, the following could be used (again, syntax is just for the sake of example, I'm not suggesting these particular symbols):

    float inverse(int x)
    {
        if (x == 0)
            fail EDOM;
        return 1.0f / x;
    }

    ...

    y = inverse(x) !!= error_code !! goto exit_inverse_failed;

By this, the function `inverse` can `fail` with an error code (again, speculations on details below), which can be stored in a variable (`error_code`) at the call site.

Some Details
============

The state of failure and success as well as the failure code can be kept in registers, to keep the ABI backward-compatible.

If backward compatibility is required, a `fail`able function must still provide a fail value (simply to keep older code intact), which could have a syntax as follows (for example):

    float inverse(int x) !! 0
    {
        if (x == 0)
            fail EDOM;
        return 1.0f / x;
    }

    ...

    y = inverse(x);

In this example, the caller doesn't check for failure and would receive the fail value indicated by the function signature. If no such fail value is given, the caller must check for failure.

This allows older code, such as the standard library, to be used either in the way it has always been (by providing a fail value) or with this extension, while allowing cleaner and more robust code to be written (by not providing a fail value).

Examples
========

Here are some examples. Opening a file and reading a number (normal C):

    int n;
    FILE *fin = fopen(filename, "r");
    if (fin == NULL)
        goto exit_no_file;
    if (fscanf(fin, "%d", &n) != 1)
        if (ferror(fin))
            goto exit_io_error;
        else {
            /* complain about format */
        }
    fclose(fin);
    return 0;
    exit_io_error:
        /* print error: I/O error */
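For reference, the "equivalent to" form from the basic extension can be written as complete ISO C today. This is a minimal sketch: `inverse` follows the out-parameter version given in the message, while the caller `print_inverse` and its return codes are assumptions added here for illustration.

```c
#include <stdio.h>

/* Status-returning form of inverse(): 0 on success, -1 on failure.
   The result is delivered through the pointer argument. */
int inverse(int x, float *y)
{
    if (x == 0)
        return -1;
    *y = 1.0f / x;
    return 0;
}

/* Hypothetical caller: returns 0 on success, 1 if inverse() failed. */
int print_inverse(int x)
{
    float y;

    if (inverse(x, &y))
        goto exit_inverse_failed;
    printf("1/%d = %f\n", x, y);
    return 0;

exit_inverse_failed:
    fprintf(stderr, "cannot invert %d\n", x);
    return 1;
}
```

The proposed `y = inverse(x) !! goto exit_inverse_failed;` would collapse the call, the check and the jump in `print_inverse` into one line, and would let `inverse` keep its natural `float` return type.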
Re: GNU C extension: Function Error vs. Success
On Mon, 10 Mar 2014 15:27:06 +0100 Shahbaz Youssefi shab...@gmail.com wrote:
> Feedback
>
> Please let me know what you think. In particular, what would be the limitations of such a syntax? Would you be interested in seeing this extension to the GNU C language? What alternative symbols do you think would better show the intention/simplify parsing/look more beautiful?

I suggest you think about how this is better than C++ exceptions, and also consider alternatives like OCaml's option types that can be used to achieve similar ends.

For your suggested syntax at function call sites, consider that functions can be called in more complicated ways than simply as

    bar = foo();

statements, and the part following the !! in your examples appears to be a statement itself: in more complicated expressions, that interleaving of expressions and statements is going to get very ugly very quickly. E.g.:

    x = foo() + bar();

would need to become something like:

    x = (foo() !! goto label1) + (bar () !! goto label2);

And there are all sorts of issues with that.

Anyway, I quite like the idea of rationalising error-code returns in C code, but I don't think this is the right way of going about it.

HTH,

Julian
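For comparison, this is roughly what a checked `x = foo() + bar();` looks like in ISO C when both calls can fail. It is a sketch only: the bodies of `foo` and `bar`, their extra `in` parameter and the failure conditions are invented so that the example is complete.

```c
/* Hypothetical fallible functions: 0 on success, nonzero on failure. */
int foo(int in, int *out)
{
    if (in < 0)
        return -1;
    *out = in * 2;
    return 0;
}

int bar(int in, int *out)
{
    if (in == 0)
        return -1;
    *out = 100 / in;
    return 0;
}

/* Checked version of `*x = foo(in) + bar(in)`: the single expression has
   to be split into two statements with temporaries so that each call
   can be tested separately. */
int sum_foo_bar(int in, int *x)
{
    int a, b;

    if (foo(in, &a))
        goto foo_failed;
    if (bar(in, &b))
        goto bar_failed;
    *x = a + b;
    return 0;

bar_failed:
    /* undo whatever foo() acquired, if anything */
foo_failed:
    return -1;
}
```

The need to break the expression apart is the point Julian raises: once error paths are attached to sub-expressions, either the expression is split like this or statements end up nested inside expressions.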
Re: GNU C extension: Function Error vs. Success
Hi Julian,

Thanks for the feedback.

Regarding C++ exceptions: exceptions are not really nice. They can just make your function return without you even knowing it (forgetting a `try/catch` or not knowing it may be needed, which is C++'s fault and probably could have been done better). Also, they require complicated operations. You can read a small complaint about it here: http://stackoverflow.com/a/1746368/912144 and I'm sure there are many others on the internet.

Regarding OCaml option types: in fact I did think of this separation after learning about monads (through Haskell though). However, personally I don't see how monads could be introduced in C without majorly affecting syntax and ABI.

Regarding the syntax: You are absolutely right. I don't claim that this particular syntax is ideal. I'm sure the many minds in this mailing list are able to find a more beautiful syntax, if they are interested in the idea. Nevertheless, the following code:

    x = foo() + bar();

doesn't do any error checking. I.e. it assumes `foo` and `bar` are unfailable. If that is the case, there is no need for a `!! goto fail_label` at all. I personally have never seen such an expression followed by e.g.

    if (x ) goto foo_or_bar_failed;

On the other hand, something like this is common:

    while (func(...) == 0)

which, if turned to e.g.:

    while (func(...) !! break)

or

    fail_code = 0;
    while (func(...) !!= fail_code, fail_code == 0)

could seem awkward at best.

My hope is that through this discussion, we will be able to figure out a way to separate success and failure of functions with minimal change to the language. My syntax is based on keeping the return value intact while returning the success/failure state and the error code in registers, both for speed and compatibility, and letting the compiler generate the repetitive/ugly error-checking code. Other than that, I personally don't have any attachment to the particular way it's embedded in the grammar of GNU C.
Re: GNU C extension: Function Error vs. Success
On Mon, Mar 10, 2014 at 03:27:06PM +0100, Shahbaz Youssefi wrote:
> Hi,
>
> First, let me say that I'm not subscribed to the mailing list, so please CC myself when responding.
>
> This post is to discuss a possible extension to the GNU C language. Note that this is still an idea and not refined.
>
> [...]
>
> The Extension (Basic)
> =====================
>
> First, let's introduce a new syntax (note again, this is just an example. I don't suggest these particular symbols):
>
>     float inverse(int x)
>     {
>         if (x == 0)
>             fail;
>         return 1.0f / x;
>     }
>
>     ...
>
>     y = inverse(x) !! goto exit_inverse_failed;

Syntax is not that important. To experiment with your idea, I would suggest using a mixture of pragmas and builtins; you could perhaps have a new builtin_shahbaz_fail() and a pragma #pragma SHAHBAZ, and then your temporary syntax would be

    float inverse(int x)
    {
        if (x == 0)
            builtin_shahbaz_fail();
        return 1.0f / x;
    }

    #pragma SHAHBAZ on_error_goto(exit_inverse_failed)
    {
        y = inverse(x);
    }

Then, you don't need to dig into the GCC parser to add this builtin and pragma. You could add them with a GCC plugin (in C++) or using MELT http://gcc-melt.org/

Once you have added a GCC pass to support your builtin and pragma (which is difficult, and means understanding the details of GCC's internals), you could convince other people.

Notice that the GCC community is not friendly these days to new syntactic constructs.

BTW, once you have implemented a builtin and a pragma you could use preprocessor macros to make these look more like your syntax.

I believe that MELT is very well suited for such experiments.

Regards.

PS. Plugins cannot extend the C syntax (except through attributes, builtins, pragmas).

-- Basile STARYNKEVITCH
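The remark about preprocessor macros can be tried out even without the builtin and pragma. The sketch below deliberately swaps in a different technique from the one Basile describes: it uses only ISO C, with an ordinary status-returning `inverse` and a hypothetical `ON_FAIL_GOTO` macro. It shows what the surface syntax could look like, not how a plugin would implement it.

```c
/* Hypothetical macro, not a GCC feature: evaluate a status-returning
   call and jump to LABEL when it reports failure (nonzero). */
#define ON_FAIL_GOTO(call, label) \
    do { if ((call) != 0) goto label; } while (0)

/* Status-returning stand-in for the proposed `fail`-able inverse(). */
static int inverse(int x, float *y)
{
    if (x == 0)
        return -1;
    *y = 1.0f / x;
    return 0;
}

/* Returns 0 on success and 1 when inverse() failed. */
int try_inverse(int x, float *y)
{
    ON_FAIL_GOTO(inverse(x, y), exit_inverse_failed);
    return 0;

exit_inverse_failed:
    return 1;
}
```

The macro cannot recover the original goal of keeping the natural return type, since `inverse` still has to return a status; that part is what would need compiler support.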
[RL78] Questions about code-generation
Hi,

The code produced by GCC for the RL78 target is around twice as large as that produced by IAR and I've been trying to find out why.

The project I'm working on uses an RL78/F12 with 16KB of code flash. As I have to get a bootloader and an application into that, I have to pay close attention to how large the code is becoming.

Looking at the assembler output for some simple examples, the problem seems to be 'bloated' code as opposed to not squeezing every last byte out through the use of ingenious optimization tricks.

I've managed to build GCC myself so that I could experiment a bit, but as this is my first foray into compiler internals, I'm struggling to work out how things fit together and what affects what.

My initial impression is that significant gains could be made by clearing away some low-hanging fruit, but without understanding what caused that code to be generated in the first place, it's hard to do anything about it.

In particular, I'd be interested to know what is caused (or could be improved) by the RL78-specific code, and what comes from the generic part of GCC.

Here's an example extracted from one of the functions in our project:

    unsigned short gOrTest;
    #define SOE0 (*(volatile unsigned short *)0xF012A)

    void orTest()
    {
        SOE0 |= 3;
        /* gOrTest |= 3; */
    }

This produces the following code (using -Os):

    29 0000 C9 F2 2A 01   movw r10, #298
    30 0004 AD F2         movw ax, r10
    31 0006 16            movw hl, ax
    32 0007 AB            movw ax, [hl]
    33 0008 BD F4         movw r12, ax
    34 000a 60            mov a, x
    35 000b 6C 03         or a, #3
    36 000d 9D F0         mov r8, a
    37 000f 8D F5         mov a, r13
    38 0011 9D F1         mov r9, a
    39 0013 AD F2         movw ax, r10
    40 0015 12            movw bc, ax
    41 0016 AD F0         movw ax, r8
    42 0018 78 00 00      movw [bc], ax
    43 001b D7            ret

There's so much unnecessary register passing going on there (#298 could go straight into HL, why does the same value end up in BC even though HL hasn't been touched? etc.)

Commenting out the 'SOE0' line and bringing the 'gOrTest' line back in generates better code (but still worthy of optimization):

    29 0000 8F 00 00      mov a, !_gOrTest
    30 0003 6C 03         or a, #3
    31 0005 9F 00 00      mov !_gOrTest, a
    32 0008 8F 00 00      mov a, !_gOrTest+1
    33 000b 6C 00         or a, #0
    34 000d 9F 00 00      mov !_gOrTest+1, a
    35 0010 D7            ret

What causes that code to be generated when using a variable instead of a fixed memory address?

Even allowing for the unnecessary 'or a, #0' and keeping to a 16-bit access, it's still possible to perform the same operation in half the space of the original (14 bytes instead of 28):

    29 0000 36 2A 01      movw hl, #298
    30 0003 AB            movw ax, [hl]
    31 0004 75            mov d, a
    32 0005 60            mov a, x
    33 0006 6C 03         or a, #3
    34 0008 70            mov x, a
    35 0009 65            mov a, d
    36 000a 6C 00         or a, #0
    37 000c BB            movw [hl], ax
    38 000d D7            ret

And, of course, that could be optimized further.

Excessive register copying and an apparent preference for R8 onwards over the B, C, D, E, H and L registers (which could save a byte on every 'mov') seem to be among the main causes of 'bloated' code.

So, I guess my question is how much of the bloat comes from inefficiencies in the hardware-specific code?

I saw a comment in the RL78 code about performing CSE optimization but it's not clear to me where or how that would be done.

I tried to look at the code for some other processors to get an idea but it's hard to find things when you don't know what you're looking for :)

Any help would be gratefully received!

Regards,

Richard Hulme
Re: GNU C extension: Function Error vs. Success
On 10.03.2014 18:27, Shahbaz Youssefi wrote:
> FILE *fin = fopen(filename, "r") !! goto exit_no_file;

Or maybe permission denied? ;-)
Re: GNU C extension: Function Error vs. Success
Thanks for the hint. I will try to learn how to do that and experiment with the idea if/when I get the time.

I can imagine why the community isn't interested in new syntax in general. Still, you never know whether an idea will be attractive enough to generate some attention! :)
Re: GNU C extension: Function Error vs. Success
On 03/10/2014 03:09 PM, Shahbaz Youssefi wrote:
> Regarding C++ exceptions: exceptions are not really nice. They can just make your function return without you even knowing it (forgetting a `try/catch` or not knowing it may be needed, which is C++'s fault and probably could have been done better). Also, they require complicated operations. You can read a small complaint about it here: http://stackoverflow.com/a/1746368/912144 and I'm sure there are many others on the internet.

A few quibbles here.

Firstly, C++ exceptions do not require complicated operations: an implementation may well do complicated things, but that's not the same at all. In GCC we use DWARF exception handling, which is designed to be near-zero-cost for exceptions that are not thrown, but is more expensive when they are.

There is no inherent reason why

    float inverse(int x)
    {
        if (x == 0)
            fail;
        return 1.0f / x;
    }

    y = inverse(x) !! goto exit_inverse_failed;

should not generate the same code as

    float inverse(int x)
    {
        if (x == 0)
            throw overflow;
        return 1.0f / x;
    }

    try {
        y = inverse(x);
    } catch (IntegerOverflow e) {
        goto exit_inverse_failed;
    }

This assumes, of course, a knowledgeable optimizing compiler.

Also, consider that C++ can already do almost what you want. Here we have a function that returns a float wrapped with a status:

    option<float> inverse(float x)
    {
        if (x == 0)
            return option<float>();  // No value...
        return 1.0f / x;
    }

    float poo(float x)
    {
        option<float> res = inverse(x);
        if (res.none())
            return 0;
        return res;
    }

GCC generates, quite nicely:

    poo(float):
            xorps   %xmm1, %xmm1
            ucomiss %xmm1, %xmm0
            jp      .L12
            jne     .L12
            movaps  %xmm1, %xmm0
            ret
    .L12:
            movss   .LC1(%rip), %xmm1
            divss   %xmm0, %xmm1
            movaps  %xmm1, %xmm0
            ret

The difference between

    y = inverse(x) !! goto exit_inverse_failed;

and

    option<float> y = inverse(x);
    if (y.none())
        goto exit_inverse_failed;

is, I suggest to you, mere syntax. The latter is more explicit.

Andrew.
Re: GNU C extension: Function Error vs. Success
I'm mostly interested in C. Nevertheless, you can of course also do the same in C:

    #include <errno.h>
    #include <stdbool.h>

    struct option_float
    {
        float value;
        int error_code;
        bool succeeded;
    };

    struct option_float inverse(int x)
    {
        if (x == 0)
            return (struct option_float){ .succeeded = false, .error_code = EDOM };
        return (struct option_float){ .value = 1.0f / x, .succeeded = true };
    }

you get the idea. The difference is that it's hard to optimize the non-error execution path if the compiler is not aware of the semantics. Also, with exceptions, this can happen:

    float inverse(int x)
    {
        if (x == 0)
            throw overflow;
        return 1.0f / x;
    }

    y = inverse(x);

Which means control is taken from the function calling inverse without it explicitly allowing it, which is not in the spirit of C.

P.S. programming in a lot of languages is _mere syntax_ with respect to some others. Still, some syntaxes are good and some not. If we can improve GNU C's syntax to be shorter, but without loss of expressiveness or clarity, then why not!
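The struct-based variant from Shahbaz's reply compiles as C99 once the headers are added. The sketch below completes it with a caller; `checked_inverse` and its convention of returning the error code are assumptions made here for illustration.

```c
#include <errno.h>
#include <stdbool.h>

/* A float bundled with its success state and an error code. */
struct option_float {
    float value;
    int error_code;
    bool succeeded;
};

struct option_float inverse(int x)
{
    if (x == 0)
        return (struct option_float){ .succeeded = false, .error_code = EDOM };
    return (struct option_float){ .value = 1.0f / x, .succeeded = true };
}

/* Hypothetical caller: stores the result in *y and returns 0 on
   success, or returns the error code on failure and leaves *y alone. */
int checked_inverse(int x, float *y)
{
    struct option_float res = inverse(x);

    if (!res.succeeded)
        goto exit_inverse_failed;
    *y = res.value;
    return 0;

exit_inverse_failed:
    return res.error_code;
}
```

Nothing here forces the caller to test `succeeded` before reading `value`, which is one practical difference from the proposal, where a function without a declared fail value must be checked.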
Re: GNU C extension: Function Error vs. Success
On 03/10/2014 05:26 PM, Shahbaz Youssefi wrote:
> I'm mostly interested in C. Nevertheless, you can of course also do the same in C:
>
>     struct option_float
>     {
>         float value;
>         int error_code;
>         bool succeeded;
>     };
>
>     struct option_float inverse(int x)
>     {
>         if (x == 0)
>             return (struct option_float){ .succeeded = false, .error_code = EDOM };
>         return (struct option_float){ .value = 1.0f / x, .succeeded = true };
>     }

Well, yes. This is rather wordy, but indeed it does the same thing.

> P.S. programming in a lot of languages is _mere syntax_ with respect to some others. Still, some syntaxes are good and some not. If we can improve GNU C's syntax to be shorter, but without loss of expressiveness or clarity, then why not!

Because C is a simple language. That's a feature: if you want more language complexity, and C++ can already do what you want, why not use C++? The usual argument is "I don't want all this other stuff." Well, don't use it, then! There seem to be many people who want what C++ can do, but say "I don't want to use C++."

Andrew.
Re: [gsoc 2014] moving fold-const patterns to gimple
On Fri, Mar 7, 2014 at 2:38 PM, Richard Biener richard.guent...@gmail.com wrote: On Thu, Mar 6, 2014 at 7:17 PM, Prathamesh Kulkarni bilbotheelffri...@gmail.com wrote: On Thu, Mar 6, 2014 at 6:13 PM, Richard Biener richard.guent...@gmail.com wrote: On Thu, Mar 6, 2014 at 1:11 PM, Prathamesh Kulkarni bilbotheelffri...@gmail.com wrote: On Mon, Mar 3, 2014 at 3:32 PM, Richard Biener richard.guent...@gmail.com wrote: On Sun, Mar 2, 2014 at 9:13 PM, Prathamesh Kulkarni bilbotheelffri...@gmail.com wrote: Hi, I am an undergraduate student at University of Pune, India, and would like to work on moving folding patterns from fold-const.c to gimple. I've seen the entry on our GSoC project page and edited it to discourage people from working on that line. See http://gcc.gnu.org/ml/gcc/2014-02/msg00516.html for why. I think that open-coding the transforms isn't maintainable in the long run. If I understand correctly, constant folding is done on GENERIC (by routines in fold-const.c), and then GENERIC is lowered to GIMPLE. The purpose of this project, is to have constant folding to be performed on GIMPLE instead (in gimple-fold.c?) I have a few elementary questions to ask: a) A contrived example: Consider a C expression, a = ~0 (assume a is int) In GENERIC, this would roughly be represented as: modify_exprvar_decl a, bit_not_exprinteger_cst 0 this gets folded to: modify_exprvar_decl a, integer_cst -1 and the corresponding gimple tuple generated is (-fdump-tree-gimple-raw): gimple_assign integer_cst, x, -1, NULL, NULL So, instead of folding performed on GENERIC, it should be done on GIMPLE. So a tuple like the following should be generated by gimplification: bit_not_expr, a, 0, NULL, NULL and folded to (by call to fold_stmt): integer_cst, a, -1, NUL, NULL Is this the expected behavior ? 
I have attached a rough/incomplete patch (only stage1 compiled cc1), that does the following foldings on bit_not_expr: a) ~ INTEGER_CST = folded b) ~~x = x c) ~(-x) = x - 1 (For the moment, I put case BIT_NOT_EXPR: return NULL_TREE in fold_unary_loc to avoid folding in GENERIC on bit_not_expr) Is the patch going in the correct direction ? Or have I completely missed the point here ? I would be grateful to receive suggestions, and start working on a fair patch. I think you implement what was suggested by Kai (and previously by me and Andrew, before I changed my mind). Hi Richard, Thanks for your reply and for pointing me out to this thread http://gcc.gnu.org/ml/gcc/2014-02/msg00516.html I agree it's better to generate patterns from a meta-description instead of hand-coding, and the idea seems interesting to me. I was playing around with the patch and did few trivial modifications (please find the patch attached): a) use obstack in parse_c_expr. b) use @ inside c code, instead of directly writing captures (like $num in bison): example: /* Match and simplify CST + CST to CST'. */ (define_match_and_simplify baz (PLUS_EXPR INTEGER_CST_P@0 INTEGER_CST_P@1) { int_const_binop (PLUS_EXPR, @0, @1); }) c) Not sure if this is a good idea, conditional matching. for example: /* match (A * B) and simplify to * B if integer_zerop B is true ( A * 0 = 0) * A if integer_onep B is true (A * 1 = A) */ (define_match_and_simplify multexpr (MULT_EXPR integral_op_p@0 integral_op_p@1) [ (integer_zerop@1 @1) (integer_onep@1 @0) ]) Maybe condition can be generalized to be any operand instead of testing predicate on capture operand ? I would be grateful to receive some direction for working on this project. From the thread, I see a few possibilities: a) Moving patterns from tree-ssa-forwprop b) Extending the DSL (handle commutative operators, conditionally enabling patterns ?) c) Targeting GENERIC (Generating patterns in fold-const.c from the description ?) 
d) This is a bit silly, but maybe perform more error checking? For example, the following pattern is currently accepted: (define_match px (PLUS_EXPR @0 @1 @2)) Note that I'm currently still hacking on this (see attachment for what I have right now). The grammar is still in flux but I'd like to keep it simple for now (so no conditional replacement). I have changed quite a few bits, so d) should be easily possible now, and I've done b) from your list as well. For the moment I'm trying to see whether the design is sound, especially the GCC-side APIs. I hope to finish this this week (fingers crossed), and also settle on the syntax (see variants in the .pd). As for opening this up for a GSoC project to finish or work on, that's a good idea. In addition to a) Moving patterns from tree-ssa-forwprop, which I think is the place where it's easiest to plug this in without regressions, it would be nice if you could work on e) Generate a state machine for the matching part, instead of trying one pattern after another (see how insn-recog.c is produced). I hope to cleanup
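The three BIT_NOT_EXPR foldings discussed in this thread (~CST, ~~x = x, ~(-x) = x - 1) can be sketched on a toy expression tree. This is only an illustration under stated assumptions: struct expr, enum expr_op and fold_bit_not are hypothetical names invented here and are not GCC's tree or gimple API, nor the pattern language from the attachment.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical miniature expression tree; these names are illustrative
   only and do not correspond to GCC's tree or gimple types.  */
enum expr_op { OP_CONST, OP_VAR, OP_NOT, OP_NEG, OP_MINUS_ONE };

struct expr
{
  enum expr_op op;
  long value;            /* Used by OP_CONST only.  */
  struct expr *operand;  /* Used by the unary operators.  */
};

/* Simplify E if its root is a bitwise NOT.  Returns the simplified
   node; OUT provides storage in case a new node has to be built.
   If no rule applies, E itself is returned unchanged.  */
static struct expr *
fold_bit_not (struct expr *e, struct expr *out)
{
  if (e->op != OP_NOT)
    return e;

  struct expr *arg = e->operand;
  assert (arg != NULL);

  switch (arg->op)
    {
    case OP_CONST:            /* ~CST -> CST'  */
      out->op = OP_CONST;
      out->value = ~arg->value;
      out->operand = NULL;
      return out;

    case OP_NOT:              /* ~~x -> x  */
      return arg->operand;

    case OP_NEG:              /* ~(-x) -> x - 1  */
      out->op = OP_MINUS_ONE;
      out->value = 0;
      out->operand = arg->operand;
      return out;

    default:
      return e;
    }
}
```

Each case is one hand-written match-and-simplify rule. The objection raised in the thread is that open-coding many such cases is hard to maintain, which is the motivation for generating them from a pattern description instead.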
Re: [RL78] Questions about code-generation
I've managed to build GCC myself so that I could experiment a bit but as this is my first foray into compiler internals, I'm struggling to work out how things fit together and what affects what. The key thing to know about the RL78 backend is that it has two targets it uses. For the first part of the compilation, up until after reload, the model uses 16 virtual registers (R8 through R15) and a virtual machine to give gcc an orthogonal model that it can generate code for. After reload, there's a devirtualization pass in the RL78 backend that maps the virtual model to the real model (R0 through R7), which means copying values in and out of the real registers according to which addressing modes are needed. Then GCC continues optimizing, which gets rid of most of the unneeded instructions. The problem you're probably running into is that deciding which real registers to use for each virtual one is a very tricky task, and the post-reload optimizers aren't expecting the code to look like what it does. What causes that code to be generated when using a variable instead of a fixed memory address? The use of volatile disables many of GCC's optimizations. I consider this a bug in GCC, but at the moment it needs to be fixed in the backends on a case-by-case basis.
Re: [RL78] Questions about code-generation
On 10/03/14 22:37, DJ Delorie wrote: I've managed to build GCC myself so that I could experiment a bit but as this is my first foray into compiler internals, I'm struggling to work out how things fit together and what affects what. The key thing to know about the RL78 backend, is that it has two targets it uses. For the first part of the compilation, up until after reload, the model uses 16 virtual registers (R8 through R15) and a virtual machine to give gcc an orthogonal model that it can generate code for. After reload, there's a devirtualization pass in the RL78 backend that maps the virtual model to the real model (R0 through R7), which means copying values in and out of the real registers according to which addressing modes are needed. Then GCC continues optimizing, which gets rid of most of the unneeded instructions. The problem you're probably running into is that deciding which real registers to use for each virtual one is a very tricky task, and the post-reload optimizers aren't expecing the code to look like what it does. What causes that code to be generated when using a variable instead of a fixed memory address? The use of volatile disables many of GCC's optimizations. I consider this a bug in GCC, but at the moment it needs to be fixed in the backends on a case-by-case basis. Ah, that certainly explains a lot. How exactly would the fixing be done? Is there an example I could look at for one of the other processors? It's certainly unfortunate, since an awful lot of bit-twiddling goes on with the memory-mapped hardware registers (which obviously generally need to be declared volatile). Just to get a feel for the potential gains, I've removed the volatile keyword from all the declarations and rebuilt the project. That change alone reduces the code size by 3.7%. I wouldn't want to risk running that code but the gain is certainly significant. 
I calculated a week or two ago that we could make a code-saving of around 8% by using near or relative branches and near calls instead of always generating far calls. I changed rl78-real.md to use near addressing and got about 5%. That's probably about right. I tried to generate relative branches too but I'm guessing that the 'length' attribute needs to be set for all instructions to get that working properly. Obviously near/far addressing would need to be controlled by an external switch to allow for processors with more than 64KB code-flash. A few small gains can be had elsewhere (using 'clrb a' in zero_extendqihi2_real, possibly optimizing addsi3_internal_real to avoid addw ax,#0 etc.). These don't save much space in our project (about 30-40 bytes perhaps) but it'll obviously vary from project to project. Regards, Richard
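For readers unfamiliar with the bit-twiddling pattern mentioned above, here is a minimal sketch of read-modify-write access to a volatile register. It assumes a plain variable, fake_port, as a stand-in for a memory-mapped hardware register; that name and the helper functions are hypothetical, and nothing here claims what code the RL78 backend actually emits for them.

```c
#include <assert.h>
#include <stdint.h>

/* Stand-in for a memory-mapped port register.  On real hardware this
   would live at a fixed address; the name is hypothetical.  */
static volatile uint8_t fake_port;

/* Set one bit: a volatile read followed by a volatile write.  */
static void
set_bit (volatile uint8_t *reg, unsigned bit)
{
  assert (bit < 8);
  *reg = (uint8_t) (*reg | (1u << bit));
}

/* Clear one bit, again as a read-modify-write sequence.  */
static void
clear_bit (volatile uint8_t *reg, unsigned bit)
{
  assert (bit < 8);
  *reg = (uint8_t) (*reg & ~(1u << bit));
}

/* Read back a single bit as 0 or 1.  */
static int
test_bit (const volatile uint8_t *reg, unsigned bit)
{
  assert (bit < 8);
  return (*reg >> bit) & 1u;
}
```

Because the object is volatile, the compiler has to perform every read and write that the source specifies, so code of this shape is where the disabled optimizations described above would be felt most.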
Re: [RL78] Questions about code-generation
Ah, that certainly explains a lot. How exactly would the fixing be done? Is there an example I could look at for one of the other processors? No, RL78 is the first that uses this scheme. I calculated a week or two ago that we could make a code-saving of around 8% by using near or relative branches and near calls instead of always generating far calls. I changed rl78-real.md to use near addressing and got about 5%. That's probably about right. I tried to generate relative branches too but I'm guessing that the 'length' attribute needs to be set for all instructions to get that working properly. Or the linker could be taught to optimize branches once it knows the full displacement, but that can be even trickier to get right.
Re: [RL78] Questions about code-generation
DJ, On Mon, 2014-03-10 at 20:17 -0400, DJ Delorie wrote: Ah, that certainly explains a lot. How exactly would the fixing be done? Is there an example I could look at for one of the other processors? No, RL78 is the first that uses this scheme. I'm curious. Have you tried out other approaches before you decided to go with the virtual registers? Cheers, Oleg
Re: [RL78] Questions about code-generation
I'm curious. Have you tried out other approaches before you decided to go with the virtual registers? Yes. Getting GCC to understand the unusual addressing modes the RL78 uses was too much for the register allocator to handle. Even when the addressing modes are limited to usual ones, GCC doesn't have a good way to do regalloc and reload when there are limits on what registers you can use in an address expression, and it's worse when there are dependencies between operands, or limited numbers of address registers.
Scheduler:LLVM vs gcc, which is better
Hi, I have been reading the LLVM code for a while, and a question arose: whose scheduler is better? LLVM brings in the DAG and makes it look as important as the IR or MachineInst. But is that necessary? I don't see what kind of problem it tries to solve. Judging from the compiler pipeline, LLVM cannot do sched2. Is that a drawback? -- Regards lin zuojian.
Re: Scheduler:LLVM vs gcc, which is better
On Mon, Mar 10, 2014 at 6:59 PM, lin zuojian manjian2...@gmail.com wrote: Hi, I read LLVM code for a while,and a question raise:Whose scheduler is better? LLVM brings in the DAG,and make it look important just like IR or MachineInst.But is that necessary?I don't see what kind of problem it tries to solve. From the pipeline of the compiler, LLVM can not do sched2.Is that suck? I clearly can't speak for GCC developers, but as an LLVM developer I have to say, this seems like a (somewhat rudely phrased) question for the LLVM mailing lists where there are people more familiar with the LLVM internals. Happy to reply in more depth there (or here if folks are actually interested).
Re: Scheduler:LLVM vs gcc, which is better
On Mon, Mar 10, 2014 at 07:11:43PM -0700, Chandler Carruth wrote: On Mon, Mar 10, 2014 at 6:59 PM, lin zuojian manjian2...@gmail.com wrote: Hi, I read LLVM code for a while,and a question raise:Whose scheduler is better? LLVM brings in the DAG,and make it look important just like IR or MachineInst.But is that necessary?I don't see what kind of problem it tries to solve. From the pipeline of the compiler, LLVM can not do sched2.Is that suck? I clearly can't speak for GCC developers, but as an LLVM developer I have to say, this seems like a (somewhat rudely phrased) question for the LLVM mailing lists where there are people more familiar with the LLVM internals. Happy to reply in more depth there (or here if folks are actually interested). Hi, I am just asking for opinions. I think many GCC developers are familiar with the competing compiler. If I ask on the LLVM mailing list, I have to worry about whether they are familiar with GCC, too (what's the sched2 pass?). -- Regards lin zuojian
Re: Scheduler:LLVM vs gcc, which is better
On Mon, Mar 10, 2014 at 7:33 PM, lin zuojian manjian2...@gmail.com wrote: Hi, I just ask for opinions.I think many GCC developers do familiar with the opponent.If I ask in the LLVM mailing list, I have to worry about If they are familiar with GCC, too(what's sched2 pass?). I suspect you will have the same problem on both lists. The internal details of the scheduling system are not likely to be widely known honestly. To provide a very brief summary of what is going on in LLVM to the best of my knowledge (although I am not one of the experts on this area): The DAG (or more fully, the SelectionDAG) is not really relevant to scheduling any more[1]. It is just a mechanism used for legalization and combining the target-specific representation prior to producing very low level MI or Machine Instructions. That is, it is entirely an instruction selection tool, not a scheduling tool. The scheduling takes place amongst the machine instructions either before or after register allocation (depending on the target) and with a wide variety of heuristics. My understanding is that it is attempting to solve the same fundamental scheduling problems as GCC's infrastructure (ILP and register pressure). The infrastructure is the bulk of it though, and that is likely entirely specific to the representation and innards of the respective compilers. Given that LLVM's machine level scheduler is significantly younger than GCC's, I would expect it to be less well tuned and have less complete and/or accurate modeling for some targets. -Chandler [1] Historically, we did both instruction selection and scheduling using the DAG, but the scheduling has moved into the MI layer specifically to address register pressure and ILP concerns that were hard/impossible to handle at a higher level. There is some hope that the DAG goes away completely and is replaced with some simpler selection and legalization framework, but that hasn't yet emerged.
Re: Scheduler:LLVM vs gcc, which is better
Hi Chandler, Thanks a lot for your answer. It is pretty misleading that the DAG has a schedule unit. -- Regards lin zuojian
[Bug middle-end/60478] convert_move assert failed
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=60478

Marek Polacek mpolacek at gcc dot gnu.org changed:

What       |Removed     |Added
Status     |UNCONFIRMED |RESOLVED
CC         |            |mpolacek at gcc dot gnu.org
Resolution |---         |DUPLICATE

--- Comment #1 from Marek Polacek mpolacek at gcc dot gnu.org ---
You've filed the same bug twice.

*** This bug has been marked as a duplicate of bug 60479 ***
[Bug middle-end/60479] convert_move assert failed
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=60479 --- Comment #1 from Marek Polacek mpolacek at gcc dot gnu.org --- *** Bug 60478 has been marked as a duplicate of this bug. ***
FW: GCC global variable register optimization issue
gcc 4.8.2 fails to optimize global register variables when compiling on x86_64 Linux. Consider the following code:

#include <stdint.h>

register uint64_t i0_BP __asm__ ("r14");
register uint64_t i0_SP __asm__ ("r15");

void test(void)
{
    *((uint64_t*) (i0_SP - 8)) = i0_BP;
    i0_BP = i0_SP - 0x8;
    i0_SP -= 0x100;
    i0_SP = i0_BP;
    i0_BP = *((uint64_t*) i0_SP);
    i0_SP += 0x8;
    return;
}

With either the -O3 or the -Os option, the final object file contains the same code:

test:
   0: lea 0xfff8(%r15),%rcx
   4: mov %r14,%rdx
   7: mov %r15,%rax
   a: mov %r14,0xfff8(%r15)
   e: mov %rcx,%r14
  11: mov %rcx,%r15
  14: mov %rdx,%r14
  17: mov %rax,%r15
  1a: retq

Here we just try to emulate a function call. In the object file, there are apparently lots of redundant movs between registers. It seems to be a bug in gcc, since we have already applied the maximum optimization level possible.

Environment: On CentOS 5.10 (Linux 2.6.18 x86_64) using GCC 4.8.2

Using built-in specs.
COLLECT_GCC=gcc4
COLLECT_LTO_WRAPPER=/usr/local/GNU/gcc-4.8.2/libexec/gcc/x86_64-unknown-linux-gnu/4.8.2/lto-wrapper
Target: x86_64-unknown-linux-gnu
Configured with: ../gcc-4.8.2/configure --prefix=/usr/local/GNU/gcc-4.8.2 --enable-clocale=generic
Thread model: posix
gcc version 4.8.2 (GCC)
[Bug target/60480] New: gcc 4.8.2 fails to do optimization on global register variables when compiling on x86_64 Linux.
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=60480

Bug ID: 60480
Summary: gcc 4.8.2 fails to do optimization on global register variables when compiling on x86_64 Linux.
Product: gcc
Version: 4.8.2
Status: UNCONFIRMED
Severity: normal
Priority: P3
Component: target
Assignee: unassigned at gcc dot gnu.org
Reporter: ganboing at gmail dot com

gcc 4.8.2 fails to optimize global register variables when compiling on x86_64 Linux. Consider the following code:

#include <stdint.h>

register uint64_t i0_BP __asm__ ("r14");
register uint64_t i0_SP __asm__ ("r15");

void test(void)
{
    *((uint64_t*) (i0_SP - 8)) = i0_BP;
    i0_BP = i0_SP - 0x8;
    i0_SP -= 0x100;
    i0_SP = i0_BP;
    i0_BP = *((uint64_t*) i0_SP);
    i0_SP += 0x8;
    return;
}

With either the ‘-O3’ or the ‘-Os’ option, the final object file contains the same code:

test:
   0: lea 0xfff8(%r15),%rcx
   4: mov %r14,%rdx
   7: mov %r15,%rax
   a: mov %r14,0xfff8(%r15)
   e: mov %rcx,%r14
  11: mov %rcx,%r15
  14: mov %rdx,%r14
  17: mov %rax,%r15
  1a: retq

Here we just try to emulate a frame establishment. In the object file, there are apparently lots of redundant ‘mov’s between registers. It seems to be a bug in gcc, since we have already applied the maximum optimization level possible.

Environment: On CentOS 5.10 (Linux 2.6.18 x86_64) using GCC 4.8.2

Using built-in specs.
COLLECT_GCC=gcc4
COLLECT_LTO_WRAPPER=/usr/local/GNU/gcc-4.8.2/libexec/gcc/x86_64-unknown-linux-gnu/4.8.2/lto-wrapper
Target: x86_64-unknown-linux-gnu
Configured with: ../gcc-4.8.2/configure --prefix=/usr/local/GNU/gcc-4.8.2 --enable-clocale=generic
Thread model: posix
gcc version 4.8.2 (GCC)
[Bug target/60480] gcc 4.8.2 fails to do optimization on global register variables when compiling on x86_64 Linux.
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=60480 --- Comment #1 from Andrew Pinski pinskia at gcc dot gnu.org --- This is due to x86 being a small register class target.
[Bug target/60480] gcc 4.8.2 fails to do optimization on global register variables when compiling on x86_64 Linux.
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=60480 --- Comment #2 from ganboing at gmail dot com --- (In reply to Andrew Pinski from comment #1) This is due to x86 being a small register class target. The thing is that x86_64 has 16 GPRs, and registers r12-r15 are preserved across function calls (SYSV ABI x86_64). There should be no reason that such an optimization can't be done.
[Bug target/59726] [4.9 Regression] r206148 exposes broken vec_perm for big-endian aarch64; ICE at -O3
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=59726

Ramana Radhakrishnan ramana at gcc dot gnu.org changed:

What       |Removed |Added
Status     |NEW     |RESOLVED
CC         |        |ramana at gcc dot gnu.org
Resolution |---     |FIXED

--- Comment #2 from Ramana Radhakrishnan ramana at gcc dot gnu.org ---
For 4.9 we just ended up disabling the vec_perm support for AArch64 BE, thanks to this patch - http://gcc.gnu.org/ml/gcc-patches/2014-01/msg00988.html

For the next stage1 Tejas had a patch that tried to fix these up properly in the backend. http://gcc.gnu.org/ml/gcc-patches/2014-02/msg01334.html

Therefore on that ground I think this should be closed, as I don't see these testsuite failures on aarch64_be today.

ramana
[Bug target/60459] Crash seen in _Unwind_VRS_Pop() for ARM platform
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=60459

Ramana Radhakrishnan ramana at gcc dot gnu.org changed:

What             |Removed     |Added
Status           |UNCONFIRMED |WAITING
Last reconfirmed |            |2014-03-10
CC               |            |ramana at gcc dot gnu.org
Ever confirmed   |0           |1

--- Comment #2 from Ramana Radhakrishnan ramana at gcc dot gnu.org ---
4.2.1 is completely unsupported. There is not enough information here to try and reproduce the issue either - can you please follow instructions here http://gcc.gnu.org/bugs/ for reporting issues with the compiler ?
[Bug target/60298] [ARM/Thumb1] ICE caused by LRA for case pr54713-1.c
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=60298

Ramana Radhakrishnan ramana at gcc dot gnu.org changed:

What             |Removed     |Added
Keywords         |            |ice-on-valid-code
Status           |UNCONFIRMED |RESOLVED
CC               |            |ramana at gcc dot gnu.org
Resolution       |---         |FIXED
Target Milestone |---         |4.9.0

--- Comment #3 from Ramana Radhakrishnan ramana at gcc dot gnu.org ---
Fixed as per Vlad's last comment.
[Bug target/60109] __builtin_frame_address does not work as documented on ARM
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=60109

Ramana Radhakrishnan ramana at gcc dot gnu.org changed:

What       |Removed     |Added
Status     |UNCONFIRMED |RESOLVED
Resolution |---         |WONTFIX

--- Comment #3 from Ramana Radhakrishnan ramana at gcc dot gnu.org ---
WONTFIX as there have been no further comments and based on the last 2 comments.
[Bug c++/60106] ICE in g++.dg/gomp/pr59150.C
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=60106

Ramana Radhakrishnan ramana at gcc dot gnu.org changed:

What |Removed |Added
CC   |        |ramana at gcc dot gnu.org

--- Comment #1 from Ramana Radhakrishnan ramana at gcc dot gnu.org ---
Can you please add your configure flags here ?
[Bug c++/60106] ICE in g++.dg/gomp/pr59150.C
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=60106 --- Comment #2 from Bernd Edlinger bernd.edlinger at hotmail dot de --- (In reply to Ramana Radhakrishnan from comment #1) Can you please add your configure flags here ? Sure, ../gcc-4.9-20140202/configure --prefix=/home/ed/gnu/arm-linux-gnueabihf --enable-languages=c,c++,objc,obj-c++,fortran,ada,go --with-arch=armv7-a --with-tune=cortex-a9 --with-fpu=vfpv3-d16 --with-float=hard
[Bug c++/60474] [4.9 Regression] Crash in tree_class_check
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=60474

Kai Tietz ktietz at gcc dot gnu.org changed:

What |Removed |Added
CC   |        |ktietz at gcc dot gnu.org

--- Comment #3 from Kai Tietz ktietz at gcc dot gnu.org ---
Issue is that in double_int_ext_for_comb we try to get type-precision of a comb-type, where type is a NULL_TREE.
[Bug rtl-optimization/60473] optimization after shift sub-optimal
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=60473

--- Comment #1 from Martin marmoo1024 at gmail dot com ---
After some checking I've found that the problem is with the binary OR operator. Addition doesn't have the problem, but OR does. Here are my results.

unsigned long long **_rdtsc_64 ()
{
    unsigned long long h,l;
    asm volatile ("rdtsc" : "=a" (l), "=d" (h) );
    return **;
}

x1_rdtsc_64():
    rdtsc              ; return l + h*(0x100000000LLU)
    salq $32, %rdx
    addq %rdx, %rax
    ret

x2_rdtsc_64():
    rdtsc              ; return l | h*(0x100000000LLU)
    salq $32, %rdx
    orq  %rax, %rdx
    movq %rdx, %rax
    ret

x3_rdtsc_64():
    rdtsc              ; return l + (h<<32)
    salq $32, %rdx
    addq %rdx, %rax
    ret

x4_rdtsc_64():
    rdtsc              ; return l | (h<<32)
    salq $32, %rdx
    orq  %rax, %rdx
    movq %rdx, %rax
    ret
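The variants compared in this comment compute the same value whenever the low half fits in 32 bits, which is why the extra movq in the OR versions looks like a missed optimization rather than a semantic difference. A portable sketch of that equivalence follows; it assumes the two halves are passed in as plain uint32_t arguments instead of being read with rdtsc, and the function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Combine two 32-bit halves using addition.  */
static uint64_t
combine_add (uint32_t lo, uint32_t hi)
{
  return (uint64_t) lo + ((uint64_t) hi << 32);
}

/* Combine two 32-bit halves using bitwise OR.  */
static uint64_t
combine_or (uint32_t lo, uint32_t hi)
{
  return (uint64_t) lo | ((uint64_t) hi << 32);
}

/* Combine using a multiplication by 2^32 instead of a shift.  */
static uint64_t
combine_mul (uint32_t lo, uint32_t hi)
{
  return (uint64_t) lo + (uint64_t) hi * 0x100000000ULL;
}

/* The shifted high half has its low 32 bits clear, so it shares no
   set bits with LO: add, or and multiply-add must all agree.  */
static void
check_combine (uint32_t lo, uint32_t hi)
{
  assert (combine_add (lo, hi) == combine_or (lo, hi));
  assert (combine_add (lo, hi) == combine_mul (lo, hi));
}
```

Since the operands never overlap, no carry can occur, so a compiler would be free to pick whichever form allocates registers best.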
[Bug c++/60474] [4.9 Regression] Crash in tree_class_check
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=60474

Richard Biener rguenth at gcc dot gnu.org changed:

What     |Removed                       |Added
Status   |NEW                           |ASSIGNED
Assignee |unassigned at gcc dot gnu.org |rguenth at gcc dot gnu.org

--- Comment #4 from Richard Biener rguenth at gcc dot gnu.org ---
Mine.
[Bug middle-end/60419] [4.8/4.9 Regression] ICE Segmentation fault
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=60419 Jakub Jelinek jakub at gcc dot gnu.org changed: What|Removed |Added Status|UNCONFIRMED |NEW Last reconfirmed||2014-03-10 CC||hubicka at gcc dot gnu.org, ||jason at gcc dot gnu.org Ever confirmed|0 |1 --- Comment #6 from Jakub Jelinek jakub at gcc dot gnu.org --- The slsr issue is just a pilot error, I've mistakenly used ~ r205NNN compiler in that case, so it looks like an already fixed issue. Anyway, the ICE on ppc64 with the reduced testcase started with r208184 (thus I wonder about the 4.8 regression status), the problem is that getMeanVal function (method?) calls _ZThn8_NK4mrpt5utils16CPosePDFGaussian7getMeanERNS_5poses7CPose2DE thunk that has NULL node-callee (without -fPIC it ICEs in one spot, with -fPIC in another one). node-callees is set to non-NULL in: #0 cgraph_create_edge (caller=cgraph_node* 0x70f32148 _ZThn8_NK4mrpt5utils16CPosePDFGaussian7getMeanERNS_5poses7CPose2DE, callee=cgraph_node* 0x70f32000 *.LTHUNK0, call_stmt=gimple 0x0, count=0, freq=1000) at ../../gcc/cgraph.c:927 #1 0x008ffe81 in analyze_function ( node=cgraph_node* 0x70f32148 _ZThn8_NK4mrpt5utils16CPosePDFGaussian7getMeanERNS_5poses7CPose2DE) at ../../gcc/cgraphunit.c:611 #2 0x009010b4 in analyze_functions () at ../../gcc/cgraphunit.c:1017 #3 0x00904979 in finalize_compilation_unit () at ../../gcc/cgraphunit.c:2320 #4 0x0068b61d in cp_write_global_declarations () at ../../gcc/cp/decl2.c:4612 #5 0x00d0ee72 in compile_file () at ../../gcc/toplev.c:562 #6 0x00d11015 in do_compile () at ../../gcc/toplev.c:1914 #7 0x00d11180 in toplev_main (argc=8, argv=0x7fffe358) at ../../gcc/toplev.c:1990 #8 0x012c0464 in main (argc=8, argv=0x7fffe358) at ../../gcc/main.c:36 and cleared again in: #0 cgraph_node_remove_callees (node=cgraph_node* 0x70f32148 _ZThn8_NK4mrpt5utils16CPosePDFGaussian7getMeanERNS_5poses7CPose2DE) at ../../gcc/cgraph.c:1617 #1 0x00b2dc63 in symtab_remove_unreachable_nodes (before_inlining_p=false, file=0x0) at ../../gcc/ipa.c:493 #2 
0x0124c93f in ipa_inline () at ../../gcc/ipa-inline.c:2060 #3 0x0124d385 in (anonymous namespace)::pass_ipa_inline::execute (this=0x1c73710) at ../../gcc/ipa-inline.c:2412 #4 0x00c299d6 in execute_one_pass (pass=opt_pass* 0x1c73710 inline(53)) at ../../gcc/passes.c:2229 #5 0x00c2a71b in execute_ipa_pass_list (pass=opt_pass* 0x1c73710 inline(53)) at ../../gcc/passes.c:2607 #6 0x009042ad in ipa_passes () at ../../gcc/cgraphunit.c:2084 #7 0x0090455e in compile () at ../../gcc/cgraphunit.c:2174 #8 0x00904988 in finalize_compilation_unit () at ../../gcc/cgraphunit.c:2329 #9 0x0068b61d in cp_write_global_declarations () at ../../gcc/cp/decl2.c:4612 #10 0x00d0ee72 in compile_file () at ../../gcc/toplev.c:562 #11 0x00d11015 in do_compile () at ../../gcc/toplev.c:1914 #12 0x00d11180 in toplev_main (argc=8, argv=0x7fffe358) at ../../gcc/toplev.c:1990 #13 0x012c0464 in main (argc=8, argv=0x7fffe358) at ../../gcc/main.c:36 At that point the thunk apparently has no callers. But somewhat later it gains one: #0 cgraph_set_edge_callee (e=0x7fffef50a8f0, n=cgraph_node* 0x70f32148 _ZThn8_NK4mrpt5utils16CPosePDFGaussian7getMeanERNS_5poses7CPose2DE) at ../../gcc/cgraph.c:1080 #1 0x008f74a8 in cgraph_make_edge_direct (edge=0x7fffef50a8f0, callee=cgraph_node* 0x70f32148 _ZThn8_NK4mrpt5utils16CPosePDFGaussian7getMeanERNS_5poses7CPose2DE) at ../../gcc/cgraph.c:1313 #2 0x00b1f7ae in ipa_make_edge_direct_to_target (ie=0x7fffef50a8f0, target=function_decl 0x70fc5a00 _ZThn8_NK4mrpt5utils16CPosePDFGaussian7getMeanERNS_5poses7CPose2DE) at ../../gcc/ipa-prop.c:2551 #3 0x00b20091 in try_make_edge_direct_virtual_call (ie=0x7fffef50a8f0, jfunc=0x7085b078, new_root_info=0x1e4cce0) at ../../gcc/ipa-prop.c:2799 #4 0x00b201e2 in update_indirect_edges_after_inlining (cs=0x7fffef9baf08, node=cgraph_node* 0x70ad58f8 getMeanVal, new_edges=0x0) at ../../gcc/ipa-prop.c:2852 #5 0x00b20476 in propagate_info_to_inlined_callees (cs=0x7fffef9baf08, node=cgraph_node* 0x70ad58f8 getMeanVal, new_edges=0x0) at 
../../gcc/ipa-prop.c:2924 #6 0x00b20c3d in ipa_propagate_indirect_call_infos (cs=0x7fffef9baf08, new_edges=0x0) at ../../gcc/ipa-prop.c:3086 #7 0x0124e183 in inline_call (e=0x7fffef9baf08, update_original=true, new_edges=0x0, overall_size=0x0, update_overall_summary=true) at ../../gcc/ipa-inline-transform.c:277 #8 0x0124c6da in
[Bug middle-end/60429] [4.7/4.8/4.9 Regression] Miscompilation (aliasing) with -finline-functions
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=60429

Richard Biener rguenth at gcc dot gnu.org changed:

What     |Removed                       |Added
Status   |NEW                           |ASSIGNED
Assignee |unassigned at gcc dot gnu.org |rguenth at gcc dot gnu.org

--- Comment #17 from Richard Biener rguenth at gcc dot gnu.org ---
I'll have a look.
[Bug lto/60461] [4.9 Regression] LTO linking error at -Os (and above) on x86_64-linux-gnu
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=60461

Richard Biener rguenth at gcc dot gnu.org changed:

What             |Removed |Added
Keywords         |        |lto, wrong-code
CC               |        |hubicka at gcc dot gnu.org
Target Milestone |---     |4.9.0
Summary          |LTO linking error at -Os (and above) on x86_64-linux-gnu |[4.9 Regression] LTO linking error at -Os (and above) on x86_64-linux-gnu
[Bug ipa/60457] [4.9 Regression] ICE in cgraph_get_node
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=60457 --- Comment #3 from Richard Biener rguenth at gcc dot gnu.org --- Looks obvious to me.
[Bug middle-end/60478] convert_move assert failed
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=60478 --- Comment #2 from linzj manjian2006 at gmail dot com --- (In reply to Marek Polacek from comment #1) You've filed the same bug twice. *** This bug has been marked as a duplicate of bug 60479 *** Oops, my hand slipped.
[Bug c++/60474] [4.9 Regression] Crash in tree_class_check
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=60474 --- Comment #5 from Richard Biener rguenth at gcc dot gnu.org --- type = signed_type_for (TREE_TYPE (e1)); tree_to_aff_combination (e1, type, aff_e1); tree_to_aff_combination (e2, type, aff_e2); signed_type_for (offset_type 0x76d9cc78) returns NULL_TREE. OFFSET_TYPE in the IL ... ugh. void fn1(Aint, int Layer::*, int) (struct A p1, Unknown tree: offset_type p2, int p3) { I have a patch.
[Bug target/60481] New: [4.9 Regression] Missing diagnostic ISO C++ forbids declaration of 'foo' with no type
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=60481

Bug ID: 60481
Summary: [4.9 Regression] Missing diagnostic ISO C++ forbids declaration of 'foo' with no type
Product: gcc
Version: 4.9.0
Status: UNCONFIRMED
Severity: normal
Priority: P3
Component: target
Assignee: unassigned at gcc dot gnu.org
Reporter: d.g.gorbachev at gmail dot com
Target: *-*-mingw32

$ cat > foo.C
foo() { return 0; }
^D
$ i686-w64-mingw32-g++-4.9.0 -S foo.C
$ i686-pc-linux-gnu-g++-4.9.0 -S foo.C
foo.C:1:5: error: ISO C++ forbids declaration of 'foo' with no type [-fpermissive]
 foo()
     ^
[Bug debug/60438] [4.9 Regression] dwarf2cfi :2239 still assert,not the same cause as PR 59575
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=60438

--- Comment #25 from Jakub Jelinek jakub at gcc dot gnu.org ---
(In reply to linzj from comment #23)
(In reply to Richard Henderson from comment #19)
Created attachment 32311 [details] proposed patch
Running full tests on this overnight, but it fixes the ICE.

I tried removing the following hunk from your patch; it still compiles Jakub's testcase correctly. I have not run the full tests yet.

diff --git a/gcc/combine-stack-adj.c b/gcc/combine-stack-adj.c
index 69fd5ea..5abec30 100644
--- a/gcc/combine-stack-adj.c
+++ b/gcc/combine-stack-adj.c
@@ -454,6 +454,14 @@ combine_stack_adjustments_for_block (basic_block bb)
            {
              HOST_WIDE_INT this_adjust = INTVAL (XEXP (src, 1));
 
+             /* It's quite tricky to adjust the notes associated
+                with frame related insns.  */
+             if (RTX_FRAME_RELATED_P (insn))
+               {
+                 last2_sp_set = last_sp_set = NULL;
+                 continue;
+               }
+
              /* If we've not seen an adjustment previously, record
                 it now and continue.  */
              if (! last_sp_set)

Perhaps we can handle some of the most common cases of frame related insns (e.g. if both have REG_CFA_ADJUST_CFA notes, etc.). It might be worth running a bootstrap that logs each time the above hunk prevents a merge and appends both insns to some /tmp/ file across the whole bootstrap; then we could see what is common enough to care about.
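The effect of the hunk under discussion can be illustrated with a toy model: adjacent stack-pointer adjustments are merged, but a frame-related adjustment is never merged and also resets the merge candidate. This is a simplified sketch, not the real pass; struct sp_adjust and merge_sp_adjusts are hypothetical names, and the actual code in combine-stack-adj.c additionally tracks memory references and other constraints that are ignored here.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of one stack-pointer adjustment insn.  */
struct sp_adjust
{
  long amount;          /* Bytes added to the stack pointer.  */
  bool frame_related;   /* Models RTX_FRAME_RELATED_P (insn).  */
};

/* Merge runs of adjacent adjustments in place.  A frame-related
   adjustment is copied through untouched and ends the current run,
   mirroring "last2_sp_set = last_sp_set = NULL; continue;".
   Returns the number of adjustments that remain.  */
static size_t
merge_sp_adjusts (struct sp_adjust *a, size_t n)
{
  assert (a != NULL || n == 0);

  size_t out = 0;
  bool can_merge = false;   /* True if a[out - 1] may absorb more.  */

  for (size_t i = 0; i < n; i++)
    {
      if (a[i].frame_related)
        {
          a[out++] = a[i];
          can_merge = false;
        }
      else if (can_merge)
        a[out - 1].amount += a[i].amount;
      else
        {
          a[out++] = a[i];
          can_merge = true;
        }
    }
  return out;
}
```

Removing the frame_related branch from this model corresponds to deleting the hunk: more adjustments merge, but the notes attached to frame-related insns would then need to be kept consistent, which is the tricky part the patch comment refers to.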
[Bug target/60481] [4.9 Regression] Missing diagnostic ISO C++ forbids declaration of 'foo' with no type
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=60481 --- Comment #1 from Jonathan Wakely redi at gcc dot gnu.org --- I think you need -fno-ms-extensions, which may be on by default for mingw
[Bug middle-end/60418] [4.9 Regression] 435.gromacs in SPEC CPU 2006 is miscompiled
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=60418 --- Comment #18 from Jakub Jelinek jakub at gcc dot gnu.org --- Shouldn't we just prefer the original IL if possible? That is not SSA_NAME_VERSION, but not gimple_uid of the stmt definition either. If you have: _4 = something; _5 = somethingelse; _6 = somethingdifferent; _7 = _6 + _4; _8 = _7 + _5; then both SSA_NAME_VERSION and gimple_uid of def_stmt sorting would result in _9 = _4 + _5; _8 = _9 + _6; wouldn't it? But what do we gain by reassociating this (perhaps it can help value numbering and CSE if you have differently ordered sequences, but other than that this seems to be unnecessary reshuffling and, especially for floating point values and -ffast-math, an unnecessary source of extra ulps). So perhaps we want to sort by gimple uid of the first use among the insns we are looking at (and take into account also the operand number)?
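The concern about reassociating floating-point sums can be made concrete with a deliberately exaggerated sketch: the same three operands give different results depending on which pair is added first. The values below are chosen to make the difference obvious through cancellation; the effect in real code such as 435.gromacs would typically be a few ulps rather than a whole unit. The function names are hypothetical, and the volatile temporaries are only there to force each intermediate sum to be rounded to double.

```c
#include <assert.h>

/* Evaluate (a + b) + c, rounding the intermediate sum to double.  */
static double
sum_left (double a, double b, double c)
{
  volatile double t = a + b;
  return t + c;
}

/* Evaluate a + (b + c), rounding the intermediate sum to double.  */
static double
sum_right (double a, double b, double c)
{
  volatile double t = b + c;
  return a + t;
}

/* With a = 1e30, b = -1e30, c = 1.0 the left association cancels
   first and keeps C, while the right association absorbs C into B
   before the cancellation and loses it.  */
static void
demo_reassociation (void)
{
  assert (sum_left (1e30, -1e30, 1.0) == 1.0);
  assert (sum_right (1e30, -1e30, 1.0) == 0.0);
}
```

This is why reordering operands is only permitted for floating point under options like -ffast-math, and why gratuitous reordering is worth avoiding even then.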
[Bug middle-end/60429] [4.7/4.8/4.9 Regression] Miscompilation (aliasing) with -finline-functions
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=60429

--- Comment #18 from Richard Biener rguenth at gcc dot gnu.org ---
Seems to be a PTA issue:

InsertionSort_pETEchase.29_82, points-to vars: { }
InsertionSort_pETEchase.29_86, points-to non-local, points-to escaped, points-to vars: { }
p1_155, points-to NULL, points-to vars: { }

<bb 30>:
InsertionSort_pETEchase.29_82->next = p1_84;
p1_155->next = InsertionSort_pETEchase.28_85;
InsertionSort_pETEchase.29_86 = InsertionSort_pETEchase.28_85->back;
InsertionSort_pETEchase.29_86->next = p1_155;
p1_155->back = InsertionSort_pETEchaseBackTMP;

taking _155 as example

# p1_155 = PHI <_35(57), p1_58(60)>

_35, points-to NULL, points-to vars: { }
p1_58, points-to vars: { }

_35 = MEM[(struct _EdgeTableEntry * *)AET + 16B];

# p1_58 = PHI <p1_84(59), p1_84(30), p1_87(34)>
p1_84 = p1_155->next;
p1_87 = p1_155->next;

so it's a cycle seeded only by MEM[(struct _EdgeTableEntry * *)AET + 16B];

_35 = { NULL } same as AET.128+64
_35 = AET.128+64

# p1_150 = PHI <AET(47), p1_151(49)>
# p1_151 = PHI <p1_19(47), p1_45(49)>
p1_45 = p1_151->next;
fn3_tmp = p1_45;
p1_151->next = p1_47;
p1_151->back = p1_150;

p1_150 = AET.0+96
*p1_150 + 128 = p1_151

Note that valgrind shows loads of errors (with the reduced testcase at least) that show invalid reads and writes even at -O0. So we may just optimistically optimize based on that undefined behavior. At least I can't see anything wrong with what PTA derives ...
[Bug c/55383] -Wcast-qual reports incorrect message
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=55383

Magnus Reftel magnus.reftel at gmail dot com changed:

What |Removed |Added
CC   |        |magnus.reftel at gmail dot com

--- Comment #4 from Magnus Reftel magnus.reftel at gmail dot com ---
Also affects 4.6, 4.8 and trunk as of version 96c7d4b1727c5f9ddcbb02fb69f727a0f2f3572e. 4.4 correctly prints just error: cast discards qualifiers from pointer target type. Did not check with version 4.5.

Since 4.4 had it right, does this count as a 4.6/4.7/4.8/4.9 regression?
[Bug c/55383] -Wcast-qual reports incorrect message
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=55383 --- Comment #5 from Manuel López-Ibáñez manu at gcc dot gnu.org --- (In reply to Magnus Reftel from comment #4) Also affects 4.6, 4.8 and trunk as of version 96c7d4b1727c5f9ddcbb02fb69f727a0f2f3572e. 4.4 correctly prints just error: cast discards qualifiers from pointer target type. Did not check with version 4.5. Since 4.4 had it right, does this count as a 4.6/4.7/4.8/4.9 regression? This just needs someone willing to test the patch in comment #1 and submit it. It is such a trivial patch that I cannot claim any authorship, so please adopt it and get it fixed. If you are fast enough, you may be able to sneak it in GCC 4.9. (The time it took you to do all those tests would have been better spent fixing the bug.)
[Bug c/55383] -Wcast-qual reports incorrect message
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=55383 --- Comment #6 from Magnus Reftel magnus.reftel at gmail dot com --- Sorry, I'm not a GCC developer - just another user afflicted by the bug.
[Bug middle-end/55874] Incorrect warning location for uninitialized variable
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=55874 Manuel López-Ibáñez manu at gcc dot gnu.org changed: What|Removed |Added Keywords|patch | --- Comment #3 from Manuel López-Ibáñez manu at gcc dot gnu.org --- Sorry, changed keyword on the wrong bug!
[Bug fortran/60458] Error message on associate: deferred type parameter and requires either the pointer or allocatable attribute
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=60458 --- Comment #2 from Antony Lewis antony at cosmologist dot info ---
Here's a related example:

module A
  implicit none
  Type T
    integer :: val = 2
  contains
    final :: testfree
  end type
contains
  subroutine testfree(this)
    Type(T) this
    print *,'freed'
  end subroutine
  subroutine Testf()
    associate(X => T())
      print *, X%val
    end associate
    print *,'after scope'
  end subroutine Testf
end module

which gives

  print *, X%val
  1
Error: Symbol 'x' at (1) has no IMPLICIT type

(I was checking if finalization is called correctly, but didn't get that far)

This code compiles in ifort.
[Bug debug/60438] [4.9 Regression] dwarf2cfi :2239 still assert,not the same cause as PR 59575
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=60438 --- Comment #26 from linzj manjian2006 at gmail dot com --- (In reply to Jakub Jelinek from comment #25) Perhaps we can handle some most common cases of frame related insns (e.g. if both have REG_CFA_ADJUST_CFA notes, etc.), perhaps it would be worth it to run a bootstrap which would log when the above hunk prevented some merging and append both insns to some /tmp/ file across whole bootstrap, then we could see what is common enough to care about. Hi Jakub, I removed the hunk because I want some merging to happen, not to prevent it.
[Bug c/55383] -Wcast-qual reports incorrect message
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=55383 Manuel López-Ibáñez manu at gcc dot gnu.org changed: What|Removed |Added Keywords||patch --- Comment #7 from Manuel López-Ibáñez manu at gcc dot gnu.org --- (In reply to Magnus Reftel from comment #6) Sorry, I'm not a GCC developer - just another user afflicted by the bug. Everybody can be a GCC developer. You don't need special powers, just some free time and the willingness to be one. For such a small patch, you don't need any copyright assignment. Just check out svn trunk, set up a bootstrap, test without the patch, apply the patch, bootstrap and test with the patch, and compare the results. (Check the gccfarming script here: http://gcc.gnu.org/wiki/ManuelL%C3%B3pezIb%C3%A1%C3%B1ez for all the details). If the patch does not produce any new FAILs in the testsuite, submit it to gcc-patches with a changelog and ask the reviewer to commit it once accepted. (You don't even need a powerful computer to do all this, just get an account on the compile farm: http://gcc.gnu.org/wiki/CompileFarm )
[Bug middle-end/60429] [4.7/4.8/4.9 Regression] Miscompilation (aliasing) with -finline-functions
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=60429 --- Comment #19 from Markus Trippelsdorf trippels at gcc dot gnu.org --- Yes, looks like the reduced testcase is invalid and contains a few buffer overflows.
[Bug middle-end/60429] [4.7/4.8/4.9 Regression] Miscompilation (aliasing) with -finline-functions
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=60429 --- Comment #20 from Richard Biener rguenth at gcc dot gnu.org ---
As for what Andrew said, yes, the reinterpret_casts look bogus, you should really change

typedef struct _POINTBLOCK {
    int data[200 * sizeof(QPoint)];
    QPoint *pts;
    struct _POINTBLOCK *next;
} POINTBLOCK;

to

typedef struct _POINTBLOCK {
    char data[200 * sizeof(QPoint) * sizeof (int)];
    QPoint *pts;
    struct _POINTBLOCK *next;
} POINTBLOCK;

but that doesn't change the outcome of the testcase. The reduced testcase requiring QtCore is valgrind clean for me.

The cause of the issue _is_ what tree PRE does to the function though.

+Replaced AET.next with prephitmp_4 in pPrevAET_44 = AET.next;

in PolygonRegion, with -O2 -fno-ipa-cp. Still most of the pointers are computed to point to nothing by PTA. Function calls left in that function after inlining are operator delete[], free, operator new, qBadAlloc and malloc calls.

--param max-fields-for-field-sensitive=0 fixes it as well, so it does point at a PTA issue. Still looking ...
[Bug debug/60438] [4.9 Regression] dwarf2cfi :2239 still assert,not the same cause as PR 59575
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=60438 --- Comment #27 from Jakub Jelinek jakub at gcc dot gnu.org --- (In reply to Jakub Jelinek from comment #25) Perhaps we can handle some most common cases of frame related insns (e.g. if both have REG_CFA_ADJUST_CFA notes, etc.), perhaps it would be worth it to run a bootstrap which would log when the above hunk prevented some merging and append both insns to some /tmp/ file across whole bootstrap, then we could see what is common enough to care about. Wonder if we just shouldn't pass the other insn (the one we'd like to delete) to try_apply_stack_adjustment and if either of them is frame related insn, check harder to see if we can handle it or give up if we can't handle it. At least merging of a frame related stack adjustment and following non-frame related one (or vice versa?) is I think very common.
[Bug tree-optimization/60452] [4.8/4.9 Regression] wrong code at -O1 on x86_64-linux-gnu (affecting trunk and 4.8.x)
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=60452 --- Comment #7 from Jakub Jelinek jakub at gcc dot gnu.org --- (In reply to Eric Botcazou from comment #6)
But even if I try:

int a;

__attribute__((noinline, noclone)) void
foo (int *e)
{
  asm volatile ("" : : "r" (e) : "memory");
}

int
main ()
{
  int e[2] = { 0, 0 }, f = 0;
  if (a == 131072)
    f = e[a];
  foo (e);
  return f;
}

where we have:

(mem:SI (plus:DI (reg/f:DI 20 frame) (const_int 524272 [0x7fff0])) [2 e+524288 S4 A128])

instead and thus from MEM_EXPR we perhaps could find out that it is an out of bound access, we still always treat all frame based accesses (whatever the offset is) as non-trapping. So perhaps we need to handle known out of bound MEMs specially when we find that fact out (if we want to emit them into the RTL IL at all), one thing is expansion, another thing if say initially non-constant offset is later CSEd/forwprop etc. into constant out of bound offset. Thoughts?

Again quite an artificial testcase... What about adding a comparison of the offset with the result of get_frame_size if the base register is SFP/HFP/SP?

But what would be safe positive/negative offsets from frame_pointer? I mean, e.g. size of arguments is not included in the frame size, so size of arguments would need to be taken into account too, plus does the middle-end really know all the biases etc.?
[Bug fortran/60458] Error message on associate: deferred type parameter and requires either the pointer or allocatable attribute
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=60458 --- Comment #3 from janus at gcc dot gnu.org --- (In reply to Antony Lewis from comment #2)
Here's a related example:

Though the test case may be loosely related to comment 0, the error is probably not so much related. Reduced version of comment 2:

implicit none
Type T
  integer :: val = 2
end type
associate(X => T())
  print *, X%val
end associate
end

This compiles and runs cleanly with 4.6, but gives errors with 4.7, 4.8 and trunk. I think this should go into a separate PR.
[Bug middle-end/60482] New: Loop optimization regression
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=60482
Bug ID: 60482
Summary: Loop optimization regression
Product: gcc
Version: 4.9.0
Status: UNCONFIRMED
Severity: normal
Priority: P3
Component: middle-end
Assignee: unassigned at gcc dot gnu.org
Reporter: yvan.roux at linaro dot org

Created attachment 32323 -- http://gcc.gnu.org/bugzilla/attachment.cgi?id=32323&action=edit trunk.s

Hi, I didn't have time to investigate further, but I want to quickly raise that the code below was optimized at r204283 by taking into account the trip count information of the loop, and no longer is on trunk (I spotted the issue on AArch64 and x86_64).

code:

typedef double adouble __attribute__ ((__aligned__(16)));

double p1(adouble *x, int n)
{
  double p1_ = 0.0;
  (!(n % 128) == 0) ? __builtin_unreachable() : 1 ;
  for (int i=0; i<n; i++)
    p1_ += x[i] ;
  return p1_ ;
}

compiled with flags : -Ofast -std=c99

x86_64 generated assembly at r204283:

p1:
.LFB0:
        .cfi_startproc
        testl   %esi, %esi
        jle     .L5
        pxor    %xmm1, %xmm1
        shrl    %esi
        xorl    %eax, %eax
.L4:
        movq    %rax, %rdx
        addq    $1, %rax
        salq    $4, %rdx
        cmpl    %eax, %esi
        addpd   (%rdi,%rdx), %xmm1
        ja      .L4
        movapd  %xmm1, %xmm0
        unpckhpd        %xmm1, %xmm1
        addsd   %xmm1, %xmm0
        ret
        .p2align 4,,10
        .p2align 3
.L5:
        pxor    %xmm0, %xmm0
        ret
        .cfi_endproc

X86_64 trunk generated assembly is attached.

Thanks, Yvan
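For readers unfamiliar with the idiom in this report, here is a minimal sketch of how `__builtin_unreachable()` can be used to tell the optimizer about the trip count. The function name `sum_multiple_of_128` is hypothetical, and a plain `double *` is used instead of the reporter's over-aligned typedef, so this only illustrates the hint and is not the exact testcase. Whether a given GCC revision actually exploits the hint is precisely what the PR is about.

```c
/* Sketch only: promise the compiler that n is a multiple of 128.
   Calling this with any other n is undefined behavior.  */
double
sum_multiple_of_128 (const double *x, int n)
{
  /* If n is not a multiple of 128 we claim control never gets here,
     so the optimizer may assume n % 128 == 0 below.  */
  if (n % 128 != 0)
    __builtin_unreachable ();

  double sum = 0.0;
  for (int i = 0; i < n; i++)
    sum += x[i];
  return sum;
}
```

With that assumption available, a vectorizer has no need to emit a scalar epilogue loop for leftover elements, which is the difference the reporter observed between r204283 and trunk.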
[Bug fortran/60458] Error message on associate: deferred type parameter and requires either the pointer or allocatable attribute
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=60458 --- Comment #4 from Antony Lewis antony at cosmologist dot info --- OK, will do. (thought the underlying cause might be same issue with associate variables)
[Bug fortran/60483] New: associate error on valid code: no IMPLICIT type
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=60483
Bug ID: 60483
Summary: associate error on valid code: no IMPLICIT type
Product: gcc
Version: 4.9.0
Status: UNCONFIRMED
Severity: normal
Priority: P3
Component: fortran
Assignee: unassigned at gcc dot gnu.org
Reporter: antony at cosmologist dot info

module A
  implicit none
  Type T
    integer :: val = 2
  contains
    final :: testfree
  end type
contains
  subroutine testfree(this)
    Type(T) this
    print *,'freed'
  end subroutine
  subroutine Testf()
    associate(X => T())
      print *, X%val
    end associate
    print *,'after scope'
  end subroutine Testf
end module

which gives

  print *, X%val
  1
Error: Symbol 'x' at (1) has no IMPLICIT type

(code checks if finalization is called correctly, but didn't get that far)

This code compiles in ifort.
[Bug tree-optimization/60452] [4.8/4.9 Regression] wrong code at -O1 on x86_64-linux-gnu (affecting trunk and 4.8.x)
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=60452 --- Comment #8 from Eric Botcazou ebotcazou at gcc dot gnu.org --- But what would be safe positive/negative offsets from frame_pointer? I mean, e.g. size of arguments is not included in the frame size, so size of arguments would need to be taken into account too, plus does the middle-end really know all the biases etc.? No, that would be either conservative or not bullet-proof, at least if used alone. Maybe compare MEM_OFFSET and get_frame_size and return true if the former is larger than the latter. Why do we drop the MEM_EXPR if the DECL_RTL is a reg?
[Bug ada/60411] ADA bootstrap failure on ARM
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=60411 Bernd Edlinger bernd.edlinger at hotmail dot de changed:
What|Removed |Added
Status|UNCONFIRMED |RESOLVED
Resolution|--- |FIXED
--- Comment #6 from Bernd Edlinger bernd.edlinger at hotmail dot de ---
OK, the cross build for arm-linux-gnueabihf succeeds again. So I will close this tracker now. Thanks,

BUT if I look at these lines in gcc/ada/gcc-interface/Makefile.in:

# ARM android
ifeq ($(strip $(filter-out arm% linux-androideabi,$(target_cpu) $(target_os))),)
  LIBGNAT_TARGET_PAIRS = \
  a-intnam.ads<a-intnam-linux.ads \
  s-inmaop.adb<s-inmaop-posix.adb \
  s-intman.adb<s-intman-posix.adb \
  s-linux.ads<s-linux.ads \
  s-osinte.adb<s-osinte-android.adb \
  s-osinte.ads<s-osinte-android.ads \
  s-osprim.adb<s-osprim-posix.adb \
  s-taprop.adb<s-taprop-posix.adb \
  s-taspri.ads<s-taspri-posix-noaltstack.ads \
  s-tpopsp.adb<s-tpopsp-posix-foreign.adb \
  system.ads<system-linux-armel.ads \
  $(DUMMY_SOCKETS_TARGET_PAIRS)

  TOOLS_TARGET_PAIRS = \
    mlib-tgt-specific.adb<mlib-tgt-specific-linux.adb \
    indepsw.adb<indepsw-gnu.adb

  GNATRTL_SOCKETS_OBJS =
  EXTRA_GNATRTL_TASKING_OBJS=s-linux.o
  EH_MECHANISM=
  THREADSLIB =
  GNATLIB_SHARED = gnatlib-shared-dual
  LIBRARY_VERSION := $(LIB_VERSION)
endif

So they use the same system.ads, which now links with a-exexpr-gcc.adb; shouldn't this target now also use EH_MECHANISM=-gcc or -arm?
[Bug c/60484] New: -fdump-rtl-expand and attribute optimize gives incorrect dump file path
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=60484
Bug ID: 60484
Summary: -fdump-rtl-expand and attribute optimize gives incorrect dump file path
Product: gcc
Version: 4.8.2
Status: UNCONFIRMED
Severity: normal
Priority: P3
Component: c
Assignee: unassigned at gcc dot gnu.org
Reporter: secondary.mail7865220 at gmail dot com

Created attachment 32324 -- http://gcc.gnu.org/bugzilla/attachment.cgi?id=32324&action=edit test.c - Source to trigger the bug

To trigger this bug, three conditions must be met:
- At least one function must be annotated with __attribute__((optimize)).
- The object file is placed in a sub-directory to where the source file is located.
- The flag -fdump-rtl-expand is used.

The path to the directory where the dump file is supposed to be saved is prepended the same number of times as there are functions with attribute optimize in the source C file.

Compiler output:

$ gcc -std=c99 -fdump-rtl-expand -o objs/test.o -c test.c
test.c: In function ‘Optimized_1’:
test.c:3:1: error: could not open dump file ‘objs/objs/objs/test.c.166r.expand’: No such file or directory
 Optimized_1(int arg)
 ^
test.c: In function ‘Optimized_2’:
test.c:10:1: error: could not open dump file ‘objs/objs/objs/test.c.166r.expand’: No such file or directory
 Optimized_2(int arg)
 ^
test.c: In function ‘main’:
test.c:15:5: error: could not open dump file ‘objs/objs/objs/test.c.166r.expand’: No such file or directory
 int main()
     ^

Compiler version:

$ gcc -v
Using built-in specs.
COLLECT_GCC=gcc
COLLECT_LTO_WRAPPER=/usr/local/lib/gcc/x86_64-unknown-linux-gnu/4.8.2/lto-wrapper
Target: x86_64-unknown-linux-gnu
Configured with: ./configure
Thread model: posix
gcc version 4.8.2 (GCC)

System type:

$ uname -a
Linux jf-linux 3.4.63-2.44-desktop #1 SMP PREEMPT Wed Oct 2 11:18:32 UTC 2013 (d91a619) x86_64 x86_64 x86_64 GNU/Linux
[Bug middle-end/60482] Loop optimization regression
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=60482 Jakub Jelinek jakub at gcc dot gnu.org changed: What|Removed |Added Status|UNCONFIRMED |NEW Last reconfirmed||2014-03-10 CC||amker at gcc dot gnu.org, ||jakub at gcc dot gnu.org Ever confirmed|0 |1 --- Comment #1 from Jakub Jelinek jakub at gcc dot gnu.org --- Started to be optimized probably with r204255, is not optimized anymore again starting with r208165.
[Bug debug/60438] [4.9 Regression] dwarf2cfi :2239 still assert,not the same cause as PR 59575
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=60438 --- Comment #28 from linzj manjian2006 at gmail dot com --- (In reply to Jakub Jelinek from comment #27) Wonder if we just shouldn't pass the other insn (the one we'd like to delete) to try_apply_stack_adjustment and if either of them is frame related insn, check harder to see if we can handle it or give up if we can't handle it. At least merging of a frame related stack adjustment and following non-frame related one (or vice versa?) is I think very common. I think the final solution will have to satisfy dwarf2cfi, csa and jump2 at the same time. That is, csa is able to combine the stack operations, jump2 is able to find the common insns at the ends of blocks, and dwarf2cfi is able to get the correct input data.
[Bug middle-end/60429] [4.7/4.8/4.9 Regression] Miscompilation (aliasing) with -finline-functions
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=60429 --- Comment #21 from Richard Biener rguenth at gcc dot gnu.org ---
As far as I can understand the reduced testcase, AET is never written to anything but the initial NULL pointers. Neither CreateETandAET nor loadAET do anything to the PolygonRegion local AET.

I have a fix (bah, this function needs a LOT of TLC!)

Index: gcc/tree-ssa-structalias.c
===
--- gcc/tree-ssa-structalias.c (revision 208448)
+++ gcc/tree-ssa-structalias.c (working copy)
@@ -3218,7 +3218,12 @@ get_constraint_for_component_ref (tree t
            {
              cexpr.var = curr->id;
              results->safe_push (cexpr);
-             if (address_p)
+             /* If we take the address and the field starts exactly
+                at the desired position that was all we need to add.  */
+             if (address_p
+                 && curr->offset == (unsigned HOST_WIDE_INT) bitpos
+                 && bitmaxsize != -1
+                 && bitsize == bitmaxsize)
                break;
            }
        }
Re: [Bug ada/60411] ADA bootstrap failure on ARM
the cross build for arm-linux-gnueabihf succeeds again. Great. So they use the same system.ads, which now links with a-exexpr-gcc.adb; shouldn't this target now also use EH_MECHANISM=-gcc or -arm? Yes, android should also use EH_MECHANISM=-arm. I'll make that change.
[Bug ada/60411] ADA bootstrap failure on ARM
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=60411 --- Comment #7 from charlet at adacore dot com charlet at adacore dot com --- the cross build for arm-linux-gnueabihf succeeds again. Great. So they use the same system.ads, which now links with a-exexpr-gcc.adb; shouldn't this target now also use EH_MECHANISM=-gcc or -arm? Yes, android should also use EH_MECHANISM=-arm. I'll make that change.
[Bug middle-end/60418] [4.9 Regression] 435.gromacs in SPEC CPU 2006 is miscompiled
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=60418 --- Comment #19 from rguenther at suse dot de rguenther at suse dot de ---
On Mon, 10 Mar 2014, jakub at gcc dot gnu.org wrote:

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=60418 --- Comment #18 from Jakub Jelinek jakub at gcc dot gnu.org ---
Shouldn't we just prefer the original IL if possible? That is not SSA_NAME_VERSION, but not gimple_uid of the stmt definition either. If you have:

  _4 = something;
  _5 = somethingelse;
  _6 = somethingdifferent;
  _7 = _6 + _4;
  _8 = _7 + _5;

then both SSA_NAME_VERSION and gimple_uid of def_stmt sorting would result in

  _9 = _4 + _5;
  _8 = _9 + _6;

wouldn't it? But what do we gain by reassociating this (perhaps it can help value numbering and CSE if you have differently ordered sequences, but other than that this seems to be unnecessary reshuffling and, especially for floating point values and -ffast-math, an unnecessary source of extra ulps). So perhaps we want to sort by gimple uid of the first use among the insns we are looking at (and take into account also the operand number)?

Yes, it wants to canonicalize to get more CSE as followup. So sorting after SSA_NAME_VERSION (or its definition UID) does make sense ... Looking at the first use is also possible, but what _is_ the first use?

  _4 = something
  if (foo)
    _5 = _4 + 1;
  else
    _6 = _4 + 2;
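To make the "extra ulps" concern above concrete, here is a small C sketch showing that floating-point addition is not associative, so the order chosen by reassociation can change the result. The function names `sum_left` and `sum_right` are hypothetical, and the sketch assumes IEEE 754 double arithmetic compiled without -ffast-math, so that the compiler keeps the written association.

```c
/* Sketch: the same three operands summed in two association orders.  */

/* (a + b) + c */
double
sum_left (double a, double b, double c)
{
  return (a + b) + c;
}

/* a + (b + c) */
double
sum_right (double a, double b, double c)
{
  return a + (b + c);
}
```

For example, with a = 1e30, b = -1e30 and c = 1.0, the left association cancels the large terms first and keeps c, while the right association absorbs c into b before the cancellation, so the two functions return different values. This is why a pass that reorders operands is only allowed to do so for floating point under flags such as -ffast-math.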
[Bug c++/60474] [4.9 Regression] Crash in tree_class_check
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=60474 --- Comment #6 from Richard Biener rguenth at gcc dot gnu.org ---
Author: rguenth
Date: Mon Mar 10 13:27:16 2014
New Revision: 208451

URL: http://gcc.gnu.org/viewcvs?rev=208451&root=gcc&view=rev
Log:
2014-03-10 Richard Biener rguent...@suse.de

    PR middle-end/60474
    * tree.c (signed_or_unsigned_type_for): Handle OFFSET_TYPEs.

    * g++.dg/torture/pr60474.C: New testcase.

Added:
    trunk/gcc/testsuite/g++.dg/torture/pr60474.C
Modified:
    trunk/gcc/ChangeLog
    trunk/gcc/testsuite/ChangeLog
    trunk/gcc/tree.c
[Bug c++/60474] [4.9 Regression] Crash in tree_class_check
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=60474 Jakub Jelinek jakub at gcc dot gnu.org changed: What|Removed |Added Status|ASSIGNED|RESOLVED CC||jakub at gcc dot gnu.org Resolution|--- |FIXED --- Comment #7 from Jakub Jelinek jakub at gcc dot gnu.org --- Fixed.
[Bug target/60481] [4.9 Regression] Missing diagnostic ISO C++ forbids declaration of 'foo' with no type
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=60481 --- Comment #2 from Dmitry Gorbachev d.g.gorbachev at gmail dot com --- Yes, it seems that it is on (there is an error with -fno-ms-extensions), but: $ i686-w64-mingw32-g++-4.9.0 -Q --help=c++ | grep ms-ext -fms-extensions [disabled]
[Bug tree-optimization/60485] field-sensitive points-to confused by pointer offsetting
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=60485 Richard Biener rguenth at gcc dot gnu.org changed: What|Removed |Added Keywords||wrong-code Status|UNCONFIRMED |ASSIGNED Last reconfirmed||2014-03-10 Assignee|unassigned at gcc dot gnu.org |rguenth at gcc dot gnu.org Ever confirmed|0 |1 --- Comment #1 from Richard Biener rguenth at gcc dot gnu.org --- Mine.
[Bug tree-optimization/60485] New: field-sensitive points-to confused by pointer offsetting
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=60485
Bug ID: 60485
Summary: field-sensitive points-to confused by pointer offsetting
Product: gcc
Version: 4.9.0
Status: UNCONFIRMED
Severity: normal
Priority: P3
Component: tree-optimization
Assignee: unassigned at gcc dot gnu.org
Reporter: rguenth at gcc dot gnu.org

extern void abort (void);

struct S
{
  int *i[4];
  int *p1;
  int *p2;
  int *p3;
  int *p4;
};

int **b;

int main()
{
  int i = 1;
  struct S s;
  s.p3 = &i;
  int **p;
  if (b)
    p = b;
  else
    p = &s.i[2];
  p += 4;
  if (!b)
    **p = 0;
  if (i != 0)
    abort ();
  return i;
}
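The testcase above relies on the pointer `p` starting at the address of `s.i[2]` (as the `int **` type of `p` requires) and, after `p += 4`, landing on the field `p3`. The following sketch checks that layout arithmetic with `offsetof` instead of stepping a pointer past the array. The helper names `offset_after_adjust` and `lands_on_p3` are hypothetical, and the result assumes an ABI that inserts no padding between pointer-typed members, which is the common case.

```c
#include <stddef.h>

/* Same member layout as the struct in the report.  */
struct S
{
  int *i[4];
  int *p1;
  int *p2;
  int *p3;
  int *p4;
};

/* Byte offset inside struct S reached by starting at element i[2]
   and advancing by four pointer-sized slots.  */
size_t
offset_after_adjust (void)
{
  return offsetof (struct S, i) + (2 + 4) * sizeof (int *);
}

/* Returns 1 when that offset is exactly where field p3 starts.  */
int
lands_on_p3 (void)
{
  return offset_after_adjust () == offsetof (struct S, p3);
}
```

A field-sensitive points-to analysis that only tracks the sub-field `i` where the pointer started would miss that the adjusted pointer can reach `p3`, which appears to be the confusion the summary describes.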
[Bug middle-end/60482] Loop optimization regression
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=60482 Jakub Jelinek jakub at gcc dot gnu.org changed:
What|Removed |Added
Status|NEW |ASSIGNED
Assignee|unassigned at gcc dot gnu.org |jakub at gcc dot gnu.org
--- Comment #2 from Jakub Jelinek jakub at gcc dot gnu.org ---
Created attachment 32325 -- http://gcc.gnu.org/bugzilla/attachment.cgi?id=32325&action=edit gcc49-pr60482.patch

Untested fix. Apparently the problem is that we add an ASSERT_EXPR right before the __builtin_unreachable, even when it obviously isn't needed there. The comment talks about pre-4.4 FOUND_IN_SUBGRAPH stuff, at least right now with live_on_edge that function returns false in this case.
[Bug ipa/60457] [4.9 Regression] ICE in cgraph_get_node
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=60457 --- Comment #4 from Jakub Jelinek jakub at gcc dot gnu.org ---
Author: jakub
Date: Mon Mar 10 14:55:20 2014
New Revision: 208454

URL: http://gcc.gnu.org/viewcvs?rev=208454&root=gcc&view=rev
Log:
    PR ipa/60457
    * ipa.c (symtab_remove_unreachable_nodes): Don't call
    cgraph_get_create_node on VAR_DECLs.

    * g++.dg/ipa/pr60457.C: New test.

Added:
    trunk/gcc/testsuite/g++.dg/ipa/pr60457.C
Modified:
    trunk/gcc/ChangeLog
    trunk/gcc/ipa.c
    trunk/gcc/testsuite/ChangeLog
[Bug libgcc/60464] [arm] ARM -mthumb version of libgcc contains ARM (non-thumb) code; not safe for thumb-only architectures
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=60464 --- Comment #8 from Richard Earnshaw rearnsha at gcc dot gnu.org --- (In reply to Jeremy Cooper from comment #7) Is there a reason these were commented out? Is the armv7 multilib unstable? Volume of variants that have to be compiled at build time. Each enabled entry practically doubles the build time for the libraries.
[Bug ipa/60457] [4.9 Regression] ICE in cgraph_get_node
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=60457 Jakub Jelinek jakub at gcc dot gnu.org changed: What|Removed |Added Status|NEW |RESOLVED Resolution|--- |FIXED --- Comment #5 from Jakub Jelinek jakub at gcc dot gnu.org --- Fixed.
[Bug other/60486] New: [avr] missed optimization on detecting zero flag set
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=60486
Bug ID: 60486
Summary: [avr] missed optimization on detecting zero flag set
Product: gcc
Version: 4.8.0
Status: UNCONFIRMED
Severity: normal
Priority: P3
Component: other
Assignee: unassigned at gcc dot gnu.org
Reporter: darryl.piper at gmail dot com

Missed optimization when detecting that a decremented variable has reached zero.

int main(uint16_t, uint16_t );

int main(uint16_t x, uint16_t y)
{
  uint16_t z = x;
  while (x > y) {
    if ( --z == 0 )
      return 1;
    x++;
  }
  return 0;
}

This produces the code below with gcc 4.8.0 and 4.8.1 (and I expect 4.8.2 as well) when compiled with -Os. The code at 0x82 to 0x8a uses an explicit compare against zero, even though the subi and sbc already leave the zero flag set on an atmega8.

  7a:  9c 01  movw  r18, r24
  7c:  68 17  cp    r22, r24
  7e:  79 07  cpc   r23, r25
  80:  38 f4  brcc  .+14       ; 0x90 <main+0x16>
  82:  21 50  subi  r18, 0x01  ; 1
  84:  31 09  sbc   r19, r1
  86:  21 15  cp    r18, r1
  88:  31 05  cpc   r19, r1
  8a:  29 f0  breq  .+10       ; 0x96 <main+0x1c>
  8c:  01 96  adiw  r24, 0x01  ; 1
  8e:  f6 cf  rjmp  .-20       ; 0x7c <main+0x2>
  90:  80 e0  ldi   r24, 0x00  ; 0
  92:  90 e0  ldi   r25, 0x00  ; 0
  94:  08 95  ret
  96:  81 e0  ldi   r24, 0x01  ; 1
  98:  90 e0  ldi   r25, 0x00  ; 0
  9a:  08 95  ret

In gcc/config/avr/avr.md the code for (define_insn "add<mode>3" no longer says that it alters the zero or negative flag, which the 4.7.2 branch did, depending on whether it chose add/adc or sub/sbc.
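The reporter's claim is that after a subi/sbc pair the zero flag already describes the whole 16-bit result, so the following cp/cpc is redundant. Below is a rough C model of that flag behavior, offered as a sketch only. It assumes the usual AVR semantics: SUBI sets Z when its 8-bit result is zero, while SBC keeps the previous Z only if its own result is also zero and clears it otherwise. The type `struct flags` and the functions `subi`, `sbc` and `dec16_sets_zero` are hypothetical names for this illustration and model only the C and Z flags.

```c
#include <stdint.h>

/* Only the two status flags relevant here.  */
struct flags
{
  unsigned carry : 1;
  unsigned zero : 1;
};

/* Model of SUBI Rd, K: Z reflects this byte alone.  */
uint8_t
subi (uint8_t rd, uint8_t k, struct flags *f)
{
  uint8_t r = (uint8_t) (rd - k);
  f->carry = k > rd;            /* borrow out of the low byte */
  f->zero = (r == 0);
  return r;
}

/* Model of SBC Rd, Rr: Z survives only if this byte is zero too.  */
uint8_t
sbc (uint8_t rd, uint8_t rr, struct flags *f)
{
  unsigned borrow = f->carry;
  uint8_t r = (uint8_t) (rd - rr - borrow);
  f->carry = (unsigned) rr + borrow > rd;
  f->zero = f->zero && (r == 0);
  return r;
}

/* Decrement a 16-bit value the way the listing does (subi low byte,
   sbc high byte with zero) and return the resulting Z flag.  */
int
dec16_sets_zero (uint16_t *z)
{
  struct flags f = { 0, 0 };
  uint8_t lo = subi ((uint8_t) (*z & 0xff), 1, &f);
  uint8_t hi = sbc ((uint8_t) (*z >> 8), 0, &f);
  *z = (uint16_t) ((hi << 8) | lo);
  return f.zero;
}
```

In this model the returned flag is set exactly when the decremented 16-bit value is zero, which is the information the extra cp/cpc pair at 0x86 and 0x88 recomputes.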
[Bug c/55383] -Wcast-qual reports incorrect message
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=55383 Gerald Pfeifer gerald at pfeifer dot com changed: What|Removed |Added Status|NEW |ASSIGNED --- Comment #8 from Gerald Pfeifer gerald at pfeifer dot com --- I'll see what I can do.
[Bug rtl-optimization/60452] [4.8/4.9 Regression] wrong code at -O1 on x86_64-linux-gnu (affecting trunk and 4.8.x)
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=60452 Jakub Jelinek jakub at gcc dot gnu.org changed:
What|Removed |Added
Component|tree-optimization |rtl-optimization
--- Comment #9 from Jakub Jelinek jakub at gcc dot gnu.org ---
expand_expr_real_1 for the ARRAY_REF in question does:

        /* If we have either an offset, a BLKmode result, or a reference
           outside the underlying object, we must force it to memory.
           Such a case can occur in Ada if we have unchecked conversion
           of an expression from a scalar type to an aggregate type or
           for an ARRAY_RANGE_REF whose type is BLKmode, or if we were
           passed a partially uninitialized object or a view-conversion
           to a larger size.  */
        must_force_mem = (offset
                          || mode1 == BLKmode
                          || bitpos + bitsize > GET_MODE_BITSIZE (mode2));
...
        /* Otherwise, if this is a constant or the object is not in memory
           and need be, put it there.  */
        else if (CONSTANT_P (op0) || (!MEM_P (op0) && must_force_mem))
          {
            tree nt = build_qualified_type (TREE_TYPE (tem),
                                            (TYPE_QUALS (TREE_TYPE (tem))
                                             | TYPE_QUAL_CONST));
            memloc = assign_temp (nt, 1, 1);
            emit_move_insn (memloc, op0);
            op0 = memloc;
            mem_attrs_from_type = true;
          }

op0 is DECL_RTL of the array, (reg/v:DI 85 [ e ]), bitpos is 0x40, bitsize is 32, mode2 is DImode. Not sure if it is safe to set MEM_EXPR (etc. on assign_temp result, if it is, we could do that. Note that MEM_NOTRAP_P is set on it (I assume it is fine too, because we should consider it only when not out of bound access). In any case, as the #c3 testcase shows, even when we do have MEM_EXPR and could see that it is out of bound, we don't use that info at all.
[Bug c++/53492] [4.7/4.8/4.9 Regression] ICE in retrieve_specialization, at cp/pt.c:985
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=53492 --- Comment #5 from Jason Merrill jason at gcc dot gnu.org ---
Author: jason
Date: Mon Mar 10 15:44:50 2014
New Revision: 208455

URL: http://gcc.gnu.org/viewcvs?rev=208455&root=gcc&view=rev
Log:
    PR c++/53492
    * parser.c (cp_parser_class_head): Also check PRIMARY_TEMPLATE_P
    when deciding whether to call push_template_decl for a member class.
    * pt.c (push_template_decl_real): Return after wrong levels error.

Added:
    trunk/gcc/testsuite/g++.dg/template/memtmpl4.C
Modified:
    trunk/gcc/cp/ChangeLog
    trunk/gcc/cp/parser.c
    trunk/gcc/cp/pt.c
[Bug middle-end/60418] [4.9 Regression] 435.gromacs in SPEC CPU 2006 is miscompiled
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=60418 --- Comment #20 from H.J. Lu hjl.tools at gmail dot com --- (In reply to Richard Biener from comment #13)
Huh, adding a pre-header should _never_ do sth like that. Can you produce a small testcase that exhibits these kind of changes with adding/removing a preheader?

copyprop2 pass removed a preheader and cunrolli pass added it back:

  <bb 3>:
  # n_213 = PHI <1(2)>

  <bb 8>:
  # n_8 = PHI <n_213(3), n_218(9)>

copyprop3 pass optimized it to

  <bb 3>:
  n_213 = 1;

  <bb 4>:
  # n_8 = PHI <1(3), n_218(7)>

Then the unused n_213 disappeared in reassoc1 pass and n_213 was put on FREE_SSANAMES.
[Bug testsuite/60487] New: FAIL: gcc.dg/tree-prof/crossmodule-indircall-1a.c compilation, -fprofile-generate -D_PROFILE_GENERATE
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=60487
Bug ID: 60487
Summary: FAIL: gcc.dg/tree-prof/crossmodule-indircall-1a.c compilation, -fprofile-generate -D_PROFILE_GENERATE
Product: gcc
Version: 4.9.0
Status: UNCONFIRMED
Severity: normal
Priority: P3
Component: testsuite
Assignee: unassigned at gcc dot gnu.org
Reporter: danglin at gcc dot gnu.org
Host: hppa2.0w-hp-hpux11.11
Target: hppa2.0w-hp-hpux11.11
Build: hppa2.0w-hp-hpux11.11

spawn /test/gnu/gcc/objdir/gcc/xgcc -B/test/gnu/gcc/objdir/gcc/ /test/gnu/gcc/gcc/gcc/testsuite/gcc.dg/tree-prof/crossmodule-indircall-1a.c -fno-diagnostics-show-caret -fdiagnostics-color=never /test/gnu/gcc/gcc/gcc/testsuite/gcc.dg/tree-prof/crossmodule-indircall-1a.c -fprofile-generate -D_PROFILE_GENERATE -lm -o /test/gnu/gcc/objdir/gcc/testsuite/gcc/crossmodule-indircall-1a.x01
/usr/ccs/bin/ld: Duplicate symbol main in files /var/tmp//ccFgQFKi.o and /var/tmp//ccQoAKFt.o
/usr/ccs/bin/ld: Duplicate symbol global constructors keyed to 65535_1_main in files /var/tmp//ccFgQFKi.o and /var/tmp//ccQoAKFt.o
/usr/ccs/bin/ld: Found 2 duplicate symbol(s)
collect2: error: ld returned 1 exit status
compiler exited with status 1

Test probably needs to check visibility.
[Bug c/60488] New: missing -Wmaybe-uninitialized on a conditional with goto
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=60488
Bug ID: 60488
Summary: missing -Wmaybe-uninitialized on a conditional with goto
Product: gcc
Version: 4.8.2
Status: UNCONFIRMED
Severity: normal
Priority: P3
Component: c
Assignee: unassigned at gcc dot gnu.org
Reporter: msebor at gmail dot com

The -Wmaybe-uninitialized option is documented like so:

For an automatic variable, if there exists a path from the function entry to a use of the variable that is initialized, but there exist some other paths for which the variable is not initialized, the compiler emits a warning if it cannot prove the uninitialized paths are not executed at run time.

In the program below, when f(&a) returns zero, the variable b is considered to have been initialized by the call to f(&b) when it's used as the argument in the first call to g(b). However, when f(&a) returns non-zero, the variable b is used uninitialized in the second call to g(b). Therefore, there exists a path through the function where b is used initialized as well as one where it's used uninitialized. Thus, GCC should issue a warning. It, however, does not.

$ cat t.c && gcc -O2 -Wuninitialized -Wmaybe-uninitialized -c -o/dev/null t.c
int f (int**);
void g (int*);

int foo (void)
{
  int *a, *b;

  if (f (&a) || f (&b))
    goto end;

  g (a);
  g (b);
  return 0;

end:
  g (b);
  return 1;
}
[Bug c++/58678] [4.9 Regression] pykde4-4.11.2 link error (devirtualization too trigger happy)
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=58678 Jason Merrill jason at gcc dot gnu.org changed: What|Removed |Added Assignee|jason at gcc dot gnu.org |hubicka at gcc dot gnu.org --- Comment #30 from Jason Merrill jason at gcc dot gnu.org --- Honza was going to make some adjustments to my patch.
[Bug tree-optimization/59121] [4.8/4.9 Regression] endless loop with -O2 -floop-parallelize-all
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=59121 --- Comment #14 from Mircea Namolaru mircea.namolaru at inria dot fr --- Confirmed. Starting to look at it. This test also enters an endless loop with the options -fgraphite-identity -floop-nest-optimize -O2 -c.
[Bug middle-end/60418] [4.9 Regression] 435.gromacs in SPEC CPU 2006 is miscompiled
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=60418 --- Comment #21 from Jakub Jelinek jakub at gcc dot gnu.org ---
Can you try if sorting on gimple_uid would help this or not? I think it would be something like:

--- gcc/tree-ssa-reassoc.c.jj 2014-02-19 06:59:35.0 +0100
+++ gcc/tree-ssa-reassoc.c 2014-03-10 17:26:06.707683626 +0100
@@ -506,11 +506,17 @@ sort_by_operand_rank (const void *pa, co
     }
 
   /* Lastly, make sure the versions that are the same go next to each
-     other.  We use SSA_NAME_VERSION because it's stable.  */
+     other.  Prefer gimple_uid of def stmt, fall back to SSA_NAME_VERSION
+     if more stmts have the same uid.  */
   if ((oeb->rank - oea->rank == 0)
       && TREE_CODE (oea->op) == SSA_NAME
       && TREE_CODE (oeb->op) == SSA_NAME)
     {
+      unsigned int uida = gimple_uid (SSA_NAME_DEF_STMT (oea->op));
+      unsigned int uidb = gimple_uid (SSA_NAME_DEF_STMT (oeb->op));
+      if (uida && uidb && uida != uidb)
+        return uidb - uida;
+
       if (SSA_NAME_VERSION (oeb->op) != SSA_NAME_VERSION (oea->op))
         return SSA_NAME_VERSION (oeb->op) - SSA_NAME_VERSION (oea->op);
       else

(make check RUNTESTFLAGS='--target_board=unix\{-m32,-m64\} dg.exp=*reassoc* tree-ssa.exp=*reassoc*' with it still passes, haven't tested it more than that).
[Bug tree-optimization/59121] [4.8/4.9 Regression] endless loop with -O2 -floop-parallelize-all
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=59121 --- Comment #15 from Jeffrey A. Law law at redhat dot com --- Mircea, thanks. I'm definitely looking forward to seeing Graphite in a better state! With you on board at INRIA and working on Graphite, I will not be calling for Graphite's removal after the 4.9 release. Thanks again, jeff
[Bug libstdc++/60489] New: Document which functions can be recursively reentered
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=60489

            Bug ID: 60489
           Summary: Document which functions can be recursively reentered
           Product: gcc
           Version: 4.9.0
            Status: UNCONFIRMED
          Keywords: documentation
          Severity: normal
          Priority: P3
         Component: libstdc++
          Assignee: unassigned at gcc dot gnu.org
          Reporter: redi at gcc dot gnu.org

The standard says:

  17.6.5.8 Reentrancy [reentrancy]
  Except where explicitly specified in this standard, it is implementation-defined which functions in the Standard C++ library may be recursively reentered.

Our docs on implementation-defined properties (with the C++03 section number) say:

  [17.4.4.5] Non-reentrant functions are probably best discussed in the various sections on multithreading (see above).

While that may be true, (1) the sections on multithreading are not above and (2) they don't say anything about reentrancy.

This affects whether, for example, an element being erased from a container during a call to clear() can call clear() on the container again, see http://stackoverflow.com/q/20755194/981959 (we probably *could* make that work if we wanted to, but it would require more work to support a very uncommon case).

I think the simplest solution is to document that for our implementation no functions are reentrant unless specified otherwise, then specify otherwise later for particular functions.
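To make the clear() scenario concrete, here is a minimal sketch of user code that avoids depending on the answer. It says nothing about what libstdc++ does when clear() is re-entered: `Registry`, `Node` and `blocked_reentries` are hypothetical names, and the guard flag is just one possible way for a caller to keep a destructor callback from reaching std::vector::clear() a second time.

```cpp
#include <cstddef>
#include <memory>
#include <vector>

class Registry;

// Hypothetical element whose destructor calls back into its owner.
struct Node {
    explicit Node(Registry *owner) : owner_(owner) {}
    ~Node();
    Registry *owner_;
};

// Hypothetical wrapper that refuses nested clear() calls, so the
// underlying std::vector::clear() is never recursively reentered.
class Registry {
public:
    ~Registry() { clear(); }  // empty the vector while the guard still works

    // Elements are held by pointer so that growing the vector never destroys a Node.
    void add() { nodes_.push_back(std::unique_ptr<Node>(new Node(this))); }

    void clear() {
        if (clearing_) {      // called from ~Node() during an outer clear()
            ++blocked_reentries_;
            return;
        }
        clearing_ = true;
        nodes_.clear();       // runs ~Node() for every element
        clearing_ = false;
    }

    std::size_t size() const { return nodes_.size(); }
    int blocked_reentries() const { return blocked_reentries_; }

private:
    std::vector<std::unique_ptr<Node>> nodes_;
    bool clearing_ = false;
    int blocked_reentries_ = 0;
};

Node::~Node() { owner_->clear(); }
```

After three add() calls, one clear() empties the registry and the guard records three blocked nested calls, one per destroyed element.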
[Bug middle-end/60418] [4.9 Regression] 435.gromacs in SPEC CPU 2006 is miscompiled
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=60418

--- Comment #22 from H.J. Lu hjl.tools at gmail dot com ---
(In reply to Jakub Jelinek from comment #21)
> Can you try if sorting on gimple_uid would help this or not?
> I think it would be something like:

Yes, it works.
[Bug tree-optimization/59025] [4.9 Regression] Revision 203979 causes failure in CPU2006 benchmark 435.gromacs
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=59025 --- Comment #9 from Jakub Jelinek jakub at gcc dot gnu.org --- Can you please try the http://gcc.gnu.org/bugzilla/show_bug.cgi?id=60418#c21 patch?
[Bug other/60486] [avr] missed optimization on detecting zero flag set
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=60486

Georg-Johann Lay gjl at gcc dot gnu.org changed:

           What    |Removed                   |Added
                 CC|                          |gjl at gcc dot gnu.org

--- Comment #1 from Georg-Johann Lay gjl at gcc dot gnu.org ---
Created attachment 32326
  --> http://gcc.gnu.org/bugzilla/attachment.cgi?id=32326&action=edit
C test case

Here is a valid test case. Compiled with avr-gcc 4.8.2

    $ avr-gcc pr60486.c -S -Os -mmcu=atmega8 -dp

we get:

pr60486:
    movw r18,r24         ;  5   *movhi/1          [length = 1]
.L2:
    cp r22,r24           ;  21  *cmphi/3          [length = 2]
    cpc r23,r25
    brsh .L7             ;  22  branch            [length = 1]
    subi r18,1           ;  13  addhi3_clobber/2  [length = 2]
    sbc r19,__zero_reg__
    cp r18,__zero_reg__  ;  14  *cmphi/2          [length = 2]
    cpc r19,__zero_reg__
    breq .L5             ;  15  branch            [length = 1]
    adiw r24,1           ;  17  addhi3_clobber/1  [length = 1]
    rjmp .L2             ;  55  jump              [length = 1]
.L7:
    ldi r24,0            ;  7   *movhi/2          [length = 2]
    ldi r25,0
    ret                  ;  49  return            [length = 1]
.L5:
    ldi r24,lo8(1)       ;  6   *movhi/5          [length = 2]
    ldi r25,0
    ret                  ;  51  return            [length = 1]

The superfluous insn is #14 (*cmphi).

This worked with 4.7 with the output as follows:

pr60486:
    movw r18,r24         ;  7   *movhi/1          [length = 1]
    rjmp .L2             ;  48  jump              [length = 1]
.L4:
    subi r18,1           ;  15  addhi3_clobber/2  [length = 2]
    sbc r19,__zero_reg__
    breq .L5             ;  17  branch            [length = 1]
    adiw r24,1           ;  19  addhi3_clobber/1  [length = 1]
.L2:
    cp r22,r24           ;  23  *cmphi/3          [length = 2]
    cpc r23,r25
    brlo .L4             ;  24  branch            [length = 1]
    ldi r18,0            ;  9   *movhi/2          [length = 2]
    ldi r19,0
    rjmp .L3             ;  50  jump              [length = 1]
.L5:
    ldi r18,lo8(1)       ;  8   *movhi/5          [length = 2]
    ldi r19,0
.L3:
    movw r24,r18         ;  56  *movhi/1          [length = 1]
    ret                  ;  55  return            [length = 1]