[Bug fortran/40766] this fortran program is too slow
--- Comment #10 from ubizjak at gmail dot com 2009-07-16 06:56 ---
(In reply to comment #6)
> Thus with the GLIBC (with AMD patches) or with the ACML, one gets only a
> slowdown of 25%, which is still acceptable. Why the Intel routines are so
> slow on my AMD, I do not know.

See [1], section 12.1, CPU dispatching in Intel compiler, on how to hack around this issue.

[1] http://www.agner.org/optimize/optimizing_cpp.pdf

-- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=40766
[Bug c++/40752] -Wconversion generates false warnings for operands not larger than target type
-- photon at seznam dot cz changed:

           What    |Removed                     |Added
           Severity|enhancement                 |normal
            Summary|-Wconversion: do not warn   |-Wconversion generates false
                   |for operands not larger than|warnings for operands not
                   |target type                 |larger than target type

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=40752
[Bug fortran/40766] this fortran program is too slow
--- Comment #11 from ubizjak at gmail dot com 2009-07-16 07:16 ---
(In reply to comment #6)
> Thus the question is really: Why are neither vmlsSinCos4 nor vmlsTan4 - nor
> for ACML vrs4_sincosf/vrsa_sincosf (vrs*_tan* does not exist) called?

Because sincos returns _TWO_ values and the vectorizer does not yet support this. As soon as the middle-end infrastructure is in place, we can stick vectorized sincos in ix86_veclib* functions. See also [1] and [2], sincos part.

Perhaps you could motivate Richi to extend the vectorizer infrastructure ;)

[1] http://software.intel.com/en-us/articles/implement-the-short-vector-math-library/
[2] http://developer.amd.com/cpu/Libraries/acml/onlinehelp/Documents/Vector.html#Vector

-- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=40766
[Bug c/40757] gcc 4.4.0 miscompiles mpfr-2.4.1
--- Comment #5 from zimmerma+gcc at loria dot fr 2009-07-16 07:52 ---
Created an attachment (id=18203)
 --> (http://gcc.gnu.org/bugzilla/attachment.cgi?id=18203&action=view)
preprocessed version of the file mpn_exp.c from mpfr-2.4.1

Note that replacing line 74:

    MPN_ZERO (a, n - 1);

by:

    { int n1 = n - 1; MPN_ZERO (a, n1); }

fixes the problem, where MPN_ZERO is defined as:

    #define MPN_ZERO(dst, n) memset((dst), 0, (n)*BYTES_PER_MP_LIMB)

and BYTES_PER_MP_LIMB is 4. If I write size_t n1 or unsigned int n1 above instead of int n1, the bug reappears.

-- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=40757
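The two forms compared in this comment can be isolated into a small test. The following is only a sketch: the function names `zero_direct` and `zero_via_temp` are hypothetical, and `uint32_t` stands in for a 4-byte limb (an assumption that matches BYTES_PER_MP_LIMB being 4). With a correct compiler and a correct memset, both functions should clear the same n - 1 limbs.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define BYTES_PER_MP_LIMB 4
#define MPN_ZERO(dst, n) memset((dst), 0, (n)*BYTES_PER_MP_LIMB)

/* Form reported at line 74: the length n - 1 is written inline,
   so the byte count (n - 1) * 4 is a single foldable expression. */
static void zero_direct(uint32_t *a, int n)
{
    assert(n >= 1);
    MPN_ZERO(a, n - 1);
}

/* Workaround from the comment: store n - 1 in a signed int first. */
static void zero_via_temp(uint32_t *a, int n)
{
    assert(n >= 1);
    {
        int n1 = n - 1;
        MPN_ZERO(a, n1);
    }
}
```

Calling both on identical arrays and comparing the results with memcmp gives a quick check of whether a given toolchain and C library treat the two forms differently.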
[Bug c/40757] gcc 4.4.0 miscompiles mpfr-2.4.1
--- Comment #6 from mikpe at it dot uu dot se 2009-07-16 08:31 ---
(In reply to comment #5)
> Created an attachment (id=18203)
>  --> (http://gcc.gnu.org/bugzilla/attachment.cgi?id=18203&action=view)
> preprocessed version of the file mpn_exp.c from mpfr-2.4.1
>
> Note that replacing line 74:
>
>     MPN_ZERO (a, n - 1);
>
> by:
>
>     { int n1 = n - 1; MPN_ZERO (a, n1); }
>
> fixes the problem, where MPN_ZERO is defined as:
>
>     #define MPN_ZERO(dst, n) memset((dst), 0, (n)*BYTES_PER_MP_LIMB)
>
> and BYTES_PER_MP_LIMB is 4. If I write size_t n1 or unsigned int n1 above
> instead of int n1, the bug reappears.

Sounds a lot like PR39867 and PR40747 are hitting you. Can you grab those fixes, apply them to your 4.4.0, rebuild it, and test mpfr again? Or get the 4.4.1-RC and test that instead.

I just finished building 4.3.4 and 4.4.0 on USIIIi/Solaris 9, and they built gmp-4.2.4 and mpfr-2.4.1 fine, with both passing make check.

-- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=40757
[Bug fortran/40766] this fortran program is too slow
--- Comment #12 from rguenth at gcc dot gnu dot org 2009-07-16 09:06 --- Actually the middle-end presents the vectorizer with a call to a complex function and REAL/IMAGPART exprs. I don't remember exactly which part confuses it, but certainly the mixed complex / real types do. -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=40766
[Bug tree-optimization/40770] New: Vectorization of complex types, vectorization of sincos missing
The following should be vectorized with proper -mveclib:

float xf[1024];
float sf[1024];
float cf[1024];
void foo (void)
{
  int i;
  for (i = 0; i < 1024; ++i)
    {
      sf[i] = __builtin_sinf (xf[i]);
      cf[i] = __builtin_cosf (xf[i]);
    }
}

double xd[1024];
double sd[1024];
double cd[1024];
void bar (void)
{
  int i;
  for (i = 0; i < 1024; ++i)
    {
      sd[i] = __builtin_sin (xd[i]);
      cd[i] = __builtin_cos (xd[i]);
    }
}

--
Summary: Vectorization of complex types, vectorization of sincos missing
Product: gcc
Version: 4.5.0
Status: UNCONFIRMED
Keywords: missed-optimization
Severity: enhancement
Priority: P3
Component: tree-optimization
AssignedTo: unassigned at gcc dot gnu dot org
ReportedBy: rguenth at gcc dot gnu dot org

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=40770
[Bug tree-optimization/40770] Vectorization of complex types, vectorization of sincos missing
-- ubizjak at gmail dot com changed:

           What    |Removed                     |Added
             Status|UNCONFIRMED                 |NEW
     Ever Confirmed|0                           |1
   Last reconfirmed|0000-00-00 00:00:00         |2009-07-16 09:43:14
               date|                            |

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=40770
[Bug tree-optimization/40770] Vectorization of complex types, vectorization of sincos missing
--- Comment #1 from rguenth at gcc dot gnu dot org 2009-07-16 09:44 ---
The middle-end presents the vectorizer with

<bb 3>:
  # i_13 = PHI <i_7(4), 0(2)>
  # ivtmp.26_8 = PHI <ivtmp.26_16(4), 1024(2)>
  D.1623_3 = xd[i_13];
  sincostmp.21_1 = __builtin_cexpi (D.1623_3);
  D.1624_4 = IMAGPART_EXPR <sincostmp.21_1>;
  sd[i_13] = D.1624_4;
  D.1625_6 = REALPART_EXPR <sincostmp.21_1>;
  cd[i_13] = D.1625_6;
  i_7 = i_13 + 1;
  ivtmp.26_16 = ivtmp.26_8 - 1;
  if (ivtmp.26_16 != 0)
    goto <bb 4>;
  else
    goto <bb 5>;

which has first of all complex types (they should be recognized as V2DF with vectorization factor 1, thus SLP-able).

For the float case

<bb 3>:
  # i_13 = PHI <i_7(4), 0(2)>
  # ivtmp.6_8 = PHI <ivtmp.6_16(4), 1024(2)>
  D.1610_3 = xf[i_13];
  sincostmp.1_1 = __builtin_cexpif (D.1610_3);
  D.1611_4 = IMAGPART_EXPR <sincostmp.1_1>;
  sf[i_13] = D.1611_4;
  D.1612_6 = REALPART_EXPR <sincostmp.1_1>;
  cf[i_13] = D.1612_6;
  i_7 = i_13 + 1;
  ivtmp.6_16 = ivtmp.6_8 - 1;
  if (ivtmp.6_16 != 0)
    goto <bb 4>;
  else
    goto <bb 5>;

they should be V2SF, thus use V4SF and vectorization factor 2. Still use SLP probably.

-- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=40770
[Bug c++/40764] -O3 gives wrong behaviour, no opt. OK
--- Comment #4 from vielhaber at gmail dot com 2009-07-16 09:59 ---
Created an attachment (id=18204)
 --> (http://gcc.gnu.org/bugzilla/attachment.cgi?id=18204&action=view)
g++ -version

-- vielhaber at gmail dot com changed:

           What    |Removed                     |Added
  Attachment #18198|0                           |1
        is obsolete|                            |
  Attachment #18199|0                           |1
        is obsolete|                            |
  Attachment #18200|0                           |1
        is obsolete|                            |

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=40764
[Bug c++/40764] -O3 gives wrong behaviour, no opt. OK
--- Comment #5 from vielhaber at gmail dot com 2009-07-16 10:01 ---
It was an index out of range error. Now, -O0 and -O3 both work as expected.

-- vielhaber at gmail dot com changed:

           What    |Removed                     |Added
             Status|UNCONFIRMED                 |RESOLVED
         Resolution|                            |INVALID

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=40764
[Bug fortran/40766] this fortran program is too slow
-- ubizjak at gmail dot com changed:

           What    |Removed                     |Added
  BugsThisDependsOn|                            |40770
             Status|UNCONFIRMED                 |NEW
     Ever Confirmed|0                           |1
   Last reconfirmed|0000-00-00 00:00:00         |2009-07-16 10:06:11
               date|                            |

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=40766
[Bug fortran/40766] this fortran program is too slow
--- Comment #13 from burnus at gcc dot gnu dot org 2009-07-16 09:43 --- See PR 40770 for Vectorization of complex types, vectorization of sincos missing -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=40766
[Bug c/40757] gcc 4.4.0 miscompiles mpfr-2.4.1
--- Comment #7 from david dot kirkby at onetel dot net 2009-07-16 10:19 ---
(In reply to comment #4)
> mpfr-2.4.1 compiles and tests Ok for me on an Ultra5 (USIIi) running
> sparc64-linux, with gmp-4.2.4 (compiled by gcc-4.3.4) and gcc 4.3.4, 4.4.0,
> and 4.4.1 20090630.
> I don't have a T2, but could possibly do some tests on USIIIi/Solaris 9.

I believe the problem is likely to only be seen on the T2+ or similar processors. Someone noticed the library code in OpenSolaris is different on the T2+ processor to what it would be on a more common SPARC processor.

I have built gcc 4.4.0 on Solaris 10 on a Sun Blade 2000 (UltraSPARC II processors) and have no problem with mpfr, but on the T2+ processors of the T5240 server, it does not work.

I'll try a later snapshot, but any serious gcc developer would be welcome to an account on the machine where it fails. That does not mean anyone that just happens to be interested and fancies playing on a 16-core machine, but if you are a serious gcc developer, then I could give you an account.

Dave

-- david dot kirkby at onetel dot net changed:

           What    |Removed                     |Added
                 CC|                            |david dot kirkby at onetel
                   |                            |dot net

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=40757
[Bug c/40757] gcc 4.4.0 miscompiles mpfr-2.4.1
--- Comment #8 from david dot kirkby at onetel dot net 2009-07-16 10:24 ---
(In reply to comment #4)
> Sounds a lot like PR39867 and PR40747 are hitting you. Can you grab those
> fixes, apply them to your 4.4.0, rebuild it, and test mpfr again? Or get the
> 4.4.1-RC and test that instead.
> I just finished building 4.3.4 and 4.4.0 on USIIIi/Solaris 9, and they built
> gmp-4.2.4 and mpfr-2.4.1 fine, with both passing make check.

I can't see how it is similar to PR39867, and PR40747 only occurs at an optimisation level of 1 or higher. This occurs with no optimisation at all.

Dave

-- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=40757
[Bug c/40757] gcc 4.4.0 miscompiles mpfr-2.4.1
--- Comment #9 from jakub at gcc dot gnu dot org 2009-07-16 10:31 --- folding happens even at -O0 and both bugs are in the folder. So, please try ftp://sources.redhat.com/pub/gcc/snapshots/4.4.1-RC-20090715/ first. -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=40757
[Bug tree-optimization/40770] Vectorization of complex types, vectorization of sincos missing
--- Comment #2 from irar at il dot ibm dot com 2009-07-16 12:29 ---
pr40770.c:20: note: ==> examining statement: sincostmp.21_1 = __builtin_cexpi (D.1625_3);
pr40770.c:20: note: get vectype for scalar type: complex double
pr40770.c:20: note: not vectorized: unsupported data-type complex double

make_vector_type returns NULL for this type.

-- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=40770
[Bug c/40757] gcc 4.4.0 miscompiles mpfr-2.4.1
--- Comment #10 from david dot kirkby at onetel dot net 2009-07-16 12:32 ---
(In reply to comment #9)
> folding happens even at -O0 and both bugs are in the folder.
> So, please try ftp://sources.redhat.com/pub/gcc/snapshots/4.4.1-RC-20090715/
> first.

I tried it.

kir...@t2:[/tmp/kirkby/mpfr-2.4.1] $ gcc -v
Using built-in specs.
Target: sparc-sun-solaris2.10
Configured with: ../gcc-4.4.1-RC-20090715/configure --prefix=/usr/local/gcc-4.4.1-RC-20090715-sun-linker --with-as=/usr/ccs/bin/as --without-gnu-as --with-ld=/usr/ccs/bin/ld --without-gnu-ld --enable-languages=c,c++,fortran --with-mpfr-include=/usr/local/include --with-mpfr-lib=/usr/local/lib --with-gmp-include=/usr/local/include --with-gmp-lib=/usr/local/lib
Thread model: posix
gcc version 4.4.1 20090715 (prerelease) (GCC)

but again got 20 test failures on mpfr 2.4.1

-- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=40757
[Bug tree-optimization/40770] Vectorization of complex types, vectorization of sincos missing
--- Comment #3 from rguenther at suse dot de 2009-07-16 13:05 ---
Subject: Re: Vectorization of complex types, vectorization of sincos missing

On Thu, 16 Jul 2009, irar at il dot ibm dot com wrote:

> --- Comment #2 from irar at il dot ibm dot com 2009-07-16 12:29 ---
> pr40770.c:20: note: ==> examining statement: sincostmp.21_1 = __builtin_cexpi (D.1625_3);
> pr40770.c:20: note: get vectype for scalar type: complex double
> pr40770.c:20: note: not vectorized: unsupported data-type complex double
>
> make_vector_type returns NULL for this type.

Yes - there is no vector type for complex double. But the vectorizer could query for a vector type for the complex component type (double) and divide the vector element count by 2 (for complex) to get the vectorization factor which would be 1 here.

Should SLP then be possible for that loop?

Richard.

-- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=40770
[Bug tree-optimization/40770] Vectorization of complex types, vectorization of sincos missing
--- Comment #4 from burnus at gcc dot gnu dot org 2009-07-16 13:32 ---
(In reply to comment #3)
> Yes - there is no vector type for complex double. But the vectorizer could
> query for a vector type for the complex component type (double) and divide
> the vector element count by 2 (for complex) to get the vectorization factor
> which would be 1 here.

I do not know much about this, but wouldn't that fail if one wants to vectorize true complex functions such as ccosf (assuming that they are in principle vectorizable)?

-- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=40770
[Bug tree-optimization/40770] Vectorization of complex types, vectorization of sincos missing
--- Comment #5 from rguenther at suse dot de 2009-07-16 13:57 ---
Subject: Re: Vectorization of complex types, vectorization of sincos missing

On Thu, 16 Jul 2009, burnus at gcc dot gnu dot org wrote:

> --- Comment #4 from burnus at gcc dot gnu dot org 2009-07-16 13:32 ---
> (In reply to comment #3)
> > Yes - there is no vector type for complex double. But the vectorizer could
> > query for a vector type for the complex component type (double) and divide
> > the vector element count by 2 (for complex) to get the vectorization factor
> > which would be 1 here.
>
> I do not know much about this, but wouldn't that fail if one wants to
> vectorize true complex functions such as ccosf (assuming that they are in
> principle vectorizable)?

Well, for ccosf we would have a vectorization factor of 2 left for V4SF. Of course this assumes that we present the vectorizer with a vectorized ccosf with the signature v4sf (*)(v4sf). Or we would need to introduce complex vector modes - which I'd rather avoid.

Richard.

-- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=40770
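The rule Richard describes (take the element count of the vector type for the component type, then halve it for complex) can be written out as a few lines of arithmetic. This is only an illustrative sketch: `vectorization_factor` is a hypothetical helper, not a GCC function, and the 16-byte vector width is an assumption corresponding to V2DF/V4SF.

```c
#include <assert.h>

/* Hypothetical helper, not part of GCC: number of scalar iterations
   covered by one vector operation.  A complex element occupies two
   component lanes, so the lane count is halved for complex types. */
static unsigned vectorization_factor(unsigned vector_bytes,
                                     unsigned component_bytes,
                                     int is_complex)
{
    unsigned lanes;

    assert(component_bytes != 0);
    lanes = vector_bytes / component_bytes;   /* e.g. 16 / 8 = 2 for V2DF */
    return is_complex ? lanes / 2 : lanes;
}
```

Under these assumptions, complex double in a 16-byte vector gives a factor of 1 and complex float gives 2, matching the numbers quoted in the thread.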
[Bug c/40757] gcc 4.4.0 miscompiles mpfr-2.4.1
--- Comment #11 from jakub at gcc dot gnu dot org 2009-07-16 14:07 ---
You haven't mentioned what options you compiled this file with.
So, assuming -O2, I see:

        add     %i4, -1, %l5            ! n,, tmp186
        sethi   %hi(1073740800), %o2    !, tmp189
        sll     %l5, 2, %l5             ! tmp186,, D.4491
        or      %o2, 1023, %o2          ! tmp189,, tmp188
        st      %g1, [%i0+%l5]          !,* D.4491
        add     %i4, %o2, %o2           ! n, tmp188, tmp187
        mov     %i0, %o0                ! a,
        sll     %o2, 2, %o2             ! tmp187,,
        call    memset, 0               !,
         mov    0, %o1                  !,

for this memset call, which looks correct to me. The st %g1, [%i0+%l5] line stores to a[n-1] (a is in %i0), and memset is called as

    memset (a, 0, (n + 0x3fffffffU) << 2);

So, if this doesn't work (and you see the same), you have hit a bug in the Solaris memset implementation, which doesn't properly handle a length with garbage in the upper 32 bits. I guess it could use brz,pn %o2, do_nothing or something similar, which is fine for 64-bit code, but certainly not for 32-bit code.

-- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=40757
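The point about garbage in the upper 32 bits can be illustrated with plain integer arithmetic. The sketch below is a model only, not SPARC code: the function names are hypothetical, and it assumes the add and shift from the listing are carried out in a 64-bit register without truncation, while 32-bit code is meant to see only the low word.

```c
#include <assert.h>
#include <stdint.h>

/* Length as 32-bit code intends it: the sum wraps modulo 2^32,
   so (n + 0x3fffffff) << 2 equals (n - 1) * 4. */
static uint32_t len_32bit(uint32_t n)
{
    assert(n >= 1);
    return (uint32_t)((n + 0x3fffffffU) << 2);
}

/* Model of the same add and shift in a 64-bit register: nothing
   truncates the result, so bits above bit 31 may be set. */
static uint64_t len_64bit_register(uint32_t n)
{
    return ((uint64_t)n + 0x3fffffffULL) << 2;
}

/* Nonzero when the upper 32 bits of the register value are not clear. */
static int upper_bits_set(uint64_t reg)
{
    return (reg >> 32) != 0;
}
```

For n = 5 the low 32 bits are 16 in both cases, but the 64-bit value is 0x100000010, so a routine that tests the whole register instead of the low word sees a very different length.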
[Bug tree-optimization/40771] New: generated code is ~25% slower when autovectorization is enabled
For the following code:
---
uint8_t data[16];

static __attribute__((noinline)) void test(unsigned i)
{
    unsigned j;
    for (j = 0; j < 16; j++)
        data[j] = (i + j) >> 8;
}
---
code generated with -O3 -ftree-vectorize is ~25% slower than with -O3 -fno-tree-vectorize for gcc 4.4 and 4.5. 4.3 and older don't vectorize this code.

Command line:
gcc tst2a.c -o tst2.o -O3 -march=k8 -fno-tree-vectorize
gcc tst2a.c -o tst2.o -O3 -march=k8 -ftree-vectorize
(using -m32 -fomit-frame-pointer has no significant effect on performance)

Tested versions (average time in ticks, 124 loops):
3.4.6 (gentoo) - (66 ticks) very slow, probably doesn't unroll the loop (I haven't looked at the code)
4.1.2 - 4.3.3 (gentoo) - (20 ticks) doesn't autovectorize even when -ftree-vectorize is specified
4.4.0 (gentoo) - (20 without vectorizing, 30 with)
4.5.0 (r149701) - (19 ticks / 24 ticks)

non-vectorized code is faster by 1 tick with -march=k8 than with -march=barcelona (even when my arch is barcelona)

(I am reporting this only against 4.5.0 since I don't have vanilla 4.4.0 and older)

Tests were repeated several times, run with highest priority and with affinity set to one core. CPU is AMD Phenom (4 cores, Barcelona) running at fixed 1400MHz.

Attached is code including whole test code.

--
Summary: generated code is ~25% slower when autovectorization is enabled
Product: gcc
Version: 4.5.0
Status: UNCONFIRMED
Severity: enhancement
Priority: P3
Component: tree-optimization
AssignedTo: unassigned at gcc dot gnu dot org
ReportedBy: zsojka at seznam dot cz
GCC host triplet: x86_64-pc-linux-gnu

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=40771
[Bug tree-optimization/40771] generated code is ~25% slower when autovectorization is enabled
--- Comment #1 from zsojka at seznam dot cz 2009-07-16 15:06 ---
Created an attachment (id=18205)
 --> (http://gcc.gnu.org/bugzilla/attachment.cgi?id=18205&action=view)
preprocessed source

Includes contents of headers stdint.h, stdio.h

-- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=40771
[Bug tree-optimization/40771] generated code is ~25% slower when autovectorization is enabled
--- Comment #2 from zsojka at seznam dot cz 2009-07-16 15:06 ---
# ./gcc -v
Using built-in specs.
Target: x86_64-unknown-linux-gnu
Configured with: ../configure --enable-languages=c,c++ --prefix=/mnt/svn/gcc-trunk/build/
Thread model: posix
gcc version 4.5.0 20090714 (experimental) (GCC)

-- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=40771
[Bug rtl-optimization/40772] New: generating rendundant moves from second byte of 32b/64b register
For the following code:

uint8_t data[16];

static __attribute__((noinline)) void test(unsigned i)
{
    unsigned j;
    for (j = 0; j < 16; j++)
        data[j] = ((i + j) & 0xFF00) >> 8;
}

generated asm looks like (using -fno-tree-vectorize because of pr40771)

# ./gcc tst2b.c -o tst2.o -O3 -march=k8 -fno-tree-vectorize

test:
.LFB11:
        .cfi_startproc
        movq    %rdi, %rdx
        movzbl  %dh, %eax
        movb    %al, data(%rip)
        leal    1(%rdi), %eax
        movzbl  %ah, %eax
        movb    %al, data+1(%rip)
        leal    2(%rdi), %eax
        movzbl  %ah, %eax
        movb    %al, data+2(%rip)
        leal    3(%rdi), %eax
        movzbl  %ah, %eax
        movb    %al, data+3(%rip)
        ...

When
        movzbl  %ah, %eax ; movb %al, data+1(%rip)
is replaced by
        movb    %ah, data+1(%rip)
the code is faster.

(Another issue may be the use of lea even for -march=pentium4, which would probably prefer add eax,1, but I can't verify that.)

# ./gcc -v
Using built-in specs.
Target: x86_64-unknown-linux-gnu
Configured with: ../configure --enable-languages=c,c++ --prefix=/mnt/svn/gcc-trunk/build/
Thread model: posix
gcc version 4.5.0 20090714 (experimental) (GCC)

CPU is AMD Phenom (4 cores, Barcelona) running at fixed 1400MHz.

gcc's generated code runs in 19 ticks on average; code with movzbl ; mov al replaced by mov ah runs in 16 ticks.

Attached is whole test code.

--
Summary: generating rendundant moves from second byte of 32b/64b register
Product: gcc
Version: 4.5.0
Status: UNCONFIRMED
Severity: enhancement
Priority: P3
Component: rtl-optimization
AssignedTo: unassigned at gcc dot gnu dot org
ReportedBy: zsojka at seznam dot cz
GCC host triplet: x86_64-pc-linux-gnu

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=40772
[Bug rtl-optimization/40772] generating rendundant moves from second byte of 32b/64b register
--- Comment #1 from zsojka at seznam dot cz 2009-07-16 15:34 ---
Created an attachment (id=18206)
 --> (http://gcc.gnu.org/bugzilla/attachment.cgi?id=18206&action=view)
preprocessed source of test code

Runs 1 << 24 iterations, prints average time in ticks.

-- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=40772
[Bug rtl-optimization/40772] generating rendundant moves from second byte of 32b/64b register
--- Comment #2 from zsojka at seznam dot cz 2009-07-16 15:42 ---
When
    data[j] = ((i + j) & 0xFF00) >> 8;
is replaced by
    data[j] = (i + j) >> 8;
generated asm uses shr eax, 8 instead of movzx eax, ah, and runs in 19 ticks on average.

-- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=40772
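Both source forms compared in this report store the same byte, because the assignment to a uint8_t element discards everything above bit 7 of the shifted value. A minimal sketch of that equivalence follows; the function names are hypothetical and are not taken from the attached test code.

```c
#include <assert.h>
#include <stdint.h>

/* Form from the original report: mask bits 8..15, then shift down. */
static uint8_t second_byte_masked(unsigned v)
{
    return (uint8_t)((v & 0xFF00) >> 8);
}

/* Form from comment #2: shift only; the narrowing conversion to
   uint8_t drops the bits the mask would have removed. */
static uint8_t second_byte_shifted(unsigned v)
{
    return (uint8_t)(v >> 8);
}

/* Checks that both forms agree for every value in [0, limit). */
static void check_forms_agree(unsigned limit)
{
    unsigned v;

    for (v = 0; v < limit; v++)
        assert(second_byte_masked(v) == second_byte_shifted(v));
}
```

Since the results are identical, any difference in the generated instructions between the two forms is purely a code-generation matter, which is what the tick counts in the report compare.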
[Bug fortran/40773] New: gfortran 4.4.0 -O3 exhausts memory in 127.wrf2
On linux x86, with -m32 or -m64, gfortran 4.4.0 exhausts over 2GB of swap space compiling the attached source file at -O3. In contrast 4.1.2 compiles unexceptionally. This might be expected behavior of 4.4.0 due to some new laborious optimization, which would terminate eventually with enough RAM/swap, or it might just be spinning its wheels pointlessly.

To reproduce from attached files, substantially condensed from SPEC MPI 2007 127.wrf2:

/bin/rm -f *.o *.mod
gfortran -c bigmod.f90
gfortran -c shift_domain_em.f90 -O3
f951: out of memory allocating 239668920 bytes after a total of 3520262144 bytes

Compile times:

        4.1.2             4.4.0
-O2:    real 1m19.827s    real 2m58.945s
        user 1m14.541s    user 2m44.450s
        sys  0m1.668s     sys  0m3.140s
-O3:    real 1m30.645s    real 10m5.166s
        user 1m25.613s    user 8m39.388s
        sys  0m1.492s     sys  0m7.632s

Verbose expansion of failing compile:

/net/hs-usca-01.sfbay/export/home1/12/dgh/gnu-linux/local/bin/gfortran --verbose -c shift_domain_em.f90 -O3
Using built-in specs.
Target: x86_64-unknown-linux-gnu
Configured with: /net/hs-usca-01.sfbay/export/home1/12/dgh/gnu-linux/gcc-4.4.0/configure --prefix=/net/hs-usca-01.sfbay/export/home1/12/dgh/gnu-linux/local --exec-prefix=/net/hs-usca-01.sfbay/export/home1/12/dgh/gnu-linux/local --enable-languages=c++,fortran --with-gmp-include=/net/hs-usca-01.sfbay/export/home1/12/dgh/gnu-linux/local/include --with-mpfr-include=/net/hs-usca-01.sfbay/export/home1/12/dgh/gnu-linux/local/include --with-gmp-lib=/net/hs-usca-01.sfbay/export/home1/12/dgh/gnu-linux/local/lib64 --with-mpfr-lib=/net/hs-usca-01.sfbay/export/home1/12/dgh/gnu-linux/local/lib64
Thread model: posix
gcc version 4.4.0 (GCC)
COLLECT_GCC_OPTIONS='-v' '-c' '-O3' '-mtune=generic'
 /net/hs-usca-01.sfbay/export/home1/12/dgh/gnu-linux/local/libexec/gcc/x86_64-unknown-linux-gnu/4.4.0/f951 shift_domain_em.f90 -quiet -dumpbase shift_domain_em.f90 -mtune=generic -auxbase shift_domain_em -O3 -version -fintrinsic-modules-path /net/hs-usca-01.sfbay/export/home1/12/dgh/gnu-linux/local/lib/gcc/x86_64-unknown-linux-gnu/4.4.0/finclude -o /tmp/ccBfbVaD.s
GNU Fortran (GCC) version 4.4.0 (x86_64-unknown-linux-gnu)
        compiled by GNU C version 4.4.0, GMP version 4.3.1, MPFR version 2.4.1.
GGC heuristics: --param ggc-min-expand=100 --param ggc-min-heapsize=131072
f951: out of memory allocating 239668920 bytes after a total of 3520262144 bytes

Verbose expansion of corresponding successful compile:

/usr/bin/gfortran --verbose -c shift_domain_em.f90 -O3
Using built-in specs.
Target: x86_64-redhat-linux
Configured with: ../configure --prefix=/usr --mandir=/usr/share/man --infodir=/usr/share/info --enable-shared --enable-threads=posix --enable-checking=release --with-system-zlib --enable-__cxa_atexit --disable-libunwind-exceptions --enable-libgcj-multifile --enable-languages=c,c++,objc,obj-c++,java,fortran,ada --enable-java-awt=gtk --disable-dssi --enable-plugin --with-java-home=/usr/lib/jvm/java-1.4.2-gcj-1.4.2.0/jre --with-cpu=generic --host=x86_64-redhat-linux
Thread model: posix
gcc version 4.1.2 20070626 (Red Hat 4.1.2-14)
 /usr/libexec/gcc/x86_64-redhat-linux/4.1.2/f951 shift_domain_em.f90 -quiet -dumpbase shift_domain_em.f90 -mtune=generic -auxbase shift_domain_em -O3 -version -I /usr/lib/gcc/x86_64-redhat-linux/4.1.2/finclude -o /tmp/cc0MGeZy.s
GNU F95 version 4.1.2 20070626 (Red Hat 4.1.2-14) (x86_64-redhat-linux)
        compiled by GNU C version 4.1.2 20070626 (Red Hat 4.1.2-14).
GGC heuristics: --param ggc-min-expand=100 --param ggc-min-heapsize=131072
 as -V -Qy -o shift_domain_em.o /tmp/cc0MGeZy.s
GNU assembler version 2.17.50.0.6-5.el5 (x86_64-redhat-linux) using BFD version 2.17.50.0.6-5.el5 20061020

--
Summary: gfortran 4.4.0 -O3 exhausts memory in 127.wrf2
Product: gcc
Version: 4.4.0
Status: UNCONFIRMED
Severity: normal
Priority: P3
Component: fortran
AssignedTo: unassigned at gcc dot gnu dot org
ReportedBy: dh458 at oakapple dot net

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=40773
[Bug fortran/40773] gfortran 4.4.0 -O3 exhausts memory in 127.wrf2
--- Comment #1 from dh458 at oakapple dot net 2009-07-16 16:16 ---
Created an attachment (id=18207)
 --> (http://gcc.gnu.org/bugzilla/attachment.cgi?id=18207&action=view)
modules definition file, bzip2'd

cat bigmod.f90.bz2 | bunzip2 > bigmod.f90

compressed to avoid bugzilla size limitations

-- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=40773
[Bug fortran/40773] gfortran 4.4.0 -O3 exhausts memory in 127.wrf2
--- Comment #2 from dh458 at oakapple dot net 2009-07-16 16:17 ---
Created an attachment (id=18208)
 --> (http://gcc.gnu.org/bugzilla/attachment.cgi?id=18208&action=view)
failing source file that uses modules in bigmod.f90

-- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=40773
[Bug ada/40775] New: ICE in find_valid_class, at reload.c:701
/home/joel/test-gcc/b-gcc2-arm/./gcc/xgcc -B/home/joel/test-gcc/b-gcc2-arm/./gcc/ -nostdinc -B/home/joel/test-gcc/b-gcc2-arm/arm-rtems4.10/newlib/ -isystem /home/joel/test-gcc/b-gcc2-arm/arm-rtems4.10/newlib/targ-include -isystem /home/joel/test-gcc/gcc-svn/newlib/libc/include -B/home/joel/test-gcc/install/arm-rtems4.10/bin/ -B/home/joel/test-gcc/install/arm-rtems4.10/lib/ -isystem /home/joel/test-gcc/install/arm-rtems4.10/include -isystem /home/joel/test-gcc/install/arm-rtems4.10/sys-include -c -g -O2 -mthumb -W -Wall -gnatpg -mthumb a-nllcef.ads -o a-nllcef.o

/home/joel/test-gcc/b-gcc2-arm/./gcc/xgcc -B/home/joel/test-gcc/b-gcc2-arm/./gcc/ -nostdinc -B/home/joel/test-gcc/b-gcc2-arm/arm-rtems4.10/newlib/ -isystem /home/joel/test-gcc/b-gcc2-arm/arm-rtems4.10/newlib/targ-include -isystem /home/joel/test-gcc/gcc-svn/newlib/libc/include -B/home/joel/test-gcc/install/arm-rtems4.10/bin/ -B/home/joel/test-gcc/install/arm-rtems4.10/lib/ -isystem /home/joel/test-gcc/install/arm-rtems4.10/include -isystem /home/joel/test-gcc/install/arm-rtems4.10/sys-include -c -g -O2 -mthumb -W -Wall -gnatpg -mthumb a-nllcty.ads -o a-nllcty.o

+=======================GNAT BUG DETECTED========================+
| 4.5.0 20090710 (experimental) [trunk revision 149493] (arm-unknown-rtems4.10) GCC error:|
| in find_valid_class, at reload.c:701                           |
| Error detected around a-ngcefu.adb:115:8                       |
| Please submit a bug report; see http://gcc.gnu.org/bugs.html.  |
| Use a subject line meaningful to you and us to track the bug.  |
| Include the entire contents of this bug box in the report.     |
| Include the exact gcc or gnatmake command that you entered.    |
| Also include sources listed below in gnatchop format           |
| (concatenated together with no headers between files).         |
+================================================================+

--
Summary: ICE in find_valid_class, at reload.c:701
Product: gcc
Version: 4.5.0
Status: UNCONFIRMED
Severity: normal
Priority: P3
Component: ada
AssignedTo: unassigned at gcc dot gnu dot org
ReportedBy: joel at gcc dot gnu dot org
GCC target triplet: arm-rtems4.10

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=40775
[Bug c/40776] ICE in gen_add2_insn, at optabs.c:4720
--- Comment #1 from joel at gcc dot gnu dot org 2009-07-16 17:02 ---
Created an attachment (id=18209)
 --> (http://gcc.gnu.org/bugzilla/attachment.cgi?id=18209&action=view)
Test case (preprocessed)

-- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=40776
[Bug c/40776] ICE in gen_add2_insn, at optabs.c:4720
--- Comment #2 from joel at gcc dot gnu dot org 2009-07-16 17:03 ---
/home/joel/test-gcc/b-gcc1-m32c/./gcc/xgcc -B/home/joel/test-gcc/b-gcc1-m32c/./gcc/ -c j.c -mcpu=m32cm -O0

Works. ICE at -O1 and -O2.

-- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=40776
[Bug ada/40777] New: compile error on gcc-interface/targtyps.c
gcc -c -g -O2 -DIN_GCC -DCROSS_DIRECTORY_STRUCTURE -W -Wall -Wwrite-strings -Wcast-qual -Wstrict-prototypes -Wmissing-prototypes -Wmissing-format-attribute -Wno-long-long -Wno-variadic-macros -Wno-overlength-strings -Wold-style-definition -Wc++-compat -fno-common -DHAVE_CONFIG_H -I.. -I. -Iada -I/home/joel/test-gcc/gcc-svn/gcc -I/home/joel/test-gcc/gcc-svn/gcc/ada -I/home/joel/test-gcc/gcc-svn/gcc/../include -I/home/joel/test-gcc/gcc-svn/gcc/../libcpp/include -I/home/joel/test-gcc/gcc-svn/gcc/../libdecnumber -I/home/joel/test-gcc/gcc-svn/gcc/../libdecnumber/dpd -I../libdecnumber /home/joel/test-gcc/gcc-svn/gcc/ada/gcc-interface/utils.c -o ada/utils.o

/home/joel/test-gcc/gcc-svn/gcc/ada/gcc-interface/targtyps.c: In function get_target_double_scalar_alignment:
/home/joel/test-gcc/gcc-svn/gcc/ada/gcc-interface/targtyps.c:241:32: error: TARGET_64BIT undeclared (first use in this function)
/home/joel/test-gcc/gcc-svn/gcc/ada/gcc-interface/targtyps.c:241:32: error: (Each undeclared identifier is reported only once
/home/joel/test-gcc/gcc-svn/gcc/ada/gcc-interface/targtyps.c:241:32: error: for each function it appears in.)

--
Summary: compile error on gcc-interface/targtyps.c
Product: gcc
Version: 4.5.0
Status: UNCONFIRMED
Severity: normal
Priority: P3
Component: ada
AssignedTo: unassigned at gcc dot gnu dot org
ReportedBy: joel at gcc dot gnu dot org
GCC build triplet: sh-rtems4.10

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=40777
[Bug target/39182] ICE in gen_add2_insn, at optabs.c:4884
--- Comment #6 from pinskia at gcc dot gnu dot org 2009-07-16 17:08 --- *** Bug 40776 has been marked as a duplicate of this bug. *** -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=39182
[Bug target/40776] ICE in gen_add2_insn, at optabs.c:4720
--- Comment #3 from pinskia at gcc dot gnu dot org 2009-07-16 17:08 ---
*** This bug has been marked as a duplicate of 39182 ***

-- pinskia at gcc dot gnu dot org changed:

           What    |Removed                     |Added
             Status|UNCONFIRMED                 |RESOLVED
         Resolution|                            |DUPLICATE

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=40776
[Bug fortran/40773] gfortran 4.4.0 -O3 exhausts memory in 127.wrf2
--- Comment #3 from kargl at gcc dot gnu dot org 2009-07-16 17:22 --- Does it compile with either -O or -O2? gfortran 4.4.0 may be trying to do more inlining, which can consume more memory for temporary arrays. -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=40773
[Bug tree-optimization/40770] Vectorization of complex types, vectorization of sincos missing
--- Comment #6 from irar at il dot ibm dot com 2009-07-16 17:31 ---
(In reply to comment #3)
> > make_vector_type returns NULL for this type.
>
> Yes - there is no vector type for complex double. But the vectorizer could
> query for a vector type for the complex component type (double) and divide
> the vector element count by 2 (for complex) to get the vectorization factor
> which would be 1 here.

I see.

> Should SLP then be possible for that loop?

Not with the current implementation - SLP needs strided stores to start. Here the stores are not even adjacent. I think it would be better to vectorize this loop with regular loop-based vectorization to avoid permutations. I'll take a better look on Sunday.

Ira

> Richard.

-- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=40770
[Bug fortran/40773] gfortran 4.4.0 -O3 exhausts memory in 127.wrf2
--- Comment #4 from dh458 at oakapple dot net 2009-07-16 18:24 ---
Compiles OK at -O2, but more slowly than with 4.1.2:

        4.1.2             4.4.0
-O2:    real 1m19.827s    real 2m58.945s
-O3:    real 1m30.645s    real 10m5.166s

Even if -O3 ultimately generates correct code in some suitably large configuration, it may be that some optimizer algorithms could stand tuning to avoid time-consuming but ultimately unprofitable blind alleys.

-- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=40773
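To put the wall-clock figures above in proportion, the slowdown of 4.4.0 relative to 4.1.2 can be computed directly from the "real" times. The helper names below are hypothetical; the only inputs are the four times quoted in the comment.

```c
#include <assert.h>

/* Converts a time given as minutes plus seconds into seconds. */
static double to_seconds(unsigned minutes, double seconds)
{
    assert(seconds >= 0.0 && seconds < 60.0);
    return 60.0 * minutes + seconds;
}

/* Ratio of the newer compile time to the older one. */
static double slowdown(double old_seconds, double new_seconds)
{
    assert(old_seconds > 0.0);
    return new_seconds / old_seconds;
}
```

With the times from the table, slowdown(to_seconds(1, 19.827), to_seconds(2, 58.945)) is about 2.2 for -O2, and slowdown(to_seconds(1, 30.645), to_seconds(10, 5.166)) is about 6.7 for -O3.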
[Bug target/40735] [4.3/4.4 regression] memory hog compiling big functions with -fPIE
--- Comment #10 from mikpe at it dot uu dot se 2009-07-16 18:33 --- More memory usage numbers on this test case: With 4.4.1-RC-20090715: i686 peaks at 616M, powerpc at 799M, and arm at 1211M. With 4.5.0-20090709: i686 peaks at 530M, powerpc at 707M, and arm at 933M. -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=40735
[Bug libmudflap/40778] New: [4.5 Regression] Mudflap instrumentation missing in cloned function.
Cloned functions appear not to be properly instrumented by mudflap, e.g. alloca is not instrumented. This is responsible for the recent regressions in the libmudflap testsuite (pass45-frag.c, fail31-frag.c).

--
Summary: [4.5 Regression] Mudflap instrumentation missing in cloned function.
Product: gcc
Version: 4.5.0
Status: UNCONFIRMED
Severity: normal
Priority: P3
Component: libmudflap
AssignedTo: unassigned at gcc dot gnu dot org
ReportedBy: tjruwase at google dot com
GCC host triplet: x86_64-unknown-linux-gnu

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=40778
[Bug libmudflap/40778] [4.5 Regression] Mudflap instrumentation missing in cloned function.
--- Comment #1 from tjruwase at google dot com 2009-07-16 19:24 --- The problem is that mudflap avoids instrumenting synthetic functions, i.e. those with DECL_ARTIFICIAL(decl) set. Since cloned functions are synthetic functions, they are not instrumented by the 2nd mudflap pass. A possible fix would be for mudflap to determine that a synthetic function is actually a clone of a non-synthetic function. Unfortunately it is not obvious to me how to obtain this information, since the relevant field is cleared immediately after use. For example, cgraph_node(fndecl)->clone_of, which points to the cgraph_node of the original function, is cleared once the clone is materialized in cgraph_materialize_clone() and save_inline_function_body(). Is there a way to identify cloned functions after materialization? -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=40778
[Bug fortran/40773] gfortran 4.4.0 -O3 exhausts memory in 127.wrf2
--- Comment #5 from kargl at gcc dot gnu dot org 2009-07-16 19:55 --- (In reply to comment #4) Compiles OK at -O2, but more slowly than with 4.1.2: 4.1.2 4.4.0 -O2: real 1m19.827s real 2m58.945s -O3: real 1m30.645s real 10m5.166s Even if -O3 ultimately generates correct code in some suitably large configuration, it may be that some optimizer algorithms could stand tuning to avoid time-consuming but ultimately unprofitable blind alleys.
Hmmm, something is definitely amiss here. After 22 minutes of compiling with -O3 on my system, f951 'appears' stuck with only 450 MB of used memory. I attached ktrace to the process and found

50649 f951 CALL mmap(0,0x30,PROT_READ|PROT_WRITE,MAP_PRIVATE|MAP_ANON,0x,0)
50649 f951 RET mmap 500170752/0x21dd0
50649 f951 CALL madvise(0x22b179000,0x83000,MADV_FREE)
50649 f951 RET madvise 0
50649 f951 CALL madvise(0x22b002000,0x79000,MADV_FREE)
50649 f951 RET madvise 0
50649 f951 CALL madvise(0x21e0f8000,0x8000,MADV_FREE)
50649 f951 RET madvise 0
50649 f951 CALL madvise(0x21e002000,0xde000,MADV_FREE)
50649 f951 RET madvise 0
50649 f951 CALL munmap(0x21dd0,0x30)
50649 f951 RET munmap 0

repeated over and over and over again. It appears that on my system f951 is looping over some garbage collection, but never reaching a termination point. :(
-- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=40773
[Bug fortran/40773] gfortran 4.4.0 -O3 exhausts memory in 127.wrf2
--- Comment #6 from kargl at gcc dot gnu dot org 2009-07-16 20:29 --- (In reply to comment #5) (In reply to comment #4) Compiles OK at -O2, but more slowly than with 4.1.2: 4.1.2 4.4.0 -O2: real 1m19.827s real 2m58.945s -O3: real 1m30.645s real 10m5.166s Even if -O3 ultimately generates correct code in some suitably large configuration, it may be that some optimizer algorithms could stand tuning to avoid time-consuming but ultimately unprofitable blind alleys. Hmmm, something is definitely amiss here. After 22 minutes of compiling with -O3 on my system, f951 'appears' stuck with only 450 MB of used memory. I attached ktrace to the process and found
After 26 minutes, I suddenly see in top(1)

50879 sgk 1 -200 7346M 6724M swread 1 26:21 1.51% f951

so -O3 is causing a massive increase in the amount of memory required.
-- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=40773
[Bug c/40757] gcc 4.4.0 miscompiles mpfr-2.4.1
--- Comment #12 from marc dot glisse at normalesup dot org 2009-07-16 20:34 --- (In reply to comment #11) for this memset call, which looks correct to me. The st %g1, [%i0+%l5] line stores to %i0 a[n-1] and memset is called with memset (a, 0, (n + 0x3fffffffU) << 2); So, if this doesn't work (and you see the same), you hit a bug in Solaris memset implementation, which doesn't handle properly length with garbage in upper 32-bits, guess it could use brz,pn %o2, do_nothing or something similar, which is fine for 64-bit code, but certainly not for 32-bit code.
The sun4v implementation in opensolaris looks fine (but may not have been backported to solaris 10). The following one on the other hand seems to use the same brnz for 32 and 64 bit code: http://src.opensolaris.org/source/xref/onnv/onnv-gate/usr/src/lib/libc/sparc_hwcap1/common/gen/memset.s#88 It would be good to know which implementation is used...
--
marc dot glisse at normalesup dot org changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |marc dot glisse at normalesup dot org

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=40757
[Bug libstdc++/37907] [c++0x] support for std::is_standard_layout
--- Comment #3 from jason at gcc dot gnu dot org 2009-07-16 20:36 ---
Subject: Bug 37907
Author: jason
Date: Thu Jul 16 20:36:10 2009
New Revision: 149721
URL: http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=149721
Log:
PR libstdc++/37907
Support std::is_standard_layout and std::is_trivial traits, change POD to C++0x version (except for layout).
* gcc/c-common.c (c_common_reswords): Add __is_standard_layout and __is_trivial.
* gcc/c-common.h (enum rid): Add RID_IS_STD_LAYOUT and RID_IS_TRIVIAL.
* gcc/cp/cp-tree.h (enum cp_trait_kind): Add CPTK_IS_STD_LAYOUT, CPTK_IS_TRIVIAL.
(struct lang_type_class): Add non_std_layout.
(CLASSTYPE_NON_STD_LAYOUT): New.
* gcc/cp/class.c (check_bases): Set it.
(check_field_decls): Likewise.
(check_bases_and_members): Likewise.
* gcc/cp/parser.c (cp_parser_primary_expression): Handle RID_IS_STD_LAYOUT, RID_IS_TRIVIAL.
(cp_parser_trait_expr): Likewise.
* gcc/cp/semantics.c (trait_expr_value): Handle CPTK_IS_STD_LAYOUT, CPTK_IS_TRIVIAL.
(finish_trait_expr): Likewise.
* gcc/cp/tree.c (scalarish_type_p, trivial_type_p, std_layout_type_p): New.
(pod_type_p): Use them.
* gcc/cp/typeck.c (build_class_member_access_expr): Check CLASSTYPE_NON_STD_LAYOUT rather than CLASSTYPE_NON_POD_P.
* libstdc++-v3/include/std/type_traits: Add is_standard_layout, is_trivial.
Added: trunk/gcc/doc/implement-cxx.texi trunk/gcc/testsuite/g++.dg/cpp0x/std-layout1.C trunk/gcc/testsuite/g++.dg/cpp0x/trivial1.C Modified: trunk/gcc/ChangeLog trunk/gcc/Makefile.in trunk/gcc/c-common.c trunk/gcc/c-common.h trunk/gcc/cp/ChangeLog trunk/gcc/cp/call.c trunk/gcc/cp/class.c trunk/gcc/cp/cp-tree.h trunk/gcc/cp/cxx-pretty-print.c trunk/gcc/cp/decl.c trunk/gcc/cp/init.c trunk/gcc/cp/parser.c trunk/gcc/cp/semantics.c trunk/gcc/cp/tree.c trunk/gcc/cp/typeck.c trunk/gcc/doc/gcc.texi trunk/gcc/testsuite/ChangeLog trunk/gcc/testsuite/g++.dg/ext/has_nothrow_assign.C trunk/gcc/testsuite/g++.dg/ext/has_nothrow_copy-1.C trunk/gcc/testsuite/g++.dg/ext/has_trivial_assign.C trunk/gcc/testsuite/g++.dg/ext/has_trivial_copy.C trunk/gcc/testsuite/g++.dg/ext/is_pod.C trunk/gcc/testsuite/g++.dg/other/offsetof3.C trunk/gcc/testsuite/g++.dg/overload/ellipsis1.C trunk/gcc/testsuite/g++.dg/warn/var-args1.C trunk/gcc/testsuite/g++.old-deja/g++.brendan/crash63.C trunk/gcc/testsuite/g++.old-deja/g++.brendan/crash64.C trunk/gcc/testsuite/g++.old-deja/g++.brendan/overload8.C trunk/gcc/testsuite/g++.old-deja/g++.other/vaarg3.C trunk/gcc/testsuite/g++.old-deja/g++.pt/vaarg3.C trunk/libstdc++-v3/ChangeLog trunk/libstdc++-v3/include/std/type_traits trunk/libstdc++-v3/testsuite/20_util/make_signed/requirements/typedefs_neg.cc trunk/libstdc++-v3/testsuite/20_util/make_unsigned/requirements/typedefs_neg.cc -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=37907
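For readers unfamiliar with the two traits named in the log above, the following sketch shows how they split the old POD notion into two independent properties. The three structs are made up for illustration and do not come from the GCC testsuite; the checks use only the standard <type_traits> interface.

```cpp
#include <cassert>
#include <type_traits>

// Illustrative types (hypothetical, not taken from the commit or its tests).

// Trivial and standard-layout: what C++03 would have called a POD.
struct Plain { int x; double y; };

// Trivial, but not standard-layout: the data members have different access control.
struct MixedAccess {
    int a;
protected:
    int b;
};

// Standard-layout, but not trivial: the default constructor is user-provided.
struct UserCtor {
    UserCtor() : v(0) {}
    int v;
};

static_assert(std::is_trivial<Plain>::value, "Plain is trivial");
static_assert(std::is_standard_layout<Plain>::value, "Plain is standard-layout");

static_assert(std::is_trivial<MixedAccess>::value, "MixedAccess is trivial");
static_assert(!std::is_standard_layout<MixedAccess>::value, "MixedAccess is not standard-layout");

static_assert(!std::is_trivial<UserCtor>::value, "UserCtor is not trivial");
static_assert(std::is_standard_layout<UserCtor>::value, "UserCtor is standard-layout");
```

In the C++0x model a type is a POD only when both properties hold, which is why only Plain would qualify here.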
[Bug target/40411] -std=c99 does not enable c99 mode in Solaris C library
--- Comment #13 from heydowns at borg dot com 2009-07-16 21:11 ---
Created an attachment (id=18210) -- (http://gcc.gnu.org/bugzilla/attachment.cgi?id=18210&action=view) Fix updated for gcc 4.4.0, link xpg6 for c++, and link xpg4 for gnu*
Updated patch against gcc 4.4.0. Also add xpg6 for c++ and xpg4 for gnu* as discussed above (this is now easily modified by changing the spec if someone who knows better can say what gnu* should do).
--
heydowns at borg dot com changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
  Attachment #18121|0                           |1
        is obsolete|                            |

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=40411
[Bug libobjc/39465] libobjc does not find classes of DLLs
--- Comment #14 from js-gcc at webkeks dot org 2009-07-16 21:16 --- Any comments? This is still very annoying and completely killing the ability to have plugins/bundles on win32. -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=39465
[Bug libstdc++/40779] basic_string::_M_disjunct corner case
--- Comment #1 from jkt at mailsnare dot net 2009-07-16 21:23 --- Created an attachment (id=18211) -- (http://gcc.gnu.org/bugzilla/attachment.cgi?id=18211&action=view) patch for basic_string::_M_disjunct corner case -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=40779
[Bug libstdc++/40779] basic_string::_M_disjunct corner case
--- Comment #2 from jkt at mailsnare dot net 2009-07-16 21:27 --- correction: _M_disjunct should return false because the source and target are conjunct (overlap). -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=40779
[Bug c/40757] gcc 4.4.0 miscompiles mpfr-2.4.1
--- Comment #13 from david dot kirkby at onetel dot net 2009-07-16 21:29 --- (In reply to comment #11) You haven't mentioned what options you compiled this file with. So, assuming -O2, I see:

add   %i4, -1, %l5          ! n,, tmp186
sethi %hi(1073740800), %o2  !, tmp189
sll   %l5, 2, %l5           ! tmp186,, D.4491
or    %o2, 1023, %o2        ! tmp189,, tmp188
st    %g1, [%i0+%l5]        !,* D.4491
add   %i4, %o2, %o2         ! n, tmp188, tmp187
mov   %i0, %o0              ! a,
sll   %o2, 2, %o2           ! tmp187,,
call  memset, 0             !,
mov   0, %o1                !,

for this memset call, which looks correct to me. The st %g1, [%i0+%l5] line stores to %i0 a[n-1] and memset is called with memset (a, 0, (n + 0x3fffffffU) << 2); So, if this doesn't work (and you see the same), you hit a bug in Solaris memset implementation, which doesn't handle properly length with garbage in upper 32-bits, guess it could use brz,pn %o2, do_nothing or something similar, which is fine for 64-bit code, but certainly not for 32-bit code.
I should add that we are using the Sun assembler and linker, not the GNU ones. I don't know whether that would affect the output. If so, can you tell me how to generate the assembler code? Or, as I say, you can have an account.
-- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=40757
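The "garbage in upper 32-bits" remark can be checked with plain integer arithmetic. The sketch below only models the hypothesis from the quoted comment; it says nothing about what the Solaris memset actually does. len32 and len64 are hypothetical names: the first evaluates the length expression in 32 bits, where it equals (n - 1) * 4, the second evaluates the same add and shift in a 64-bit register without truncation.

```cpp
#include <cassert>
#include <cstdint>

// Length expression from the comment, evaluated modulo 2^32.
// Adding 0x3fffffff and shifting left by 2 is the same as (n - 1) * 4 in 32 bits.
constexpr std::uint32_t len32(std::uint32_t n)
{
    return (n + 0x3fffffffU) << 2;
}

// The same add and shift carried out in a 64-bit register, nothing truncated.
// The low 32 bits match len32, but bit 32 and above may be set.
constexpr std::uint64_t len64(std::uint32_t n)
{
    return (static_cast<std::uint64_t>(n) + 0x3fffffffULL) << 2;
}

// n = 5: the 32-bit length is (5 - 1) * 4 = 16 bytes.
static_assert(len32(5) == 16, "32-bit length for n = 5");
// n = 1: the 32-bit length is 0, yet the 64-bit register value is not zero,
// so a test of the full register would not take the "nothing to do" path.
static_assert(len32(1) == 0, "32-bit length for n = 1");
static_assert(len64(1) == 0x100000000ULL, "upper bits are set for n = 1");
```

Under this model a routine that inspects the whole 64-bit register sees a huge length where 32-bit code intended a small one, which is the failure mode the comment speculates about.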
[Bug libstdc++/40779] New: basic_string::_M_disjunct corner case
i found a corner case, definitely odd but legal, in basic_string::_M_disjunct. consider the following:

string s1("abcd");
string s2(s1); // make it shared
s1.append(s.data() - 1, 5);

_M_disjunct should return true but, because it does not check the length, fails. i created a patch for this but see nowhere to attach it.
--
Summary: basic_string::_M_disjunct corner case
Product: gcc
Version: 4.3.1
Status: UNCONFIRMED
Severity: minor
Priority: P3
Component: libstdc++
AssignedTo: unassigned at gcc dot gnu dot org
ReportedBy: jkt at mailsnare dot net
GCC host triplet: x86_64-suse-linux
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=40779
[Bug c++/36856] [c++0x] __is_pod() fails for some pod types
--- Comment #8 from paolo dot carlini at oracle dot com 2009-07-16 21:46 ---
Jason, I think it's the right time to revisit this PR: after your patch the testcase passes unconditionally. I'm just wondering if it would make sense to have a different semantics for __is_pod depending on the -std switch or not...
--
paolo dot carlini at oracle dot com changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |jason at gcc dot gnu dot org
         AssignedTo|paolo dot carlini at oracle |jason at gcc dot gnu dot org
                   |dot com                     |
             Status|NEW                         |ASSIGNED

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=36856
[Bug libstdc++/40779] basic_string::_M_disjunct corner case
--- Comment #3 from paolo dot carlini at oracle dot com 2009-07-16 22:01 ---
s1.append(s.data() - 1, 5);
What's s? If, as I believe, it's a std::basic_string - s1, s2, whatever - on which you are calling the data member function, the line is definitely illegal, because s.data() - 1 doesn't point to a char belonging to the string. Actually, what's so special about -1? If that line were legal you could pass to append *any* address, however meaningless.
--
paolo dot carlini at oracle dot com changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|UNCONFIRMED                 |RESOLVED
         Resolution|                            |INVALID

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=40779
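To make the disagreement concrete, here is a minimal sketch of a start-pointer-only overlap test of the kind under discussion. It is written for illustration and is not the libstdc++ source; points_outside is a hypothetical name. It classifies a source pointer purely by where it starts, without looking at the length, and it is only meaningful for pointers that may legally be formed, which by the reply above excludes s.data() - 1.

```cpp
#include <cassert>
#include <functional>
#include <string>

// Hypothetical stand-in for the check under discussion (not libstdc++ code).
// Returns true when `p` lies outside [str.data(), str.data() + str.size()].
// Only the start of the source is examined; its length plays no role.
inline bool points_outside(const std::string& str, const char* p)
{
    // std::less gives a total order over pointers, even unrelated ones.
    std::less<const char*> before;
    return before(p, str.data()) || before(str.data() + str.size(), p);
}

// Small self-check using only valid pointers.
inline void points_outside_demo()
{
    std::string s1("abcd");
    std::string other("wxyz");
    assert(!points_outside(s1, s1.data()));      // first character of s1
    assert(!points_outside(s1, s1.data() + 2));  // interior of s1
    assert(points_outside(s1, other.data()));    // a different string's buffer
}
```

Every valid source that overlaps the string must start inside it or at its end, so a start-only test is sufficient for legal calls; the reported case needs an invalid pointer to get past it.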
[Bug c++/36856] [c++0x] __is_pod() fails for some pod types
--- Comment #9 from paolo dot carlini at oracle dot com 2009-07-16 22:06 --- ... probably not, if you ask me. We briefly discussed the issue today, in relation to the builtins of the same name as provided by other front-ends. We never tried implementing the exact C++03 semantics: the very idea of the __is_* builtins started in GCC with the goal of providing the exact semantics for some of the C++0x type_traits, impossible without compiler support. -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=36856
[Bug c++/36856] [c++0x] __is_pod() fails for some pod types
--- Comment #10 from jason at gcc dot gnu dot org 2009-07-16 22:09 ---
Fixed for 4.5.0.
--
jason at gcc dot gnu dot org changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|ASSIGNED                    |RESOLVED
         Resolution|                            |FIXED
   Target Milestone|---                         |4.5.0

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=36856
[Bug libstdc++/37907] [c++0x] support for std::is_standard_layout
--- Comment #4 from jason at gcc dot gnu dot org 2009-07-16 22:11 ---
Fixed for 4.5.0.
--
jason at gcc dot gnu dot org changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|ASSIGNED                    |RESOLVED
         Resolution|                            |FIXED

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=37907
[Bug c/40435] [4.5 regression] Revision 148442 caused many regressions on trunk
--- Comment #9 from manu at gcc dot gnu dot org 2009-07-16 22:30 --- Subject: Bug 40435 Author: manu Date: Thu Jul 16 22:29:52 2009 New Revision: 149722 URL: http://gcc.gnu.org/viewcvs?root=gccview=revrev=149722 Log: 2009-07-17 Aldy Hernandez al...@redhat.com Manuel López-Ibáñez m...@gcc.gnu.org PR 40435 * tree-complex.c, tree-loop-distribution.c, tree.c, tree.h, builtins.c, fold-const.c, omp-low.c, cgraphunit.c, tree-ssa-ccp.c, tree-ssa-dom.c, gimple-low.c, expr.c, tree-ssa-ifcombine.c, c-decl.c, stor-layout.c, tree-if-conv.c, c-typeck.c, gimplify.c, calls.c, tree-sra.c, tree-mudflap.c, tree-ssa-copy.c, tree-ssa-forwprop.c, c-convert.c, c-omp.c, varasm.c, tree-inline.c, c-common.c, c-common.h, gimple.c, tree-switch-conversion.c, gimple.h, tree-cfg.c, c-parser.c, convert.c: Add location argument to fold_{unary,binary,ternary}, fold_build[123], build_call_expr, build_size_arg, build_fold_addr_expr, build_call_array, non_lvalue, size_diffop, fold_build1_initializer, fold_build2_initializer, fold_build3_initializer, fold_build_call_array, fold_build_call_array_initializer, fold_single_bit_test, omit_one_operand, omit_two_operands, invert_truthvalue, fold_truth_not_expr, build_fold_indirect_ref, fold_indirect_ref, combine_comparisons, fold_builtin_*, fold_call_expr, build_range_check, maybe_fold_offset_to_address, round_up, round_down. objc/ * objc-act.c: Add location argument to all calls to build_fold_addr_expr. testsuite/ * gcc.dg/pr36902.c: Add column info. * g++.dg/gcov/gcov-2.C: Change count for definition. 
cp/ * typeck.c, init.c, class.c, method.c, rtti.c, except.c, error.c, tree.c, cp-gimplify.c, cxx-pretty-print.c, pt.c, semantics.c, call.c, cvt.c, mangle.c: Add location argument to fold_{unary,binary,ternary}, fold_build[123], build_call_expr, build_size_arg, build_fold_addr_expr, build_call_array, non_lvalue, size_diffop, fold_build1_initializer, fold_build2_initializer, fold_build3_initializer, fold_build_call_array, fold_build_call_array_initializer, fold_single_bit_test, omit_one_operand, omit_two_operands, invert_truthvalue, fold_truth_not_expr, build_fold_indirect_ref, fold_indirect_ref, combine_comparisons, fold_builtin_*, fold_call_expr, build_range_check, maybe_fold_offset_to_address, round_up, round_down. fortran/ * trans-expr.c, trans-array.c, trans-openmp.c, trans-stmt.c, trans.c, trans-io.c, trans-decl.c, trans-intrinsic.c: Add location argument to fold_{unary,binary,ternary}, fold_build[123], build_call_expr, build_size_arg, build_fold_addr_expr, build_call_array, non_lvalue, size_diffop, fold_build1_initializer, fold_build2_initializer, fold_build3_initializer, fold_build_call_array, fold_build_call_array_initializer, fold_single_bit_test, omit_one_operand, omit_two_operands, invert_truthvalue, fold_truth_not_expr, build_fold_indirect_ref, fold_indirect_ref, combine_comparisons, fold_builtin_*, fold_call_expr, build_range_check, maybe_fold_offset_to_address, round_up, round_down. 
Modified: trunk/gcc/ChangeLog trunk/gcc/builtins.c trunk/gcc/c-common.c trunk/gcc/c-common.h trunk/gcc/c-convert.c trunk/gcc/c-decl.c trunk/gcc/c-omp.c trunk/gcc/c-parser.c trunk/gcc/c-typeck.c trunk/gcc/calls.c trunk/gcc/cgraphunit.c trunk/gcc/convert.c trunk/gcc/cp/ChangeLog trunk/gcc/cp/call.c trunk/gcc/cp/class.c trunk/gcc/cp/cp-gimplify.c trunk/gcc/cp/cvt.c trunk/gcc/cp/cxx-pretty-print.c trunk/gcc/cp/error.c trunk/gcc/cp/except.c trunk/gcc/cp/init.c trunk/gcc/cp/mangle.c trunk/gcc/cp/method.c trunk/gcc/cp/pt.c trunk/gcc/cp/rtti.c trunk/gcc/cp/semantics.c trunk/gcc/cp/tree.c trunk/gcc/cp/typeck.c trunk/gcc/expr.c trunk/gcc/fold-const.c trunk/gcc/fortran/ChangeLog trunk/gcc/fortran/trans-array.c trunk/gcc/fortran/trans-decl.c trunk/gcc/fortran/trans-expr.c trunk/gcc/fortran/trans-intrinsic.c trunk/gcc/fortran/trans-io.c trunk/gcc/fortran/trans-openmp.c trunk/gcc/fortran/trans-stmt.c trunk/gcc/fortran/trans.c trunk/gcc/gimple-low.c trunk/gcc/gimple.c trunk/gcc/gimple.h trunk/gcc/gimplify.c trunk/gcc/objc/ChangeLog trunk/gcc/objc/objc-act.c trunk/gcc/omp-low.c trunk/gcc/stor-layout.c trunk/gcc/testsuite/ChangeLog trunk/gcc/testsuite/g++.dg/gcov/gcov-2.C trunk/gcc/testsuite/gcc.dg/pr36902.c trunk/gcc/tree-cfg.c trunk/gcc/tree-complex.c trunk/gcc/tree-if-conv.c trunk/gcc/tree-inline.c trunk/gcc/tree-loop-distribution.c trunk/gcc/tree-mudflap.c
[Bug c/40435] [4.5 regression] Revision 148442 caused many regressions on trunk
--- Comment #10 from manu at gcc dot gnu dot org 2009-07-16 22:38 ---
FIXED.
--
manu at gcc dot gnu dot org changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|ASSIGNED                    |RESOLVED
         Resolution|                            |FIXED

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=40435
[Bug fortran/40773] gfortran 4.4.0 -O3 exhausts memory in 127.wrf2
--- Comment #7 from kargl at gcc dot gnu dot org 2009-07-16 22:47 ---
Further investigation suggests that -O3 is unusable with Fortran code when the size of the code in a single file or module becomes very large. Specifically, gcc.info states:

'-O3' turns on all optimizations specified by '-O2' and also turns on the '-finline-functions', '-funswitch-loops', '-fpredictive-commoning', '-fgcse-after-reload' and '-ftree-vectorize' options.

On my system with OPT=-O2 -pipe -march=native, I get

% time gfc4x -c $OPT shift_domain_em.f90
  292.91 real  270.35 user  20.92 sys
% time gfc4x -c $OPT -finline-functions shift_domain_em.f90
  297.52 real  275.26 user  21.39 sys
% time gfc4x -c $OPT -fpredictive-commoning shift_domain_em.f90
  294.75 real  271.22 user  21.21 sys
% time gfc4x -c $OPT -fgcse-after-reload shift_domain_em.f90
  295.13 real  270.59 user  21.51 sys
% time gfc4x -c $OPT -funswitch-loops shift_domain_em.f90
  630.73 real  601.11 user  24.99 sys
% time gfc4x -c $OPT -ftree-vectorize shift_domain_em.f90
  880.61 real  610.32 user  56.16 sys

It is also noteworthy that the first 5 command lines require about 2 GB of memory to compile the code. The last command line required more than 6 GB of memory, and on my system this leads to swapping.
-- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=40773
[Bug c++/40780] New: [4.4/4.5 Regression] ICE in gimplify_conversion
template <class T1, typename T2, typename T3>
struct A
{
  typedef T2 (T1::*m) (T3);
  A (m) {}
};
struct B;
struct C
{
  void foo (B *);
};
typedef A <C, void, B *> D;
typedef void (C::*E) (B *);
struct F;
typedef void (C::*G) (F);
D d ((E) (G) & C::foo);

ICEs with -m32:

rh511229.ii: In function 'void __static_initialization_and_destruction_0(int, int)':
rh511229.ii:16:22: internal compiler error: tree check: expected class 'expression', have 'constant' (ptrmem_cst) in gimplify_conversion, at gimplify.c:1831
Please submit a full bug report, with preprocessed source if appropriate.
See http://gcc.gnu.org/bugs.html for instructions.

The ICE started between r140129 and r140150.
--
Summary: [4.4/4.5 Regression] ICE in gimplify_conversion
Product: gcc
Version: 4.4.1
Status: UNCONFIRMED
Severity: normal
Priority: P3
Component: c++
AssignedTo: unassigned at gcc dot gnu dot org
ReportedBy: jakub at gcc dot gnu dot org
GCC target triplet: i686-linux
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=40780
[Bug c++/40780] [4.4/4.5 Regression] ICE in gimplify_conversion
--- Comment #1 from jakub at gcc dot gnu dot org 2009-07-17 00:08 --- r140145 in particular. -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=40780
[Bug tree-optimization/40773] gfortran 4.4.0 -O3 exhausts memory in 127.wrf2
--- Comment #8 from kargl at gcc dot gnu dot org 2009-07-17 00:28 ---
This appears to be more than a Fortran issue. I've changed the component to tree-optimization for lack of a better choice.
--
kargl at gcc dot gnu dot org changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
          Component|fortran                     |tree-optimization

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=40773
[Bug bootstrap/40781] New: [4.5 Regression] Revision 149722 failed to bootstrap
On Linux/ia64, revision 149722: http://gcc.gnu.org/ml/gcc-cvs/2009-07/msg00603.html failed to bootstrap:

cc1: warnings being treated as errors
../../src-trunk/gcc/builtins.c: In function 'expand_builtin_memcmp':
../../src-trunk/gcc/builtins.c:4171:14: error: unused variable 'loc'
../../src-trunk/gcc/builtins.c: In function 'expand_builtin_strncmp':
../../src-trunk/gcc/builtins.c:4436:14: error: unused variable 'loc'
make[6]: *** [builtins.o] Error 1
--
Summary: [4.5 Regression] Revision 149722 failed to bootstrap
Product: gcc
Version: 4.5.0
Status: UNCONFIRMED
Severity: normal
Priority: P3
Component: bootstrap
AssignedTo: unassigned at gcc dot gnu dot org
ReportedBy: hjl dot tools at gmail dot com
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=40781
[Bug c/40757] gcc 4.4.0 miscompiles mpfr-2.4.1
--- Comment #14 from zimmerma+gcc at loria dot fr 2009-07-17 00:57 ---
> You haven't mentioned what options you compiled this file with.
The problem appears with -O0, -O1 and -O2 alike.
Paul
-- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=40757
[Bug bootstrap/40781] [4.5 Regression] Revision 149722 failed to bootstrap
--- Comment #1 from hjl at gcc dot gnu dot org 2009-07-17 01:04 ---
Subject: Bug 40781
Author: hjl
Date: Fri Jul 17 01:03:55 2009
New Revision: 149733
URL: http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=149733
Log:
2009-07-16 H.J. Lu hongjiu...@intel.com
PR bootstrap/40781
* builtins.c (expand_builtin_memcmp): Use loc instead of EXPR_LOCATION (exp).
(expand_builtin_strncmp): Likewise.
Modified: trunk/gcc/ChangeLog trunk/gcc/builtins.c
-- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=40781
[Bug c/40757] gcc 4.4.0 miscompiles mpfr-2.4.1
--- Comment #15 from david dot kirkby at onetel dot net 2009-07-17 03:21 --- (In reply to comment #14) You haven't mentioned what options you compiled this file with. the problem appears both with -O0, -O1 and -O2. Paul
Also worth noting is that this builds fine with some versions of gcc:

gcc 4.1.1              OK
gcc 4.2.1              OK
gcc 4.2.4              OK

I could build gcc 4.3.0, but it would never install properly, so I have not tested on gcc 4.3.0.

gcc 4.3.1              MPFR fails 20 tests
gcc 4.3.3              MPFR fails 20 tests
gcc 4.4.0              MPFR fails 20 tests
gcc-4.4.1-RC-20090715  MPFR fails 20 tests

-- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=40757
[Bug c/40757] gcc 4.4.0 miscompiles mpfr-2.4.1
--- Comment #16 from david dot kirkby at onetel dot net 2009-07-17 04:11 --- (In reply to comment #0) See http://websympa.loria.fr/wwsympa/arc/mpfr/2009-07/msg00031.html and the following discussion. This was on t2.math.washington.edu with /usr/local/gcc-4.4.0-sun-linker/bin/gcc:

zimme...@t2:/tmp/mpfr-2.4.1$ /usr/local/gcc-4.4.0-sun-linker/bin/gcc -v
Using built-in specs.
Target: sparc-sun-solaris2.10
Configured with: /home/kirkby/gcc-4.4.0/configure CC=/usr/sfw/bin/gcc --prefix=/usr/local/gcc-4.4.0-sun-linker --without-gnu-as --without-gnu-ld --with-as=/usr/ccs/bin/as --with-ld=/usr/ccs/bin/ld --enable-languages=c,c++,fortran --with-mpfr-lib=/usr/local/lib --with-mpfr-include=/usr/local/include --with-gmp-include=/usr/local/include --with-gmp-lib=/usr/local/lib --with-libiconv-prefix=/usr/lib/iconv
Thread model: posix
gcc version 4.4.0 (GCC)

It would be useful if we had an exact statement of the problem, in terms of memset fails for ... or whatever. I'd ask a few Sun people to have a look here and comment, but it's unclear precisely what is believed to be the issue.
-- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=40757